Overcoming catastrophic forgetting in neural networks J Kirkpatrick, R Pascanu, N Rabinowitz, J Veness, G Desjardins, AA Rusu, ... Proceedings of the National Academy of Sciences 114 (13), 3521-3526, 2017 | 7971 | 2017 |

Progressive neural networks AA Rusu, NC Rabinowitz, G Desjardins, H Soyer, J Kirkpatrick, ... arXiv preprint arXiv:1606.04671, 2016 | 3082 | 2016 |

Theano: a CPU and GPU math expression compiler J Bergstra, O Breuleux, F Bastien, P Lamblin, R Pascanu, G Desjardins, ... Proceedings of the Python for Scientific Computing Conference (SciPy) 4 (3), 1-7, 2010 | 2028 | 2010 |

Gemini: a family of highly capable multimodal models Gemini Team, R Anil, S Borgeaud, JB Alayrac, J Yu, R Soricut, J Schalkwyk, ... arXiv preprint arXiv:2312.11805, 2023 | 1976 | 2023 |

Understanding disentangling in β-VAE CP Burgess, I Higgins, A Pal, L Matthey, N Watters, G Desjardins, ... arXiv preprint arXiv:1804.03599, 2018 | 1232 | 2018 |

Theano: A Python framework for fast computation of mathematical expressions R Al-Rfou, G Alain, A Almahairi, C Angermueller, D Bahdanau, N Ballas, ... arXiv e-prints, arXiv: 1605.02688, 2016 | 930 | 2016 |

Theano: A CPU and GPU Math Compiler in Python. J Bergstra, O Breuleux, F Bastien, P Lamblin, R Pascanu, G Desjardins, ... SciPy, 18-24, 2010 | 866 | 2010 |

Policy distillation AA Rusu, SG Colmenarejo, C Gulcehre, G Desjardins, J Kirkpatrick, ... arXiv preprint arXiv:1511.06295, 2015 | 826 | 2015 |

Combining modality specific deep neural networks for emotion recognition in video SE Kahou, C Pal, X Bouthillier, P Froumenty, Ç Gülçehre, R Memisevic, ... Proceedings of the 15th ACM on International conference on multimodal …, 2013 | 441 | 2013 |

Theano: Deep learning on GPUs with Python J Bergstra, F Bastien, O Breuleux, P Lamblin, R Pascanu, O Delalleau, ... NIPS 2011, BigLearning Workshop, Granada, Spain 3 (0), 2011 | 369 | 2011 |

Unsupervised and transfer learning challenge: a deep learning approach G Mesnil, Y Dauphin, X Glorot, S Rifai, Y Bengio, I Goodfellow, E Lavoie, ... Proceedings of ICML Workshop on Unsupervised and Transfer Learning, 97-110, 2012 | 293 | 2012 |

Natural neural networks G Desjardins, K Simonyan, R Pascanu Advances in neural information processing systems 28, 2015 | 232 | 2015 |

Theano: A Python framework for fast computation of mathematical expressions The Theano Development Team, R Al-Rfou, G Alain, A Almahairi, C Angermueller, D Bahdanau, ... arXiv preprint arXiv:1605.02688, 2016 | 215 | 2016 |

Tempered Markov chain Monte Carlo for training of restricted Boltzmann machines G Desjardins, A Courville, Y Bengio, P Vincent, O Delalleau Proceedings of the thirteenth international conference on artificial …, 2010 | 153 | 2010 |

Disentangling factors of variation via generative entangling G Desjardins, A Courville, Y Bengio arXiv preprint arXiv:1210.5474, 2012 | 129 | 2012 |

Parallel tempering for training of restricted Boltzmann machines G Desjardins, A Courville, Y Bengio, P Vincent, O Delalleau Proceedings of the thirteenth international conference on artificial …, 2010 | 124 | 2010 |

Information asymmetry in KL-regularized RL A Galashov, SM Jayakumar, L Hasenclever, D Tirumala, J Schwarz, ... arXiv preprint arXiv:1905.01240, 2019 | 109 | 2019 |

Quadratic polynomials learn better image features J Bergstra, G Desjardins, P Lamblin, Y Bengio Technical report, 1337, 2009 | 93 | 2009 |

Progressive neural networks. arXiv 2016 AA Rusu, NC Rabinowitz, G Desjardins, H Soyer, J Kirkpatrick, ... arXiv preprint arXiv:1606.04671, 2016 | 79 | 2016 |

Reward is enough for convex MDPs T Zahavy, B O'Donoghue, G Desjardins, S Singh Advances in Neural Information Processing Systems 34, 25746-25759, 2021 | 74 | 2021 |