Mohammad Gheshlaghi Azar
Cohere
Verified email at cohere.com
Title
Cited by
Year
Bootstrap your own latent: A new approach to self-supervised learning
JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...
Advances in neural information processing systems 33, 21271-21284, 2020
Cited by 7056, year 2020
Rainbow: Combining improvements in deep reinforcement learning
M Hessel, J Modayil, H Van Hasselt, T Schaul, G Ostrovski, W Dabney, ...
Proceedings of the AAAI conference on artificial intelligence 32 (1), 2018
Cited by 2860, year 2018
Minimax regret bounds for reinforcement learning
MG Azar, I Osband, R Munos
International conference on machine learning, 263-272, 2017
Cited by 888, year 2017
Bootstrap your own latent: A new approach to self-supervised learning
JB Grill, F Strub, F Altché, C Tallec, P Richemond, E Buchatskaya, ...
Advances in neural information processing systems 33, 21271-21284, 2020
Cited by 513, year 2020
Large-scale representation learning on graphs via bootstrapping
S Thakoor, C Tallec, MG Azar, M Azabou, EL Dyer, R Munos, P Veličković, ...
arXiv preprint arXiv:2102.06514, 2021
Cited by 479*, year 2021
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
M Gheshlaghi Azar, R Munos, HJ Kappen
Machine learning 91, 325-349, 2013
Cited by 319, year 2013
A general theoretical paradigm to understand learning from human preferences
MG Azar, ZD Guo, B Piot, R Munos, M Rowland, M Valko, D Calandriello
International Conference on Artificial Intelligence and Statistics, 4447-4455, 2024
Cited by 310, year 2024
Speedy Q-Learning
MG Azar, M Ghavamzadeh, HJ Kappen, R Munos
Advances in Neural Information Processing Systems, 2411-2419, 2011
Cited by 219*, year 2011
The Reactor: A fast and sample-efficient actor-critic agent for reinforcement learning
A Gruslys, W Dabney, MG Azar, B Piot, M Bellemare, R Munos
arXiv preprint arXiv:1704.04651, 2017
Cited by 183*, year 2017
Bootstrap latent-predictive representations for multitask reinforcement learning
ZD Guo, BA Pires, B Piot, JB Grill, F Altché, R Munos, MG Azar
International Conference on Machine Learning, 3875-3886, 2020
Cited by 163, year 2020
Dynamic Policy Programming
M Gheshlaghi Azar, V Gomez, HJ Kappen
Journal of Machine Learning Research 13, 3207-3245, 2012
Cited by 156, year 2012
Observe and look further: Achieving consistent performance on Atari
T Pohlen, B Piot, T Hester, MG Azar, D Horgan, D Budden, G Barth-Maron, ...
arXiv preprint arXiv:1805.11593, 2018
Cited by 143, year 2018
Sequential transfer in multi-armed bandit with finite set of models
MG Azar, A Lazaric, E Brunskill
Advances in Neural Information Processing Systems, 2220-2228, 2013
Cited by 124, year 2013
On the sample complexity of reinforcement learning with a generative model
MG Azar, R Munos, B Kappen
arXiv preprint arXiv:1206.6461, 2012
Cited by 119, year 2012
Hindsight credit assignment
A Harutyunyan, W Dabney, T Mesnard, M Gheshlaghi Azar, B Piot, ...
Advances in neural information processing systems 32, 2019
Cited by 101, year 2019
Meta-learning of sequential strategies
PA Ortega, JX Wang, M Rowland, T Genewein, Z Kurth-Nelson, ...
arXiv preprint arXiv:1905.03030, 2019
Cited by 99, year 2019
Neural predictive belief representations
ZD Guo, MG Azar, B Piot, BA Pires, R Munos
arXiv preprint arXiv:1811.06407, 2018
Cited by 93, year 2018
Nash learning from human feedback
R Munos, M Valko, D Calandriello, MG Azar, M Rowland, ZD Guo, Y Tang, ...
arXiv preprint arXiv:2312.00886, 2023
Cited by 81, year 2023
BYOL-Explore: Exploration by bootstrapped prediction
Z Guo, S Thakoor, M Pîslar, B Avila Pires, F Altché, C Tallec, A Saade, ...
Advances in neural information processing systems 35, 31855-31870, 2022
Cited by 70, year 2022
Stochastic optimization of a locally smooth function under correlated bandit feedback
MG Azar, A Lazaric, E Brunskill
31st International Conference on Machine Learning (ICML), 2014
Cited by 70*, year 2014