Bio
Hi there! I am a Research Scientist at Google DeepMind. I do research on Reinforcement Learning and LLMs, and I contribute to large-scale efforts including Bard and Gemini. These days I mostly implement cool research ideas as part of the Gemma post-training team.
Prior to that, I did my PhD at Google Brain and Inria Lille (Scool team, ex-SequeL). I worked on Reinforcement Learning, with a focus on credit assignment and interpretability. My advisors were Olivier Pietquin and Philippe Preux. I also collaborated with Matthieu Geist.
My PhD thesis is available for consultation (manuscript, slides).
Prior to my PhD, I got an engineering degree in Computer Science and Applied Mathematics from Télécom Paris, and an M.Sc. in Machine Learning from École Polytechnique.
Outside of work, I make generative art. I also love music, roguelikes, high-intensity interval training and spicy food.
My CV is available online, and all my publications can be consulted here.
Publications
On Teacher Hacking in Language Model Distillation 
Daniil Tiapkin, Daniele Calandriello, Johan Ferret, Sarah Perrin, Nino Vieillard, Alexandre Ramé, Mathieu Blondel 
ICML 2025 
[ paper ]
BOND: Aligning LLMs with Best-of-N Distillation 
Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot, Johan Ferret, Nino Vieillard, Alexandre Ramé, Bobak Shariari, Sarah Perrin, Abe Friesen, Geoffrey Cideron, Sertan Girgin, Piotr Stanczyk, Andrea Michi, Danila Sinopalnikov, Sabela Ramos, Amélie Héliou, Aliaksei Severyn, Matt Hoffman, Nikola Momchev, Olivier Bachem 
ICLR 2025 
[ paper ]
Diversity-Rewarded Classifier-Free Guidance Distillation 
Geoffrey Cideron, Andrea Agostinelli, Johan Ferret, Sertan Girgin, Romuald Elie, Olivier Bachem, Sarah Perrin, Alexandre Ramé 
ICLR 2025 
[ paper ]
Conditional Language Policy: A General Framework for Steerable Multi-Objective Finetuning 
Kaiwen Wang, Rahul Kidambi, Ryan Sullivan, Alekh Agarwal, Christoph Dann, Andrea Michi, Marco Gelmi, Yunxuan Li, Raghav Gupta, Avinava Dubey, Alexandre Ramé, Johan Ferret, Geoffrey Cideron, Le Hou, Hongkun Yu, Amr Ahmed, Aranyak Mehta, Léonard Hussenot, Olivier Bachem, Edouard Leurent 
Findings of EMNLP 2024 
[ paper ]
WARM: On the Benefits of Weight Averaged Reward Models 
Alexandre Ramé, Nino Vieillard, Léonard Hussenot, Robert Dadashi, Geoffrey Cideron, Olivier Bachem, Johan Ferret 
ICML 2024 
[ paper ]
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback 
Harrison Lee, Samrat Phatale, Hassan Mansoor, Thomas Mesnard, Johan Ferret, Kellie Lu, Colton Bishop, Ethan Hall, Victor Carbune, Abhinav Rastogi, Sushant Prakash 
ICML 2024 
[ paper ]
A Survey of Temporal Credit Assignment in Deep Reinforcement Learning 
Eduardo Pignatelli, Johan Ferret, Matthieu Geist, Hado van Hasselt, Olivier Pietquin, Laura Toni 
TMLR 2024 
[ paper ]
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback 
Johan Ferret*, Paul Roit*, Lior Shani*, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor 
ACL 2023 
[ paper | code soon! ]
On Actions that Matter: Credit Assignment and Interpretability in Reinforcement Learning 
Johan Ferret 
PhD thesis 
[ manuscript | slides ]
Lazy-MDPs: Towards Interpretable Reinforcement Learning by Learning When to Act 
Johan Ferret*, Alexis Jacq*, Olivier Pietquin, Matthieu Geist 
AAMAS 2022 
[ paper | code soon! ]
There is no Turning Back: A Self-Supervised Approach to Reversibility-Aware Reinforcement Learning 
Johan Ferret*, Nathan Grinsztajn*, Olivier Pietquin, Philippe Preux, Matthieu Geist 
NeurIPS 2021 
[ paper | blog post | slides | code ]
Self-Imitation Advantage Learning 
Johan Ferret, Olivier Pietquin, Matthieu Geist 
AAMAS 2021 
[ paper | slides | code ]
Adversarially Guided Actor-Critic 
Johan Ferret*, Yannis Flet-Berliac*, Olivier Pietquin, Philippe Preux, Matthieu Geist 
ICLR 2021 
[ paper | slides | video | code ]
Self-Attentional Credit Assignment for Transfer in Reinforcement Learning 
Johan Ferret, Raphaël Marinier, Matthieu Geist, Olivier Pietquin 
IJCAI 2020 
[ paper | slides | video ]
Preprints
Gemma 3 Technical Report 
Gemma Team 
Technical report 
[ paper ]
Humanity’s Last Exam 
HLE Team 
Technical report 
[ paper ]
WARP: On the Benefits of Weight Averaged Rewarded Policies 
Alexandre Ramé, Johan Ferret, Nino Vieillard, Robert Dadashi, Léonard Hussenot, Pierre-Louis Cedoz, Pier Giuseppe Sessa, Sertan Girgin, Arthur Douillard, Olivier Bachem 
arXiv preprint 
[ paper ]
Gemma 2: Improving Open Language Models at a Practical Size 
Gemma Team 
Technical report 
[ paper ]
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models 
Griffin Team, RLHF Team, Gemma Team 
Technical report 
[ paper ]
Gemma: Open Models Based on Gemini Research and Technology 
Gemma Team 
Technical report 
[ paper ]
Direct Language Model Alignment from Online AI Feedback 
Shangmin Guo, Biao Zhang, Tianlin Liu, Tianqi Liu, Misha Khalman, Felipe Llinares, Alexandre Ramé, Thomas Mesnard, Yao Zhao, Bilal Piot, Johan Ferret, Mathieu Blondel 
arXiv preprint 
[ paper ]
Gemini: A Family of Highly Capable Multimodal Models 
Gemini Team 
Technical report 
[ paper ]
Acme: A Research Framework for Distributed Reinforcement Learning 
Acme Team 
arXiv preprint 
[ paper | colab | code ]
Workshops
Assessing the Zero-Shot Capabilities of LLMs for Action Evaluation in RL 
Eduardo Pignatelli, Johan Ferret, Davide Paglieri, Samuel Coward, Tim Rocktäschel, Edward Grefenstette, Laura Toni 
AutoRL workshop, ICML 2024 
[ paper ]
More Efficient Exploration with Symbolic Priors on Action Sequence Equivalence 
Nathan Grinsztajn, Toby Johnstone, Johan Ferret, Philippe Preux 
Deep Reinforcement Learning workshop, NeurIPS 2022 
[ paper ]
Offline Credit Assignment in Deep Reinforcement Learning with Hindsight Discriminator Networks 
Johan Ferret, Olivier Pietquin, Matthieu Geist 
EWRL 2022 
[ paper ]
Credit Assignment as a Proxy for Transfer in Reinforcement Learning 
Johan Ferret, Raphaël Marinier, Matthieu Geist, Olivier Pietquin 
Learning Transferable Skills workshop, NeurIPS 2019 (oral) 
[ paper ]
