Research Lead at LightOn.

🤖 > Work experience

  • 2020 – Now | Research Lead, LightOn. LightOn is funding my Ph.D.

I am leading a team of 10 researchers, engineers, and interns working on developing, understanding, and improving large language models.

We seek to establish principled methods for scaling large models. We are interested in understanding the nature of zero-shot generalization (ICML 2022), identifying promising architectures (EMNLP 2022), and improving pretraining datasets. Our work has received coverage in VentureBeat and Import AI, among others, and we have contributed to the BigScience project.

We also put a strong focus on engineering and tooling. We have developed a pipeline to filter and deduplicate trillions of words; trained models with hundreds of billions of parameters using our own custom distributed training framework on supercomputers with thousands of A100s; and our inference service processes billions of words every month for our customers. We work at very large scale, with a total annual compute budget exceeding 1M A100-hours per researcher.

  • 2019 – 2020 | Machine Learning Research Scientist, LightOn.

I worked on expanding the applicability of beyond-backpropagation methods to modern deep learning tasks and architectures (see our NeurIPS 2020 paper). I helped develop optical computing prototypes, achieving scalable optical training of neural networks across varied architectures. This work has led to applications of Direct Feedback Alignment to adversarial robustness, as well as to differential privacy.

🏫 > Education

2019 – Now | Industrial Ph.D. in Applied Mathematics.
École Normale Supérieure, Paris.
"Principled modeling methods and beyond backpropagation approaches for the large-scale era".

2018 – 2019 | M.Sc. in Climate Science.
École Polytechnique, Palaiseau.

2017 – 2018 | Visiting research student.
City University of Hong Kong, Kowloon.
"Machine learning for solar engineering".

2015 – 2019 | M.Sc. in Civil Engineering.
École Normale Supérieure, Paris-Saclay.

📘 > Publications

See my publications page or my Google Scholar profile.

My research has been featured in Yannic Kilcher videos, in Import AI, and in news outlets such as VentureBeat.

🤗 > Service

2023 | Reviewer.
Conferences: ICML.
Journals: ACM Computing Surveys.

2022 | Reviewer.
Conferences: NeurIPS, ICML.
Journals: ACM Computing Surveys.
Workshops: NeurIPS I Can't Believe It's Not Better, ACL BigScience.

2021 – 2022 | Chair of the Architecture & Scaling Group, 🌸 BigScience.

I chaired the architecture & scaling working group for the BigScience workshop. Our goal was to empirically explore and validate architectural choices for BLOOM, a 176B-parameter open-access multilingual model. We studied considerations around model architecture and training objectives (encoder-decoder vs decoder-only, denoising vs language modelling), positional embeddings (rotary vs ALiBi), as well as multilinguality.

2021 | Reviewer.
Conferences: NeurIPS (Outstanding Reviewer Award).

December 2019 | Workshop organizer. Future of Random Matrices #4, Paris.

June 2019 | Science crew member, MOOSE-GE scientific campaign.
Mediterranean Sea, Thalassa vessel, 2 weeks.
Double-Diffusive Processes in the Tyrrhenian Sea.

👨‍🏫 > Talks

December 2022 | NeurIPS Scaling Laws Workshop, MILA.
High-quality data need not apply.

August 2022 | Translate Theory Reading Group, Google Research.
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?

June 2022 | Machine Learning College, G-Research.
Lessons from training a massively multilingual 176B model.

May 2022 | Neural Scaling Seminar, MILA.
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?

May 2022 | Challenges & Perspectives in Creating Large Language Models, ACL.
Lessons from training a massively multilingual 176B model.

April 2022 | Sharing Session, Naver AI Labs.
Demystifying Extreme-Scale Training.

March 2022 | GTC, NVIDIA.
NLP Beyond English: Training Extreme-scale Language Models with Megatron for French, and More.

December 2021 | BigScience Episode #3, NeurIPS.
Identifying the best architecture for a >100B model.

September 2021 | BigScience Episode #2, INLG.
You Only Train Once: Making Architectural Decisions for a >100B model.

July 2021 | BigScience Episode #1, ELLIS.
Architectural Decisions at the 100B scale.

July 2021 | Hong Kong ML Meetup S3E12.
Extreme-scale: Trends & Perspectives.

May 2021 | Paris NLP Meetup S5E5.
PAGnol: a French Extreme-Scale Model.

December 2020 | Sharing Session, Autodesk AI Lab.
Learning and Scaling Beyond Backpropagation and Beyond Silicon.

December 2020 | Les Déjeuners NeurIPS, Paris Machine Learning Meetup.
Direct Feedback Alignment: Scaling and Perspectives.

May 2019 | Future of Random Matrices #3.
Principled Training of Neural Networks with Direct Feedback Alignment.

December 2017 | TensorFlow Paris Meetup.
Lifelike Concrete Cracking Patterns using TensorFlow & GANs.

January 2017 | Paris Machine Learning Meetup S5E4.
Cracking Crack Mechanics with GANs.

🎫 > Beyond academia

  • 🤿 | Diving. I am a passionate DIR diver working towards a GUE Fundamentals technical pass, with a strong interest in cave and exploration diving.
  • 👨🏻‍🍳 | Cooking. In particular sous-vide cooking, new cookery, and holistic cuisine. During the 2020 lockdowns, I cooked my way through the Fat Duck and Eleven Madison Park cookbooks.
  • 🗺️ | Travel/adventures. In 2016, I drove 3,000 km across India on a rickshaw, from Shillong to Kochi, raising money for Cool Earth. In 2018, I rode a motorbike around Java, Bali, and Lombok in Indonesia. I have also done road trips in Yucatán, Taiwan, Vietnam, Iceland, and across Europe.