You can reach me at email@example.com.
🔥 Latest news
- 🔥 I am organizing an ICML workshop on Efficient Systems for Foundation Models! Join us in Hawaii to discuss the nitty-gritty details of training and inference for large models: gpusgobrrr.com.
- 💾 The RefinedWeb paper is out! Check it out to learn how we bridged the gap between massive crawls and curated data by leveraging stringent filtering and strict deduplication.
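As a toy illustration of what exact deduplication means in practice (a minimal sketch only, not the RefinedWeb pipeline itself, which also relies on fuzzy matching), one can hash each normalized document and keep only the first occurrence of each hash:

```python
import hashlib

def dedup_exact(documents):
    """Keep the first occurrence of each document, dropping exact duplicates.

    Hashing normalized text keeps memory proportional to the number of
    unique documents rather than to their total size.
    """
    seen = set()
    unique = []
    for doc in documents:
        # Normalization here (strip + lowercase) is a simplifying assumption;
        # production pipelines use far more aggressive normalization.
        digest = hashlib.sha256(doc.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

docs = ["Hello world", "hello world ", "A different page"]
print(dedup_exact(docs))  # the second item is an exact duplicate after normalization
```

Fuzzy deduplication (e.g. MinHash over shingles) extends this idea to near-duplicates, which dominate in web crawls.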
- 🚀 Looking for a model with state-of-the-art performance and permissive licensing for commercial applications? Falcon-40B is now open-source.
My research focuses on large language models:
- 📈 Challenges in scaling. Scaling has been the main driver of progress in machine learning for the past few years: I am interested in how we can keep that engine churning. Specifically, I am interested in the challenges brought about by ML becoming a so-called big science, with novel research directions at the crossroads of large-scale engineering and pure research.
- 💿 Data scalability. What makes some pretraining datasets better than others? How can we build quality datasets with trillions of tokens? Is the human part in RLHF truly needed, or can models bootstrap themselves?
- 🧠 Philosophy of mind. I am interested in how LLMs can gain human-like functions. This ranges from deliberate reasoning and planning to the acquisition of a theory of mind and its relation to works such as Julian Jaynes' bicameral mind. I am also interested in tool use, and how LLMs can learn to interact with their environment.
During my Ph.D., I also explored alternatives to backpropagation and the use of optical co-processors to train neural networks.