About me
I work on Diffusion Language Models and have led several of the foundational developments that shaped this emerging field. My work is used at industrial scale by Google, NVIDIA, and ByteDance across domains such as language generation and drug discovery.
Ph.D. Thesis: Foundations of Diffusion Language Models, advised by Prof. John Thickstun.
Previously: Cornell Tech (Ph.D.); Google Research; IIT Kharagpur (B.Tech).
Highlights
Select Papers
Subham S. Sahoo, Justin Deschenaux, Aaron Gokaslan, Guanghan Wang, Justin Chiu, Volodymyr Kuleshov. The Diffusion Duality. 42nd International Conference on Machine Learning (ICML 2025), ICLR 2025 - DeLTa Workshop (oral). [paper, code, webpage]
Subham S. Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov. Simple and Effective Masked Diffusion Language Models. 38th Conference on Neural Information Processing Systems (NeurIPS 2024), ICML 2024 - AccMLBio Workshop (spotlight). [paper, code, webpage]
Subham S. Sahoo, Aaron Gokaslan, Chris De Sa, Volodymyr Kuleshov. Diffusion Models With Learned Adaptive Noise. 38th Conference on Neural Information Processing Systems (NeurIPS 2024, spotlight), NeurIPS 2024 - Compression Workshop (spotlight). [paper, code, webpage]
Subham S. Sahoo*, Anselm Paulus*, Marin Vlastelica, Vit Musil, Volodymyr Kuleshov, Georg Martius. Backpropagation through Combinatorial Algorithms: Identity with Projection Works. 11th International Conference on Learning Representations (ICLR 2023). [paper, code]
Subham S. Sahoo, Christoph H. Lampert, Georg Martius. Learning Equations for Extrapolation and Control. 35th International Conference on Machine Learning (ICML 2018). [paper, code, webpage]
News
Oct-23-25: Invited talk at Radboud University on The Diffusion Duality.
Oct-15-25: Invited talk at Seoul National University on Foundations of Diffusion Language Models. [slides]
Oct-3-25: Defended my Ph.D. Thesis: Foundations of Diffusion Language Models. [slides]
Aug-13-25: Invited talk at Cerebras on Esoteric Language Models.
Aug-6-25: Invited talk at Meta (FAIR) on Foundations of Diffusion Language Models.
Jun-19-25: Invited talk at Google DeepMind on Esoteric Language Models.
May-1-25: Duo accepted at ICML 2025!
Apr-28-25: Presenting Duo as an oral at ICLR 2025, DeLTa workshop!
Apr-2-25: Invited talk at Databricks on Diffusion Language Models.
Mar-24-25: Invited talk at Genesis Therapeutics on Diffusion Language Models.
Mar-19-25: Invited to a roundtable on Research/Industrial Inference/Post-Training at NVIDIA GTC 2025.
Mar-7-25: Invited talk at NVIDIA on Diffusion Language Models. [slides]
Feb-11-25: BD3-LM and UDLM accepted at ICLR 2025! BD3-LM has been accepted as an oral!
Dec-10-24: MDLM and MuLAN accepted at NeurIPS 2024! MuLAN was presented as a spotlight!
Oct-11-24: Passed my Ph.D. Candidacy exam!
Jul-27-24: Presented MDLM as a spotlight at ICML 2024, AccMLBio workshop!