About me

My research laid the foundations for Diffusion Language Models, and I now lead a team at MBZUAI - IFM advancing the field. My work is used at industrial scale by Google, NVIDIA, and ByteDance, in applications ranging from language generation to drug discovery.

Ph.D. Thesis: Foundations of Diffusion Language Models, advised by Prof. John Thickstun.
Previously: Cornell Tech (Ph.D.); Google Research; IIT Kharagpur (B.Tech).

Highlights

Select Papers

  • Subham S. Sahoo, Justin Deschenaux, Aaron Gokaslan, Guanghan Wang, Justin Chiu, Volodymyr Kuleshov. The Diffusion Duality. 42nd International Conference on Machine Learning (ICML 2025), ICLR 2025 - DeLTa Workshop (oral). [paper, code, webpage]


    Subham S. Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov. Simple and Effective Masked Diffusion Language Models. 38th Conference on Neural Information Processing Systems (NeurIPS 2024), ICML 2024 - AccMLBio Workshop (spotlight). [paper, code, webpage]


    Subham S. Sahoo, Aaron Gokaslan, Chris De Sa, Volodymyr Kuleshov. Diffusion Models With Learned Adaptive Noise. 38th Conference on Neural Information Processing Systems (NeurIPS 2024, spotlight), NeurIPS 2024 - Compression Workshop (spotlight). [paper, code, webpage]


    Subham S. Sahoo*, Anselm Paulus*, Marin Vlastelica, Vit Musil, Volodymyr Kuleshov, Georg Martius. Backpropagation through Combinatorial Algorithms: Identity with Projection Works. 11th International Conference on Learning Representations (ICLR 2023). [paper, code]


    Subham S. Sahoo, Christoph H. Lampert, Georg Martius. Learning Equations for Extrapolation and Control. 35th International Conference on Machine Learning (ICML 2018). [paper, code, webpage]

News

  • Mar-3-26: Eso-LMs accepted as an oral at the ICLR 2026 Workshop on Multimodal Intelligence!

    Jan-26-26: The Diffusion Duality, Chapter II: Ψ-Samplers accepted at ICLR 2026!

    Oct-23-25: Invited talk at Radboud University on The Diffusion Duality.

    Oct-15-25: Invited talk at Seoul National University on Foundations of Diffusion Language Models. [slides]

    Oct-3-25: Defended my Ph.D. Thesis: Foundations of Diffusion Language Models. [slides]

    Aug-13-25: Invited talk at Cerebras on Esoteric Language Models.

    Aug-6-25: Invited talk at Meta (FAIR) on Foundations of Diffusion Language Models.

    Jun-19-25: Invited talk at Google DeepMind on Esoteric Language Models.

    May-1-25: Duo accepted at ICML 2025!

    Apr-28-25: Presenting Duo as an oral at ICLR 2025, DeLTa workshop!

    Apr-2-25: Invited talk at Databricks on Diffusion Language Models.

    Mar-24-25: Invited talk at Genesis Therapeutics on Diffusion Language Models.

    Mar-19-25: Invited to a research and industry round table on inference and post-training at NVIDIA GTC 2025.

    Mar-7-25: Invited talk at NVIDIA on Diffusion Language Models. [slides]

    Feb-11-25: BD3-LM and UDLM accepted at ICLR 2025! BD3-LM has been accepted as an oral!

    Dec-10-24: MDLM and MuLAN accepted at NeurIPS 2024! MuLAN was presented as a spotlight!

    Oct-11-24: Passed my Ph.D. Candidacy exam!

    Jul-27-24: Presented MDLM as a spotlight at ICML 2024, AccMLBio workshop!

Papers

  • Justin Deschenaux, Caglar Gulcehre, Subham S. Sahoo. The Diffusion Duality, Chapter II: Ψ-Samplers and Efficient Curriculum. 14th International Conference on Learning Representations (ICLR 2026). [paper, code, webpage]


    Pin-Jui Ku, He Huang, Jean-Marie Lemercier, Subham S. Sahoo, Zhehuai Chen, Ante Jukic. Discrete Diffusion for Generative Modeling of Text-Aligned Speech Tokens. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2026). [paper]


    Subham S. Sahoo*, Zhihan Yang*, Yash Akhauri, Johnna Liu, Deepansha Singh, Zhoujun Cheng, Zhengzhong Liu, Eric Xing, John Thickstun, Arash Vahdat. Esoteric Language Models. Pre-print. [paper, code, webpage]


    Subham S. Sahoo, Jean-Marie Lemercier, Zhihan Yang, Justin Deschenaux, Jingyu Liu, John Thickstun, Ante Jukic. Scaling Beyond Masked Diffusion Language Models. Pre-print. [paper, code, webpage]

  • Subham S. Sahoo, Justin Deschenaux, Aaron Gokaslan, Guanghan Wang, Justin Chiu, Volodymyr Kuleshov. The Diffusion Duality. 42nd International Conference on Machine Learning (ICML 2025), ICLR 2025 - DeLTa Workshop (oral). [paper, code, webpage]


    Guanghan Wang, Yair Schiff, Subham S. Sahoo, Volodymyr Kuleshov. Remasking Discrete Diffusion Models with Inference-Time Scaling. 39th Conference on Neural Information Processing Systems (NeurIPS 2025), ICLR 2025 - DeLTa Workshop. [paper, code, webpage]


    Marianne Arriola, Subham S. Sahoo, Aaron Gokaslan, Zhihan Yang, Zhixuan Qi, Jiaqi Han, Justin Chiu, Volodymyr Kuleshov. Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models. 13th International Conference on Learning Representations (ICLR 2025, oral). [paper, code, webpage]


    Yair Schiff*, Subham S. Sahoo*, Hao Phung*, Guanghan Wang*, Sam Boshar, Hugo Dalla-torre, Bernardo P de Almeida, Alexander M Rush, Thomas Pierrot, Volodymyr Kuleshov. Simple Guidance Mechanisms for Discrete Diffusion Models. 13th International Conference on Learning Representations (ICLR 2025). [paper, code, webpage]

  • Subham S. Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov. Simple and Effective Masked Diffusion Language Models. 38th Conference on Neural Information Processing Systems (NeurIPS 2024), ICML 2024 - AccMLBio Workshop (spotlight). [paper, code, webpage]


    Subham S. Sahoo, John X. Morris, Aaron Gokaslan, Srijeeta Biswas, Vitaly Shmatikov, Volodymyr Kuleshov. Zero-Order Diffusion Guidance for Inverse Problems. Pre-print. [paper]


    Subham S. Sahoo, Aaron Gokaslan, Chris De Sa, Volodymyr Kuleshov. Diffusion Models With Learned Adaptive Noise. 38th Conference on Neural Information Processing Systems (NeurIPS 2024, spotlight), NeurIPS 2024 - Compression Workshop (spotlight). [paper, code, webpage]

  • Subham S. Sahoo*, Anselm Paulus*, Marin Vlastelica, Vit Musil, Volodymyr Kuleshov, Georg Martius. Backpropagation through Combinatorial Algorithms: Identity with Projection Works. 11th International Conference on Learning Representations (ICLR 2023). [paper, code]


    Phillip Si, Zeyi Chen, Subham S. Sahoo, Yair Schiff, Volodymyr Kuleshov. Semi-Autoregressive Energy Flows: Towards Determinant-Free Training of Normalizing Flows. 40th International Conference on Machine Learning (ICML 2023). [paper]

  • Subham S. Sahoo, Subhashini Venugopalan, Li Li, Rishabh Singh, Patrick Riley. Scaling Symbolic Methods using Gradients for Neural Model Explanation. 9th International Conference on Learning Representations (ICLR 2021). [paper, code]


    Subham S. Sahoo, Ross Anderson, Christian Tjandraatmadja. Local Search on TPUs. Pre-print, 2021. [paper]

  • Subham S. Sahoo. Training Neural Networks using SAT solvers. Pre-print, 2018. [paper]


    Subham S. Sahoo, Christoph H. Lampert, Georg Martius. Learning Equations for Extrapolation and Control. 35th International Conference on Machine Learning (ICML 2018). [paper, code, webpage]

Panels & Talks

  • Panels

    Jul-09-26: Diffusion Language Models vs Autoregressive Language Models at ICML 2026 Workshop on Structured Probabilistic Inference & Generative Modeling.


    Mar-19-25: Research and industry round table on inference and post-training at NVIDIA GTC 2025.

  • Contributed Talks

    Jun-03-26: At the CVPR 2026 tutorial, "The Principles of Diffusion Models: Real-Time Continuous & Discrete Diffusion".


    Apr-28-26: At ICLR 2026 - Workshop on Multimodal Intelligence, "Esoteric Language Models".


    Apr-28-25: At ICLR 2025 - DeLTa Workshop, "The Diffusion Duality".


    Dec-15-24: At NeurIPS 2024 - Compression Workshop, "Diffusion Models with Learned Adaptive Noise". [slides]


    Jul-27-24: At ICML 2024 - AccMLBio Workshop, "Simple and Effective Masked Diffusion Language Models". [slides]

  • Invited Talks

    Oct-23-25: At Radboud University, "The Diffusion Duality".


    Oct-15-25: At Seoul National University, "Foundations of Diffusion Language Models".


    Aug-13-25: At Cerebras, "Esoteric Language Models".


    Aug-6-25: At Meta (FAIR), "Foundations of Diffusion Language Models".


    Jun-19-25: At Google DeepMind, "Esoteric Language Models".


    Apr-2-25: At Databricks, "Diffusion Language Models". [slides]


    Mar-24-25: At Genesis Therapeutics, "Simple and Effective Masked Diffusion Language Models".


    Mar-7-25: At NVIDIA, "Diffusion Language Models". [slides]

Background

Experience

  1. MBZUAI - Institute of Foundation Models, San Francisco, USA.

    2026 — Present

    Sr. Research Scientist (Team Lead).
    Team: Diffusion Language Models.

  2. Cruise, San Francisco, USA.

    2023 (May - July)

    Research intern.
    Team: AV Behaviors.

  3. Max Planck Institute for Intelligent Systems, Tübingen, Germany.

    2021 (Aug - Dec)

    Visiting Researcher.
    Team: Autonomous Learning Group.

  4. Google Research, Mountain View, USA.

    2019 — 2021

    AI Resident.
    Teams: Accelerated Science, Operations Research.

Education

  1. Cornell Tech, New York, USA.

    2022 — 2025

    Ph.D. in Computer Science.
    Thesis: Foundations of Diffusion Language Models.
    Committee: Prof. John Thickstun (chair), Prof. Noah Snavely, Prof. Bart Selman.

  2. Indian Institute of Technology - Kharagpur, India.

    2015 — 2019

    Bachelor's in Electrical Engineering.