About me

I'm a Ph.D. candidate at Cornell Tech in NYC, advised by Prof. Volodymyr Kuleshov. My research focuses on Diffusion Language Models — faster and more controllable alternatives to traditional LLMs. I am currently on the industry job market.

Previously: IIT Kharagpur (Bachelor's, EE major); Google Research, Mountain View.

Projects

Select Papers

  • Subham S. Sahoo, Justin Deschenaux, Aaron Gokaslan, Guanghan Wang, Justin Chiu, Volodymyr Kuleshov. The Diffusion Duality. Pre-print 2025, ICLR 2025 - DeLTa Workshop (oral). [paper, code, project]

  • Subham S. Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov. Simple and Effective Masked Diffusion Language Models. 38th Conference on Neural Information Processing Systems (NeurIPS 2024), ICML 2024 - AccMLBio Workshop (spotlight). [paper, code, project]

  • Subham S. Sahoo, Aaron Gokaslan, Chris De Sa, Volodymyr Kuleshov. Diffusion Models With Learned Adaptive Noise. 38th Conference on Neural Information Processing Systems (NeurIPS 2024, spotlight), NeurIPS 2024 - Compression Workshop (spotlight). [paper, code, project]

  • Subham S. Sahoo*, Anselm Paulus*, Marin Vlastelica, Vit Musil, Volodymyr Kuleshov, Georg Martius. Backpropagation through Combinatorial Algorithms: Identity with Projection Works. 11th International Conference on Learning Representations (ICLR 2023). [paper, code]

  • Subham S. Sahoo, Christoph H. Lampert, Georg Martius. Learning Equations for Extrapolation and Control. 35th International Conference on Machine Learning (ICML 2018). [paper, code, project]

News

  • Mar-24-25: Invited talk at Genesis Therapeutics on Diffusion Language Models.

  • Mar-19-25: Invited to a research/industry round table on inference and post-training at Nvidia GTC 2025.

  • Mar-7-25: Invited talk at Nvidia on Diffusion Language Models. [slides]

  • Mar-6-25: DUO has been accepted to the ICLR 2025 - DeLTa Workshop as an oral!

  • Feb-11-25: BD3-LM and UDLM accepted at ICLR 2025! BD3-LM has been accepted as an oral!

  • Dec-10-24: MDLM and MuLAN accepted at NeurIPS 2024! MuLAN was accepted as a spotlight!

  • Oct-11-24: Passed my candidacy exam!

  • Jul-27-24: Presented MDLM as a spotlight at the ICML 2024 AccMLBio Workshop!

Papers

  • Subham S. Sahoo, Justin Deschenaux, Aaron Gokaslan, Guanghan Wang, Justin Chiu, Volodymyr Kuleshov. The Diffusion Duality. Pre-print 2025, ICLR 2025 - DeLTa Workshop (oral). [paper, code, project]

  • Guanghan Wang, Yair Schiff, Subham S. Sahoo, Volodymyr Kuleshov. Remasking Discrete Diffusion Models with Inference-Time Scaling. Pre-print 2025, ICLR 2025 - DeLTa Workshop. [paper, code, project]

  • Marianne Arriola, Subham S. Sahoo, Aaron Gokaslan, Zhihan Yang, Zhixuan Qi, Jiaqi Han, Justin Chiu, Volodymyr Kuleshov. Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models. 13th International Conference on Learning Representations (ICLR 2025, oral). [paper, code, project]

  • Yair Schiff*, Subham S. Sahoo*, Hao Phung*, Guanghan Wang*, Sam Boshar, Hugo Dalla-torre, Bernardo P de Almeida, Alexander M Rush, Thomas Pierrot, Volodymyr Kuleshov. Simple Guidance Mechanisms for Discrete Diffusion Models. 13th International Conference on Learning Representations (ICLR 2025). [paper, code, project]

  • Subham S. Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov. Simple and Effective Masked Diffusion Language Models. 38th Conference on Neural Information Processing Systems (NeurIPS 2024), ICML 2024 - AccMLBio Workshop (spotlight). [paper, code, project]

  • Subham S. Sahoo, John X. Morris, Aaron Gokaslan, Srijeeta Biswas, Vitaly Shmatikov, Volodymyr Kuleshov. Gradient-Free Classifier-Based Guidance for Diffusion Models. Under review, 2024.

  • Subham S. Sahoo, Aaron Gokaslan, Chris De Sa, Volodymyr Kuleshov. Diffusion Models With Learned Adaptive Noise. 38th Conference on Neural Information Processing Systems (NeurIPS 2024, spotlight), NeurIPS 2024 - Compression Workshop (spotlight). [paper, code, project]

  • Subham S. Sahoo*, Anselm Paulus*, Marin Vlastelica, Vit Musil, Volodymyr Kuleshov, Georg Martius. Backpropagation through Combinatorial Algorithms: Identity with Projection Works. 11th International Conference on Learning Representations (ICLR 2023). [paper, code]

  • Phillip Si, Zeyi Chen, Subham S. Sahoo, Yair Schiff, Volodymyr Kuleshov. Semi-Autoregressive Energy Flows: Towards Determinant-Free Training of Normalizing Flows. 40th International Conference on Machine Learning (ICML 2023). [paper]

  • Subham S. Sahoo, Subhashini Venugopalan, Li Li, Rishabh Singh, Patrick Riley. Scaling Symbolic Methods using Gradients for Neural Model Explanation. 9th International Conference on Learning Representations (ICLR 2021). [paper, code]

  • Subham S. Sahoo, Ross Anderson, Christian Tjandraatmadja. Local Search on TPUs. Pre-print, 2021. [paper]

  • Subham S. Sahoo. Training Neural Networks using SAT Solvers. Pre-print, 2018. [paper]

  • Subham S. Sahoo, Christoph H. Lampert, Georg Martius. Learning Equations for Extrapolation and Control. 35th International Conference on Machine Learning (ICML 2018). [paper, code, project]

Panels & Talks

  • Panels

    Mar-19-25: Research/industry round table on inference and post-training at Nvidia GTC 2025.

  • Invited Talks

    Apr-2-25: At Databricks, "Simple and Effective Masked Diffusion Language Models".


    Mar-24-25: At Genesis Therapeutics, "Simple and Effective Masked Diffusion Language Models".


    Mar-7-25: At Nvidia, "Diffusion Language Models". [slides]

  • Contributed Talks

    Apr-28-25: At ICLR 2025 DeLTa Workshop, "The Diffusion Duality".


    Dec-15-24: At NeurIPS 2024 Compression Workshop, "Diffusion Models with Learned Adaptive Noise". [slides]


    Jul-27-24: At ICML 2024 AccMLBio Workshop, "Simple and Effective Masked Diffusion Language Models". [slides]

Background

Education

  1. Cornell Tech, New York, USA.

    2022 — present

    Ph.D. in Computer Science.
    Thesis: Diffusion Language Models.
    Committee: Prof. Volodymyr Kuleshov (chair), Prof. Noah Snavely, Prof. Bart Selman.

  2. Indian Institute of Technology - Kharagpur, India.

    2015 — 2019

    Bachelor's in Electrical Engineering.

Experience

  1. Cruise, San Francisco, USA.

    2023 (May - July)

    Research intern.
    Team: AV Behaviors.

  2. Max Planck Institute for Intelligent Systems, Tübingen, Germany.

    2021 (Aug - Dec)

    Visiting Researcher.
    Team: Autonomous Learning Group.

  3. Google Research, Mountain View, USA.

    2019 — 2021

    AI Resident.
    Teams: Accelerated Science, Operations Research.