I am a PhD student at CMU in the Language Technologies Institute (LTI), advised by Mona Diab. I am interested in building responsible and controllable NLP systems by understanding the internals of language models, with an eye toward steering their generations in a reliable, trustworthy, and efficient manner. I have experience working on large language model initiatives such as BLOOM and am currently involved in the OLMo initiative at AI2.

Before CMU, I was a predoctoral researcher at the Allen Institute for Artificial Intelligence (AI2) on the AllenNLP team, where I worked with Matt Peters on controllable text generation and steering language models, and collaborated with Margaret Mitchell and Sasha Luccioni on personal information in large web corpora. I am also an NLP researcher affiliated with Masakhane, an open-source, distributed research effort for NLP for African languages. In industry, I have worked on controlling language models, document understanding, optical character recognition, fake speech detection, and speech synthesis at several companies, in both applied and research roles. I completed my MS in Computer Science at NYU's Courant Institute in the CILVR group, focusing on deep learning applied to NLP. Before that, I completed a BA in Statistics and an MS in Computer Science at Northwestern University, focusing on ML and NLP and working with Doug Downey.

Broadly, my research interests are:

  • Natural Language Processing
  • Responsible NLP
  • Causality
  • Controllable Text Generation
  • Model Editing
  • Efficient NLP

I closely follow international and club football (soccer), NBA basketball, and professional tennis. I'm a huge supporter of Borussia Dortmund from the German Bundesliga.

Publications

Semantic Scholar


OLMo: Accelerating the Science of Language Models

Dirk Groeneveld, Iz Beltagy, and 41 others (including Nishant Subramani*)

Paper


Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

Luca Soldaini and 35 others (including Nishant Subramani*)

Paper


Robust Tooling and New Resources for Large Language Model Evaluation via Catwalk

Kyle Richardson, Ian Magnusson, Oyvind Tafjord, ..., Tushar Khot, and Nishant Subramani*

Paper


Detecting Personal Information in Training Corpora: an Analysis

Nishant Subramani*, Alexandra Sasha Luccioni*, Jesse Dodge, and Margaret Mitchell

Paper


Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search

Daniel King, Zejiang Shen, Nishant Subramani, Daniel S. Weld, Iz Beltagy, and Doug Downey

Paper


GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

Sebastian Gehrmann and 50+ others (including Nishant Subramani)

Paper


Extracting Latent Steering Vectors from Pretrained Language Models

Nishant Subramani, Nivedita Suresh, and Matthew E. Peters

Paper


BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao and 300+ others (including Nishant Subramani)

Paper


Data Governance in the Age of Large-Scale Data-Driven Language Technology

Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers, Maraim Masoud, Valentin Danchev, Samson Tan, Alexandra Sasha Luccioni, Nishant Subramani, Isaac Johnson, Gérard Dupont, Jesse Dodge, Kyle Lo, Zeerak Talat, Dragomir Radev, Aaron Gokaslan, Somaieh Nikpoor, Peter Henderson, Rishi Bommasani and Margaret Mitchell

Paper


Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets

Julia Kreutzer*, Isaac Caswell*, and 47 others (including Nishant Subramani)

Paper

We audit web-crawled multilingual datasets and find that many corpora are completely erroneous. For many languages, less than 50% of sentences are of acceptable quality.


The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics

Sebastian Gehrmann and 55 others (including Nishant Subramani)

Paper


Discovering Useful Sentence Representations from Large Pretrained Language Models

Nishant Subramani and Nivedita Suresh

Paper


Natural Adversarial Objects

Felix Lau, Nishant Subramani, Alexandra Harrison, Aerin Kim, Elliot Branson, and Rosanne Liu

Paper


A Survey of Deep Learning Approaches for OCR and Document Understanding

Nishant Subramani, Alexandre Matton, Malcolm Greaves, and Adrian Lam

Paper


Learning Efficient Representations for Fake Speech Detection

Nishant Subramani and Delip Rao

Paper


Can Unconditional Language Models Generate Arbitrary Sentences?

Nishant Subramani, Samuel R. Bowman, and Kyunghyun Cho

Paper


PAG2ADMG (Statistics Bachelor's Thesis)

Nishant Subramani and Doug Downey

Paper


Timeline

  • August 2023 - Present

    PhD Student at Carnegie Mellon University (Language Technologies Institute)

  • June 2021 - October 2023

    Predoctoral Young Investigator at the Allen Institute for AI (AllenNLP team)

  • January 2021 - June 2021

    Predoctoral Resident at Intel’s Intelligent Systems Lab

  • April 2020 - December 2020

    ML Research Scientist at Scale AI

  • July 2019 - January 2020

    Research Scientist at AI Foundation

  • September 2017 - May 2019

    MS in Computer Science (Machine Learning) at New York University

  • March 2017 - August 2017

    Deep Learning Research Intern at Salesforce Research

  • March 2016 - March 2017

    Research Assistant in Deep Learning & NLP at Northwestern University

  • January 2016 - March 2017

    MS in Computer Science at Northwestern University

  • September 2015 - January 2016

    Master’s Exchange Student in Computer Science at ETH Zurich

  • June 2015 - January 2016

    Research Assistant in Biomedical Informatics at Stanford University

  • July 2014 - March 2015

    Research Assistant in Neural Network Language Modeling at Northwestern University

  • September 2013 - March 2017

    BA in Statistics at Northwestern University

Contact

Drop me an email if you are interested in collaborating on research or have any questions regarding my projects.