I am a Predoctoral Young Investigator at the Allen Institute for Artificial Intelligence on the AllenNLP team, where I work with Matt Peters on controllable text generation and steering language models. I also collaborate with Margaret Mitchell on personal information in large web corpora, and I am an NLP researcher affiliated with Masakhane, an open-source, distributed research effort for NLP for African languages.

Previously, I spent time in industry working on controlling language models, document understanding, optical character recognition, fake speech detection, and speech synthesis at several companies, in both applied and research contexts. I completed my M.S. in Computer Science at NYU's Courant Institute in the CILVR group, focusing on deep learning applied to natural language processing and advised by Kyunghyun Cho and Sam Bowman. Before NYU, I completed a B.A. in Statistics and an M.S. in Computer Science at Northwestern University, focusing on machine learning and natural language processing and working with Doug Downey.

Broadly, my research interests are:

  • Natural Language Processing
  • Causality
  • Democratizing NLP
  • Controllable Text Generation
  • Representation Learning
  • Efficient NLP

I closely follow international and club football (soccer), NBA basketball, and professional tennis. I'm a huge supporter of Borussia Dortmund in the German Bundesliga.


  • June 2021 - Present

    Predoctoral Young Investigator at the Allen Institute for AI (AllenNLP team)

  • January 2021 - June 2021

    Predoctoral Resident at Intel’s Intelligent Systems Lab

  • April 2020 - December 2020

    ML Research Scientist at Scale AI

  • July 2019 - January 2020

    Research Scientist at AI Foundation

  • September 2017 - May 2019

    MS in Computer Science (Machine Learning) at New York University

  • March 2017 - August 2017

    Deep Learning Research Intern at Salesforce Research

  • March 2016 - March 2017

    Research Assistant in Deep Learning & NLP at Northwestern University

  • January 2016 - March 2017

    MS in Computer Science at Northwestern University

  • September 2015 - January 2016

    Master’s Exchange Student in Computer Science at ETH Zurich

  • June 2015 - January 2016

    Research Assistant in Biomedical Informatics at Stanford University

  • July 2014 - March 2015

    Research Assistant in Neural Network Language Modeling at Northwestern University

  • September 2013 - March 2017

    BA in Statistics at Northwestern University


Semantic Scholar

Extracting Latent Steering Vectors from Pretrained Language Models

Nishant Subramani, Nivedita Suresh, and Matthew E. Peters


Data Governance in the Age of Large-Scale Data-Driven Language Technology

Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers, Maraim Masoud, Valentin Danchev, Samson Tan, Alexandra Sasha Luccioni, Nishant Subramani, Isaac Johnson, Gérard Dupont, Jesse Dodge, Kyle Lo, Zeerak Talat, Dragomir Radev, Aaron Gokaslan, Somaieh Nikpoor, Peter Henderson, Rishi Bommasani and Margaret Mitchell


Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search

Daniel King, Zejiang Shen, Nishant Subramani, Daniel S. Weld, Iz Beltagy, and Doug Downey


Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets

Julia Kreutzer*, Isaac Caswell* and 47 others (including Nishant Subramani)



We audit web-crawled multilingual datasets and find that many corpora are completely erroneous. Furthermore, we find that for many languages, less than 50% of sentences are of acceptable quality.

The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics

Sebastian Gehrmann and 55 others (including Nishant Subramani)


We introduce a living benchmark for natural language generation, its evaluation, and its metrics. This is the initial release of the benchmark for a shared task at our ACL 2021 workshop.

Discovering Useful Sentence Representations from Large Pretrained Language Models

Nishant Subramani and Nivedita Suresh


We look at how to condition GPT-2 to generate arbitrary sentences without fine-tuning, using only simple optimization. This paves the way for research on universal decoders.

Natural Adversarial Objects

Felix Lau, Nishant Subramani, Alexandra Harrison, Aerin Kim, Elliot Branson, and Rosanne Liu


We create a dataset of difficult natural images to evaluate the robustness of object detection systems, choosing images on which state-of-the-art models are confidently incorrect. We find that although state-of-the-art models have improved on some benchmarks, they perform consistently poorly on this dataset.

A Survey of Deep Learning Approaches for OCR and Document Understanding

Nishant Subramani, Alexandre Matton, Malcolm Greaves, and Adrian Lam


We present a survey on deep learning approaches for optical character recognition and document understanding. We discuss methods for text detection, text transcription, document layout analysis, information extraction, and table understanding.

Learning Efficient Representations for Fake Speech Detection

Nishant Subramani and Delip Rao


In this project, we focus on two questions: 1) how can we build highly accurate yet parameter- and sample-efficient models for fake speech detection, and 2) how can we rapidly adapt detection models to new sources of fake speech?

Can Unconditional Language Models Generate Arbitrary Sentences?

Nishant Subramani, Samuel R. Bowman, and Kyunghyun Cho


We investigate how to condition an unconditional recurrent neural network language model to generate arbitrary sentences.

PAG2ADMG (Statistics Bachelor's Thesis)

Nishant Subramani and Doug Downey



We introduce a novel method that enumerates the full set of causal graphs by converting any partial ancestral graph (PAG) into the set of all acyclic directed mixed graphs (ADMGs) belonging to the Markov equivalence class encoded by the PAG.


Drop me an email if you are interested in collaborating on research or have any questions regarding my projects.