Nishant Subramani

Model Interpretability Researcher 🔎


Looking for research internships in summer 2025 focused on model interpretability; please reach out!

I am a 2nd-year PhD student 🎓 at CMU in the Language Technologies Institute (LTI), advised by Mona Diab. I have published at most of the major ML and NLP venues and was part of the BLOOM and OLMo open LLM efforts, which led to two best paper awards at ACL 2024 and a GeekWire innovation of the year award. I spent this past summer as a research intern on the Semantic Machines team at Microsoft Research, working with Ben Van Durme and Jason Eisner.

My main research interest is in model interpretability 🔎: the intersection of mechanistic and traditional interpretability. Specifically, I am excited about understanding the internals of models and identifying steerable regions in their latent spaces. I wrote some of the first papers on steering vectors, for LSTM-based models at NeurIPS 2019 and for Transformer-based ones at ACL 2022, and these methods have recently gained popularity. Understanding model internals is essential to building more responsible, controllable, trustworthy, and efficient NLP systems. If you’re interested in or actively working on model interpretability, I’d love to chat; please reach out!

Before CMU, I spent nearly two and a half years as a predoctoral researcher on the AllenNLP team at AI2, where I worked with Matt Peters. Before that, I spent two years in industry as a research scientist at startups, working on NLP, vision, speech, and multimodal applications. I have had the opportunity to work closely with some amazing collaborators at other institutions, including Margaret Mitchell, Sasha Luccioni, Vladlen Koltun, and Doug Downey.


news

Aug 2024 OLMo won an outstanding paper award at ACL 2024!
Aug 2024 Dolma won an outstanding resource paper award at ACL 2024!
Jun 2024 At NAACL in Mexico City 🇲🇽; come say hi!
Jun 2024 In Seattle for the summer 🏔️ - started as a PhD research intern on the semantic machines team at Microsoft Research working with Sam Thomson and Yu Su on calibrating tool-using agents 🤖
May 2024 OLMo and Dolma accepted to the main conference at ACL. See my wonderful coauthors in Bangkok 🇹🇭 in August!
Apr 2024 Evaluating Personal Information Parroting in Language Models has been accepted to TrustNLP! See you in Mexico City 🇲🇽 in June!
Nov 2023 Had a wonderful time giving a talk on Steering Vectors: An Alternative Way to Steer Language Models in Annie En-Shiun Lee’s group at Ontario Tech!
Aug 2023 Started my PhD 🎓 at CMU LTI with Mona Diab on model interpretability 🔎

selected publications

  1. under review
    MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools
    Nishant Subramani, Jason Eisner, Justin Svegliato, Benjamin Van Durme, Yu Su, and Sam Thomson
    Preprint, 2024
  2. ACL
    Best Paper
    OLMo: Accelerating the Science of Language Models
    Dirk Groeneveld, Iz Beltagy, Evan Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, and 31 more authors
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Aug 2024
  3. ACL
    Best Resource Paper
    Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
    Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Jha, and 24 more authors
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Aug 2024
  4. ACL
    Extracting Latent Steering Vectors from Pretrained Language Models
    Nishant Subramani, Nivedita Suresh, and Matthew Peters
    In Findings of the Association for Computational Linguistics: ACL 2022, May 2022
  5. BigScience Workshop
    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, and 379 more authors
    ArXiv, May 2022
  6. NeurIPS
    Can unconditional language models recover arbitrary sentences?
    Nishant Subramani, Samuel Bowman, and Kyunghyun Cho
Advances in Neural Information Processing Systems, Dec 2019