publications

publications by category in reverse chronological order. generated by jekyll-scholar.

2025

  1. Under Review
    Making Sense of LLM Benchmarks from the Performance Matrices Alone
    Nishant Subramani*, Alfredo Gomez*, and Mona T. Diab
    Preprint, 2025
  2. Under Review
    LLM Microscope: What Model Internals Reveal About Answer Correctness and Context Use
    Jiarui Liu*, Jivitesh Jain*, Mona T. Diab, and Nishant Subramani
    Preprint, 2025
  3. Under Review
    Model Internal Sleuthing: Finding Lexical Identity and Inflectional Morphology in Modern Language Models
    Michael Li and Nishant Subramani
    Preprint, 2025
  4. Under Review
    Personal Information Parroting in Language Models
    Nishant Subramani, Kshitish Ghate, and Mona Diab
    Preprint, 2025
  5. NAACL
    MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools
    Nishant Subramani, Jason Eisner, Justin Svegliato, Benjamin Van Durme, Yu Su, and Sam Thomson
    In Proceedings of NAACL, 2025

2024

  1. ACL
    Best Paper
    OLMo: Accelerating the Science of Language Models
    Dirk Groeneveld, Iz Beltagy, Evan Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, and 31 more authors
    In Proceedings of ACL, 2024
  2. ACL
    Best Resource Paper
    Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
    Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Jha, and 24 more authors
    In Proceedings of ACL, 2024
  3. NAACL
    Evaluating Personal Information Parroting in Language Models
    Nishant Subramani, Kshitish Ghate, and Mona Diab
    2024

2023

  1. EMNLP
    Extended Abstract
    Robust Tooling and New Resources for Large Language Model Evaluation via Catwalk
    Kyle Richardson, Ian Magnusson, Oyvind Tafjord, Akshita Bhagia, Iz Beltagy, Arman Cohan, Pradeep Dasigi, Jesse Dodge, Dirk Groeneveld, Yuling Gu, Tushar Harsh Jha, and Nishant Subramani
    2023
  2. ACL
    Detecting Personal Information in Training Corpora: an Analysis
    Nishant Subramani, Sasha Luccioni, Jesse Dodge, and Margaret Mitchell
    In Proceedings of the TrustNLP Workshop, 2023

2022

  1. EMNLP
    Don’t Say What You Don’t Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search
    Daniel King, Zejiang Shen, Nishant Subramani, Daniel S. Weld, Iz Beltagy, and Doug Downey
    In Proceedings of the GEM Workshop, 2022
  2. EMNLP
    GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
    Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bernd Bohnet, Bingsheng Yao, Bryan Wilie, and 65 more authors
    In Proceedings of EMNLP System Demonstrations, 2022
  3. ACL
    Extracting Latent Steering Vectors from Pretrained Language Models
    Nishant Subramani, Nivedita Suresh, and Matthew Peters
    In Findings of ACL, 2022
  4. BigScience Workshop
    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
    Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, and 379 more authors
    ArXiv, 2022
  5. FAccT
    Data Governance in the Age of Large-Scale Data-Driven Language Technology
    Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers, Maraim Masoud, Valentin Danchev, Samson Tan, Alexandra Sasha Luccioni, Nishant Subramani, Isaac Johnson, Gerard Dupont, Jesse Dodge, and 8 more authors
    In Proceedings of FAccT, 2022
  6. TACL
    Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
    Julia Kreutzer, Isaac Caswell, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, and 40 more authors
    TACL, 2022

2021

  1. ACL
    The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
    Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Anuoluwapo Aremu, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna-Adriana Clinciu, Dipanjan Das, Kaustubh Dhole, Wanyu Du, Esin Durmus, and 44 more authors
    In Proceedings of the GEM Workshop, 2021
  2. NeurIPS
    Natural Adversarial Objects
    Felix Lau, Nishant Subramani, Sasha Harrison, Aerin Kim, Elliot Branson, and Rosanne Liu
    DataCentricAI Workshop, 2021

2020

  1. NeurIPS
    A Survey of Deep Learning Approaches for OCR and Document Understanding
    Nishant Subramani, Alexandre Matton, Malcolm Greaves, and Adrian Lam
    MLRSA Workshop, 2020
  2. arXiv
    Discovering Useful Sentence Representations from Large Pretrained Language Models
    Nishant Subramani and Nivedita Suresh
    ArXiv, 2020
  3. AAAI
    Learning Efficient Representations for Fake Speech Detection
    Nishant Subramani and Delip Rao
    In Proceedings of AAAI, 2020

2019

  1. NeurIPS
    Can Unconditional Language Models Recover Arbitrary Sentences?
    Nishant Subramani, Samuel Bowman, and Kyunghyun Cho
    Advances in NeurIPS, 2019

2018

  1. ICML
    PAG2ADMG: An Algorithm for the Complete Causal Enumeration of a Markov Equivalence Class
    Nishant Subramani
    Causal ML Workshop, 2018

2017

  1. AAAI
    Student Abstract
    PAG2ADMG: A Novel Methodology to Enumerate Causal Graph Structures
    Nishant Subramani and Doug Downey
    Proceedings of AAAI, 2017