publications

Publications by category in reverse chronological order. Generated by jekyll-scholar.

2024

  1. under review
    MICE for CATs: Model-Internal Confidence Estimation for Calibrating Agents with Tools
    Nishant Subramani, Jason Eisner, Justin Svegliato, Benjamin Van Durme, Yu Su, and Sam Thomson
    Preprint, 2024
  2. ACL
    Best Paper
    OLMo: Accelerating the Science of Language Models
    Dirk Groeneveld, Iz Beltagy, Evan Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, and 31 more authors
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Aug 2024
  3. ACL
    Best Resource Paper
    Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
    Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Jha, and 24 more authors
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Aug 2024
  4. TrustNLP @ NAACL
    Evaluating Personal Information Parroting in Language Models
    Nishant Subramani, Kshitish Ghate, and Mona Diab
    Jun 2024

2023

  1. GEM @ EMNLP
    Extended Abstract
    Robust Tooling and New Resources for Large Language Model Evaluation via Catwalk
    Kyle Richardson, Ian Magnusson, Oyvind Tafjord, Akshita Bhagia, Iz Beltagy, Arman Cohan, Pradeep Dasigi, Jesse Dodge, Dirk Groeneveld, Yuling Gu, Tushar Harsh Jha, and Nishant Subramani
    Dec 2023
  2. TrustNLP @ ACL
    Detecting Personal Information in Training Corpora: an Analysis
    Nishant Subramani, Sasha Luccioni, Jesse Dodge, and Margaret Mitchell
    In Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing (TrustNLP 2023), Dec 2023

2022

  1. GEM @ EMNLP
    Don’t Say What You Don’t Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search
    Daniel King, Zejiang Shen, Nishant Subramani, Daniel S. Weld, Iz Beltagy, and Doug Downey
    In Proceedings of the 2nd Workshop on Natural Language Generation, Evaluation, and Metrics (GEM), Dec 2022
  2. EMNLP
    GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
    Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bernd Bohnet, Bingsheng Yao, Bryan Wilie, and 65 more authors
    In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Dec 2022
  3. ACL
    Extracting Latent Steering Vectors from Pretrained Language Models
    Nishant Subramani, Nivedita Suresh, and Matthew Peters
    In Findings of the Association for Computational Linguistics: ACL 2022, May 2022
  4. BigScience Workshop
    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
    Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, and 379 more authors
    arXiv, May 2022
  5. FAccT
    Data Governance in the Age of Large-Scale Data-Driven Language Technology
    Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers, Maraim Masoud, Valentin Danchev, Samson Tan, Alexandra Sasha Luccioni, Nishant Subramani, Isaac Johnson, Gerard Dupont, Jesse Dodge, and 8 more authors
    In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, May 2022
  6. TACL
    Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
    Julia Kreutzer, Isaac Caswell, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, and 40 more authors
    Transactions of the Association for Computational Linguistics, May 2022

2021

  1. GEM @ ACL
    The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
    Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Anuoluwapo Aremu, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna-Adriana Clinciu, Dipanjan Das, Kaustubh Dhole, Wanyu Du, Esin Durmus, and 44 more authors
    In Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021), Aug 2021
  2. DataCentricAI @ NeurIPS
    Natural Adversarial Objects
    Felix Lau, Nishant Subramani, Sasha Harrison, Aerin Kim, Elliot Branson, and Rosanne Liu
    arXiv, Aug 2021

2020

  1. MLRSA @ NeurIPS
    A Survey of Deep Learning Approaches for OCR and Document Understanding
    Nishant Subramani, Alexandre Matton, Malcolm Greaves, and Adrian Lam
    arXiv, Aug 2020
  2. arXiv
    Discovering Useful Sentence Representations from Large Pretrained Language Models
    Nishant Subramani and Nivedita Suresh
    arXiv, Aug 2020
  3. AAAI
    Learning Efficient Representations for Fake Speech Detection
    Nishant Subramani and Delip Rao
    In AAAI Conference on Artificial Intelligence, Aug 2020

2019

  1. NeurIPS
    Can unconditional language models recover arbitrary sentences?
    Nishant Subramani, Samuel Bowman, and Kyunghyun Cho
    Advances in Neural Information Processing Systems, Aug 2019

2018

  1. CausalML @ ICML
    Pag2admg: An Algorithm for the Complete Causal Enumeration of a Markov Equivalence Class
    Nishant Subramani
    International Conference on Machine Learning CausalML Workshop, Aug 2018

2017

  1. AAAI
    Student Abstract
    PAG2ADMG: A Novel Methodology to Enumerate Causal Graph Structures
    Nishant Subramani and Doug Downey
    Proceedings of the AAAI Conference on Artificial Intelligence, Aug 2017