Papers
Currently Under Review
Zhang, Tigges, Biderman, Raginsky, and Ringer. "Can Transformers Learn to Solve Problems Recursively?" [Paper]
Biderman, Prashanth, Sutawika, Schoelkopf, Anthony, Purohit, and Raff. "Emergent and Predictable Memorization in Large Language Models." [Paper]
Belrose, Furman, Smith, Halawi, Ostrovsky, McKinney, Biderman, and Steinhardt. "Eliciting Latent Predictions from Transformers with the Tuned Lens." [Paper]
Ruis, Khan, Biderman, Hooker, Rocktäschel, and Grefenstette. "Large language models are not zero-shot communicators." [Paper]
Ahdritz, Bouatta, Kadyan, et al. (incl. Biderman). "OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization." [Paper] [Code]
Le Scao*, Fan*, Akiki*, et al. (incl. Biderman*). "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model." [Paper] [Model]
2023
Yong, Schoelkopf, Muennighoff, et al. (incl. Biderman). "BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting." Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. 2023. [Paper]
Muennighoff, Wang, Sutawika, et al. (incl. Biderman). "Crosslingual Generalization through Multitask Finetuning." Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. 2023. [Paper] [Models] [Code]
Biderman*, Schoelkopf*, et al. "Pythia: A Suite for Analyzing Large Language Models across Training and Scaling." Proceedings of the International Conference on Machine Learning (ICML). 2023. Oral Presentation. [Paper] [Models]
Alam, Raff, Biderman, Oates, and Holt. "Recasting Self-Attention with Holographic Reduced Representations." Proceedings of the International Conference on Machine Learning (ICML). 2023. [Paper] [Code]
Srivastava, Rastogi, Rao, et al. (incl. Biderman). "Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models." Transactions on Machine Learning Research. 2023. [Paper] [Codebase]
2022
Laurençon, Saulnier, Wang, et al. (incl. Biderman). "The BigScience ROOTS Corpus: A 1.6 TB Composite Multilingual Dataset." Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks. 2022. [Paper] [Exploratory Tool]
Fries, Weber, Seelam, et al. (incl. Biderman). "BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing." Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks. 2022. [Paper] [Codebase]
Phang, Bradley, Gao, Castricato, and Biderman. "EleutherAI: Going Beyond 'Open Science' to 'Science in the Open'." Proceedings of the Workshop on Broadening Research Collaborations in ML @ NeurIPS. 2022. [Paper]
Crowson*, Biderman*, et al. "VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance." In Proceedings of the European Conference on Computer Vision (ECCV). 2022. [Paper] [Demo] [Code]
Biderman and Raff. "Fooling MOSS Detection with Pretrained Language Models." In Proceedings of the ACM International Conference on Information and Knowledge Management (CIKM). 2022. [Paper]
McMillan-Major, Alyafeai, Biderman, et al. "Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources." arXiv. 2022. [Paper]
Jernite, Nguyen, Biderman, et al. "Data Governance in the Age of Large-Scale Data-Driven Language Technology." In Proceedings of the ACM Conference on Fairness, Accountability, and Transparency (FAccT). 2022. [Paper]
Black*, Biderman*, Hallahan*, et al. "GPT-NeoX-20B: An Open-Source Autoregressive Language Model." In Proceedings of the ACL Workshop on Challenges & Perspectives in Creating Large Language Models. 2022. [Paper] [Model and Code]
Hesslow*, Le Scao*, Saulnier*, et al. (incl. Biderman). "What Language Model to Train if You Have One Million GPU Hours?" In Proceedings of the ACL Workshop on Challenges & Perspectives in Creating Large Language Models. 2022. [Paper]
Talat, Névéol, Biderman, et al. "You Reap What You Sow: On the Challenges of Bias Evaluation under Multilingual Settings." In Proceedings of the ACL Workshop on Challenges & Perspectives in Creating Large Language Models. 2022. [Paper]
Kreutzer*, Caswell*, et al. (incl. Biderman). "Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets." In Transactions of the Association for Computational Linguistics (TACL). 2022. [Paper]
Biderman, Bicheno, and Gao. "Datasheet for the Pile." arXiv. 2022. [Paper]
Sanh*, Webson*, Raffel*, Bach*, et al. (incl. Biderman). "Multitask Prompted Training Enables Zero-Shot Task Generalization." In Proceedings of the Tenth International Conference on Learning Representations (ICLR). 2022. [Paper] [Model]
2021
Matiana*, Smith*, Teehan*, Castricato*, Biderman*, Gao, and Frazier. "Cut the CARP: Fishing for zero-shot story evaluation." arXiv. 2021. [Paper]
Alcaide, Biderman, Telenti, and Maher. "Massively Parallel Natural Extension of Reference Frame for Efficient Internal to Cartesian Conversion." In the Journal of Computational Chemistry. 2021. [Paper] [Code]
Castricato, Biderman, Thue, and Cardona-Rivera. "Towards a Model-Theoretic View of Narratives." In Proceedings of the Third NAACL Workshop on Narrative Understanding. 2021. [Paper]
Churchill, Biderman, and Herrick. "Magic: the Gathering is Turing Complete." In Proceedings of the 10th International Conference on Fun with Algorithms (FUN). 2021. [Paper] [Demo]
2020 and older
Gao, Biderman, Black, Golding, Hoppe, Foster, Phang, He, Thite, Nabeshima, Presser, and Leahy. "The Pile: An 800GB Dataset of Diverse Text for Language Modeling." arXiv. 2020. [Paper] [Datasheet] [Website] [Model]
Biderman and Scheirer. "Pitfalls in Machine Learning Research: Reexamining the Development Cycle." In Proceedings of the NeurIPS Workshop "I Can't Believe It's Not Better!", PMLR. 2020. [Paper]
Biderman. "Magic: the Gathering is as Hard as Arithmetic." arXiv. 2020. [Paper]
Biderman. "Neural Networks on Groups." arXiv. 2019. [Paper]
Biderman*, Cuddy*, Li*, and Song*. "The Sensitivity of k-Uniform Hypergraph Properties." In the 3rd Annual University of Chicago Undergraduate Research Symposium. 2016. [Paper]
Talks
Non-Archival Conferences
Masad, Biderman, Shishkoff, and Baird. "Predicting Crisis Behavior with Reinforcement Learning." The Military Operations Research Society's Emerging Techniques Forum. 2019. [Slides]
Awarded the Eugene P. Visco Prize for best research by a junior analyst.
Masad, Biderman, Shishkoff, and Baird. "Reinforcement Learning in Conflict Escalation Games." Poster at the 35th Annual Meeting of the Society for Political Methodology. 2018. [Abstract] [Poster]
Biderman, Masad, and Lawson. "How to be Wrong, but Useful: A Case Study on Tool Selection in Social Network Analysis." The 2nd Annual North American Social Networks Conference. 2018. [Slides]
Invited Talks
"GPT-NeoX-20B: the Road to Open NLP Research." AI Sweden NLP Seminar. 2022. [Slides]
"Replication as a Security Threat: the Role of Open Source Research in AI Security." The AI Village @ DEF CON. 2021. [Slides]
Moderator for the AI Village's "Ethics & Bias Panel." The AI Village @ DEF CON. 2020. [Video]
"Verifiable Computation: How to Securely and Correctly Execute Code on the Cloud.” DC DevFest. 2019. [Slides] [Video]
* indicates co-lead authorship.