Over the past several years my work has focused on making cutting-edge AI technologies more widely accessible by supporting the release of several of the world's most powerful generative models, including GPT-NeoX, BLOOM, VQGAN-CLIP, and OpenFold.
Recently my focus has shifted toward building a better understanding of how and why these models work, and what decisions engineers can make to instill desirable behaviors and limit undesirable ones. I am especially excited about mechanistic interpretability research and work investigating the learning dynamics of large language models.
My paper "Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling" was accepted for an oral presentation at ICML! (April 2023)
M.S. in Computer Science, the Georgia Institute of Technology, 2022
B.S. in Mathematics with honors, the University of Chicago, 2016
B.A. in Philosophy, the University of Chicago, 2016