EECS Rising Stars 2023

Sarah Wiegreffe

Towards Transparent Language Models



Research Abstract:

Recent language models have attracted much attention not only for their major successes but also for their subtler yet still plentiful failures. My research argues that transparency into how language models make predictions can rectify these failures and increase model utility in a reliable way. In today's fast-changing NLP landscape, techniques for increasing and leveraging transparency must be developed both for open-source models and for black-box models behind an API. I focus on producing textual explanations that are meaningful to human users under two definitions of meaning (faithfulness and human acceptability) and on building test suites for measuring each. I analyze textual explanations produced by deep learning models and propose methods to improve their quality. I additionally focus on how improved transparency in both the open-source and black-box settings can increase language model performance on downstream tasks such as commonsense reasoning and multi-task question answering.

Bio:

Sarah Wiegreffe is a postdoctoral researcher at the Allen Institute for AI (AI2), working on the Aristo project. She also holds a courtesy appointment in the Allen School of Computer Science and Engineering at the University of Washington. Her research focuses on understanding how language models make predictions in an effort to make them more transparent to human users. She received her PhD from Georgia Tech in 2022, advised by Professor Mark Riedl; during her PhD she interned at Google and AI2 and won the AI2 Outstanding Intern Award. She frequently serves on conference program committees and received an Outstanding Area Chair Award at ACL 2023.