EECS Rising Stars 2023




Teodora Baluta

Rigorous Security Analysis for Machine Learning Systems



Research Abstract:

Artists have recently filed several class-action copyright infringement lawsuits against the creators of machine learning models, and these lawsuits could cost billions of dollars. Is there an algorithm that the parties and the court can use to resolve such disputes beyond reasonable doubt? If the training dataset contains private or sensitive information, how can one distinguish between generalization (intended) and memorization (unintended)? My research aims to answer such questions with rigorous security analysis for machine learning systems. Security of machine learning systems is a new area with many gaps, including in defining security vulnerabilities and in developing provably sound procedures for security. My work addresses these gaps. First, we show that a training step in stochastic gradient descent, the de facto training process, is collision-resistant under precise definitions. This provides the building block for the first provable mechanism to resolve disputes over training datasets. Second, we provide new, useful abstractions for stochastic gradient descent through the lens of causality. Lastly, we give the first sound procedures for verifying statistical properties such as robustness, fairness, and susceptibility to poisoning of the training data. Many open questions remain, and they motivate the need for better security definitions and analysis from first principles.

Bio:

Teodora Baluta is a Ph.D. candidate in Computer Science at the National University of Singapore. Teodora works on security problems that are both algorithmic and practically relevant. Her thesis is on rigorous security analysis for machine learning systems. She focuses on foundational aspects of security in machine learning: precise vulnerability definitions, models for reasoning about the training process, and sound procedures for statistical verifiability. Her research has been recognized by the Google PhD Fellowship, EECS Rising Stars, and the Dean’s Graduate Research Excellence Award and President’s Graduate Fellowship at NUS. She interned at Google Brain, working in the Learning for Code team. Her work has been published at security (CCS, NDSS), programming languages and verification (OOPSLA, SAT), and software engineering (ICSE, ESEC/FSE) conferences. Teodora is currently on the job market for academic positions starting Fall 2024.