EECS Rising Stars 2023




Yiru Chen

Interactive Data Interface Generation and Optimization



Research Abstract:

Interactive data interfaces (e.g., visualizations, dashboards) are critical in nearly every stage of data management. I believe interactive data interfaces, rather than traditional programming, will be the primary method to empower the next billion data users, and have the potential to be as ubiquitous as web pages and mobile applications are today. However, the number and effectiveness of today’s data interfaces are a mere fraction of what is possible. The primary reason is that current tools do not address the unique challenges of designing data interfaces and ensuring that they are scalable and highly responsive. As a result, even interfaces with basic functionality require vast amounts of resources and expertise to build. With this in mind, my research pursues three directions that drastically simplify how interactive data interfaces are designed and how their backend systems are developed, and that expand what interfaces are capable of.

The first direction obviates the need for manual interface design and development. Even data scientists and programmers find it challenging to turn their analyses into usable interfaces, because doing so requires considerable expertise and trial and error to find appropriate visualizations, interactions, and layouts to support the underlying analysis. To address this, I developed the PRECISION INTERFACE 2 (PI2) system, which automatically generates fully interactive data interfaces from analysis queries. This project has developed a formal model of the problem of mapping queries to an interactive interface. As sources of analysis queries, I have explored sequences of SQL queries from database logs, analyses in notebooks, and query models in DBT; alternatively, the queries can be generated by a large language model.

The second direction automatically optimizes the backend system design to guarantee responsiveness. Users care about interactivity and can detect even milliseconds of interaction delay. As a result, designers must make complex trade-offs among the interface design, the level of responsiveness for each interaction, the data size, and the resources needed to guarantee those levels of responsiveness. I have proposed a data interface grammar (DIG) to model the correspondence between interface interactions and queries to the backend system. Based on DIG, my ongoing project, Physical Visualization Design, aims to automatically surface the interactivity achievable with the available system resources and to deploy the optimizations needed to achieve the designer’s desired level of responsiveness.

The third direction expands interface capabilities from data presentation to explanation. When users look at a data presentation, they want to understand why the data looks the way it does. I developed the DNI system to inspect reinforcement learning models and TSExplain to explain time series data.

Bio:

Yiru Chen is a Ph.D. candidate in the Computer Science Department at Columbia University, advised by Professor Eugene Wu. Her research goal is to help people at all technical levels make sense of their data quickly. In particular, Yiru studies how to automatically generate interactive data interfaces from users' analysis queries and how to optimize the interface backend to guarantee responsiveness. Before her Ph.D., she earned her Bachelor's degree in Computer Science and Economics from Peking University in 2018. She was a research intern at Microsoft Research. She received a Google Ph.D. Fellowship in 2021 and has been invited to participate in the EECS Rising Stars Workshop 2023.