Graph (or network) structure in data is increasingly used to improve statistical signal processing and machine learning (ML) methods. In this talk, I will discuss several data science projects leveraging graphs that span from theory to applications, from classical optimization to modern deep learning tools, and from physical graphs to abstract and data-dependent graphs.
The first part of the talk focuses on graph regularization, a technique that drives the solutions of an optimization problem to fit the graph. I will discuss trend filtering and matrix factorization with applications in traffic analysis and remote sensing. In particular, I used spectral graph theory and high-dimensional statistics to show that solutions to a particular non-convex graph-regularized problem are provably more accurate, in proportion to how well the observed data aligns with the given graph, mathematically confirming our intuition about learning with graphs.
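As a rough illustration of what graph regularization means in its simplest form, the sketch below denoises a signal on a small graph by adding a Laplacian penalty to a least-squares fit. This is a standard convex (Tikhonov-style) example chosen for brevity, not the trend filtering or non-convex matrix factorization methods from the talk. The function names, the 4-node path graph, and the value of `lam` are all assumptions made for illustration.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    degrees = adjacency.sum(axis=1)
    return np.diag(degrees) - adjacency


def graph_smooth(y, adjacency, lam):
    """Minimize ||x - y||^2 + lam * x^T L x over x.

    Setting the gradient to zero gives the closed form
    x = (I + lam * L)^{-1} y. Larger lam pulls values on
    neighboring nodes closer together.
    """
    laplacian = graph_laplacian(adjacency)
    n = len(y)
    return np.linalg.solve(np.eye(n) + lam * laplacian, y)


# Hypothetical example: a path graph 0 - 1 - 2 - 3 with an oscillating signal.
A = np.array(
    [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ],
    dtype=float,
)
y = np.array([1.0, 0.0, 1.0, 0.0])
x_hat = graph_smooth(y, A, lam=1.0)

# The quadratic form x^T L x measures how much a signal varies across edges.
L = graph_laplacian(A)
roughness_before = y @ L @ y
roughness_after = x_hat @ L @ x_hat
```

In this sketch the regularized solution varies less across edges than the raw observations, which is the sense in which the penalty drives the solution to fit the graph.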
Next, I discuss recent efforts to answer the question, “How can different but related groups help each other learn better from their data?” Graph regularization inspired a new approach to multi-task learning that linearly combines linear estimators, with promising results on an income prediction task where the real-world data is disaggregated by race. I plan to continue this line of research, both by improving the methods and by applying them to health care.
Lastly, I showcase my contributions to the science of science, a field in computational social science that studies the progress of science itself. Specifically, I use graph embedding and transformers to predict which ML topics will be studied together in the near future from a large, time-varying semantic network. I will conclude with a discussion of my future research directions in graph-based learning, with a focus on potential social science applications.
Harlin Lee is a Hedrick Assistant Adjunct Professor at UCLA Mathematics. She received her PhD in Electrical and Computer Engineering and MS in Machine Learning from Carnegie Mellon University in 2021. Prior to that, she earned her BS and MEng in Electrical Engineering and Computer Science from MIT in 2016 and 2017, respectively.
Her research focuses on learning from high-dimensional data supported on structures (such as graphs/networks or low-dimensional subspaces), motivated by applications in healthcare and social science, including data science for social good. She has been recognized with Rising Stars in Data Science (2022), Rising Stars in Computational and Data Sciences (2022), CMU ECE Outstanding Woman in Engineering (2021), and Best Poster Prize at February Fourier Talks (2019).