Effective Theory for Online Learning with Structured Data

Type: Statistical & Bio Seminar
Speaker: Dr. Inbar Seroussi
Affiliation: Tel Aviv University
Organizer: Yariv Kafri
Date: 14.01.2024
Time: 12:00 - 13:30
Location: Lidow Nathan Rosen (300)
Abstract: Stochastic gradient descent (SGD) is a fundamental optimization technique in modern machine learning, yet a comprehensive understanding of its exceptional performance remains a challenge. Drawing on the rich history of this problem in statistical physics, which has provided insights into simple neural networks with isotropic Gaussian data, this talk reviews existing results and introduces a theory for SGD in high dimensions. Our theory extends to a broader class of models, accommodating data with general covariance structures and loss functions. We present limiting deterministic dynamics governed by low-dimensional order parameters, applicable to a spectrum of optimization problems, including linear and logistic regression, as well as two-layer neural networks. This framework also reveals the implicit bias in SGD. For each problem, the deterministic equivalent of SGD allows us to derive an equation for the generalization error. Moreover, we establish explicit conditions on the step size, ensuring the convergence and stability of SGD.
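For concreteness, here is a minimal sketch (not the speaker's code) of the online-SGD setting the abstract describes: one fresh sample per step, Gaussian data with a non-isotropic covariance, squared loss, and the population generalization error evaluated directly through the covariance. The dimension, spectrum, teacher vector, and step size below are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 500
# General (non-isotropic) covariance: a power-law spectrum, as an example
# of "structured data"; the talk's theory allows general covariances.
eigs = 1.0 / np.arange(1, d + 1) ** 0.5
Sigma = np.diag(eigs)
w_star = rng.standard_normal(d) / np.sqrt(d)  # teacher (target) weights, illustrative
eta = 0.05                                    # step size, chosen small by hand here;
                                              # the talk derives explicit stability conditions

w = np.zeros(d)
steps = 20_000
for t in range(steps):
    x = rng.standard_normal(d) * np.sqrt(eigs)  # fresh sample each step (online / one-pass)
    y = x @ w_star                               # noiseless teacher label
    grad = (w @ x - y) * x                       # gradient of the squared loss at this sample
    w -= eta * grad                              # online SGD update

# Population (generalization) error: E[(x^T (w - w_star))^2] = (w - w_star)^T Sigma (w - w_star)
delta = w - w_star
gen_err = delta @ (Sigma @ delta)
print(f"generalization error after {steps} steps: {gen_err:.4e}")
```

The quantity printed at the end is the population risk tracked as a function of training time; the deterministic equations discussed in the talk describe the limiting behavior of exactly such curves in high dimensions.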