December 8, 2018
We derive a simple and model-independent formula for the change in the generalization gap due to a gradient descent update. We then compare the change in the test error for stochastic gradient descent to the change in test error from an equivalent number of gradient descent updates and show explicitly that stochastic gradient descent acts to regularize generalization error by decorrelating nearby updates. These calculations depends on the details of the model only through the mean and covariance of the gradient distribution, which may be readily measured for particular models of interest. We discuss further improvements to these calculations and comment on possible implications for stochastic optimization.
Written by
February 27, 2025
Pascal Kesseli, Peter O'Hearn, Ricardo Silveira Cabral
February 27, 2025
February 06, 2025
Andros Tjandra, Yi-Chiao Wu, Baishan Guo, John Hoffman, Brian Ellis, Apoorv Vyas, Bowen Shi, Sanyuan Chen, Matt Le, Nick Zacharov, Carleigh Wood, Ann Lee, Wei-Ning Hsu
February 06, 2025
February 06, 2025
Jarod Levy, Mingfang (Lucy) Zhang, Svetlana Pinet, Jérémy Rapin, Hubert Jacob Banville, Stéphane d'Ascoli, Jean Remi King
February 06, 2025
February 06, 2025
Mingfang (Lucy) Zhang, Jarod Levy, Stéphane d'Ascoli, Jérémy Rapin, F.-Xavier Alario, Pierre Bourdillon, Svetlana Pinet, Jean Remi King
February 06, 2025
April 08, 2021
Caner Hazirbas, Joanna Bitton, Brian Dolhansky, Jacqueline Pan, Albert Gordo, Cristian Canton Ferrer
April 08, 2021
April 30, 2018
Tomer Galanti, Lior Wolf, Sagie Benaim
April 30, 2018
April 30, 2018
Yedid Hoshen, Lior Wolf
April 30, 2018
December 11, 2019
Eliya Nachmani, Lior Wolf
December 11, 2019
Foundational models
Our approach
Latest news
Foundational models