Learning useful representations for shifting tasks and distributions

July 27, 2023

Abstract

Does the dominant approach to learning representations (as a side effect of optimizing an expected cost for a single training distribution) remain a good approach when we are dealing with multiple distributions? Our thesis is that such scenarios are better served by representations that are richer than those obtained with a single optimization episode. We support this thesis with simple theoretical arguments and with experiments utilizing an apparently naïve ensembling technique: concatenating the representations obtained from multiple training episodes using the same data, model, algorithm, and hyper-parameters, but different random seeds. These independently trained networks perform similarly. Yet, in a number of scenarios involving new distributions, the concatenated representation performs substantially better than an equivalently sized network trained with a single training run. This proves that the representations constructed by multiple training episodes are in fact different. Although their concatenation carries little additional information about the training task under the training distribution, it becomes substantially more informative when tasks or distributions change. Meanwhile, a single training episode is unlikely to yield such a redundant representation because the optimization process has no reason to accumulate features that do not incrementally improve the training performance.
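To make the ensembling technique concrete, the following is a minimal PyTorch sketch, not the authors' implementation: the synthetic dataset, the small `Net` architecture, the training loop, and the seed count are illustrative placeholders. It trains several identical networks that differ only in their random seed, concatenates their penultimate-layer features, and fits a linear probe on the concatenated representation.

```python
# Minimal sketch of the concatenation idea (assumed details: the dataset,
# architecture, and probe setup are placeholders, not the paper's setup).
import torch
import torch.nn as nn


def make_data(n=2048, d=32, seed=0):
    """Synthetic stand-in for a real dataset: a nonlinear binary task."""
    g = torch.Generator().manual_seed(seed)
    x = torch.randn(n, d, generator=g)
    y = (x[:, 0] * x[:, 1] > 0).long()
    return x, y


class Net(nn.Module):
    """Small network whose penultimate activations are the representation."""

    def __init__(self, d=32, h=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Linear(d, h), nn.ReLU(), nn.Linear(h, h), nn.ReLU()
        )
        self.head = nn.Linear(h, 2)

    def forward(self, x):
        return self.head(self.features(x))


def train(model, x, y, steps=500, lr=1e-2):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        opt.step()
    return model


x, y = make_data()

# Several training episodes: same data, model, algorithm, and
# hyper-parameters; only the random seed differs between episodes.
nets = []
for seed in range(5):
    torch.manual_seed(seed)
    nets.append(train(Net(), x, y))

# Concatenate the frozen representations from all episodes.
with torch.no_grad():
    feats = torch.cat([net.features(x) for net in nets], dim=1)

# Fit a linear probe on the concatenated representation; in the scenarios
# studied in the paper, the probe would be trained and evaluated on data
# from a new task or distribution.
probe = train(nn.Linear(feats.shape[1], 2), feats, y)
```

Although each episode reaches similar training performance, the concatenated features are not redundant copies of one another, which is what makes the probe more robust under distribution shift.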

AUTHORS

Leon Bottou

Jianyu Zhang

Publisher

ICML

Research Topics

Core Machine Learning

