June 19, 2023
The size of deep neural networks has grown exponentially in recent years. Unfortunately, hardware devices have not kept pace with the rapidly increasing memory requirements. To cope with this, researchers have proposed various techniques including spilling, rematerialization, reduced precision training, model pruning, and so on. However, these approaches suffer from various limitations, such as increasing training time, affecting model accuracy, or requiring extensive manual modifications to the neural networks. We present MODeL, an algorithm that optimizes the lifetime and memory location of the tensors used to train neural networks. Our method automatically reduces the memory usage of existing neural networks without any of the drawbacks of other techniques. We formulate the problem as a joint integer linear program (ILP). We present several techniques to simplify the encoding of the problem, and enable our approach to scale to the size of state-of-the-art neural networks using an off-the-shelf ILP solver. We experimentally demonstrate that MODeL only takes seconds to allow the training of neural networks using 30% less memory on average. MODeL is an open-source project available at https://github.com/facebookresearch/model_opt.
Publisher
ICML
Research Topics
June 27, 2024
Chris Cummins, Volker Seeker, Dejan Grubisic, Baptiste Rozière, Jonas Gehring, Gabriel Synnaeve, Hugh Leather
June 27, 2024
June 14, 2024
Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Nas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed Aly, Beidi Chen, Carole-Jean Wu
June 14, 2024
June 07, 2024
Carole-Jean Wu, Bilge Acun, Ramya Raghavendra, Kim Hazelwood
June 07, 2024
November 07, 2023
Jared Fernandez, Jacob Kahn, Clara Na, Yonatan Bisk, Emma Strubell
November 07, 2023
Product experiences
Foundational models
Product experiences
Latest news
Foundational models