Computer Vision

ML Applications

Improving Optical Flow on a Pyramid Level

August 22, 2020

Abstract

In this work we review the coarse-to-fine spatial feature pyramid concept, which is used in state-of-the-art optical flow estimation networks to make exploration of the pixel flow search space computationally tractable and efficient. Within an individual pyramid level, we improve the cost volume construction process by departing from a warping- to a sampling-based strategy, which avoids ghosting and hence enables us to better preserve fine flow details. We further amplify the positive effects through a level-specific, loss max-pooling strategy that adaptively shifts the focus of the learning process on under-performing predictions. Our second contribution revises the gradient flow across pyramid levels. The typical operations performed at each pyramid level can lead to noisy, or even contradicting gradients across levels. We show and discuss how properly blocking some of these gradient components leads to improved convergence and ultimately better performance. Finally, we introduce a distillation concept to counteract the issue of catastrophic forgetting during finetuning and thus preserving knowledge over models sequentially trained on multiple datasets. Our findings are conceptually simple and easy to implement, yet result in compelling improvements on relevant error measures that we demonstrate via exhaustive ablations on datasets like Flying Chairs2, Flying Things, Sintel and KITTI. We establish new state-of-the-art results on the challenging Sintel and KITTI 2012 test datasets, and even show the portability of our findings to different optical flow and depth from stereo approaches.

Download the Paper

AUTHORS

Written by

Markus Hofinger

Samuel Rota Bulò

Lorenzo Porzi

Arno Knapitsch

Thomas Pock

Peter Kontschieder

Publisher

European Conference on Computer Vision (ECCV)

Related Publications

May 26, 2026

Human & Machine Intelligence

Theory

Misalignment Between Backpropagation and the Hierarchy of Brain Responses to Images

Valentin Wyart, Huy V. Vo, Jean Remi King, Josephine Raugel, Jérémy Rapin, Marc Szafraniec, Max Seitzer, Patrick Labatut, Piotr Bojanowski

May 26, 2026

May 19, 2026

Human & Machine Intelligence

EgoBabyVLM: Benchmarking Cross-Modal Learning from Naturalistic Egocentric Video Data

Alvin W. M. Tan, Nicolas Hamilakis, Manel Khentout, Sho Tsuji, Balázs Kégl, Michael C. Frank, Angel Villar Corrales, Charles-Eric Saint-James, Dongyan Lin, Emmanuel Dupoux, Jiayi Shen, Juan Pino, Mahi Luthra, Martin Gleize, Phillip Rust, Rashel Moritz, Sheila Krogh-Jespersen, Surya Parimi, Tom Fizycki, Vanessa Stark, Yosuke Higuchi, Youssef Benchekroun

May 19, 2026

May 12, 2026

Human & Machine Intelligence

NeuralSet: A High-Performing Python Package for Neuro-AI

Corentin Bel, Linnea Evanson, Julien Gadonneix, Andrea Santos Revilla, Mingfang (Lucy) Zhang, Julie Bonnaire, Charlotte Caucheteux, Alexandre Défossez, Théo Desbordes, Pablo Diego-Simón, Shubh Khanna, Juliette Millet, Pierre Orhan, Saarang Panchavati, Antoine Ratouchniak, Alexis Thual, Hubert Jacob Banville, Jarod Levy, Jean Remi King, Josephine Raugel, Jérémy Rapin, Katelyn Begany, Marlene Careil, Simon Dahan, Sophia Houhamdi, Stéphane d'Ascoli, Teon Brooks, Yohann Benchetrit

May 12, 2026

February 27, 2026

Human & Machine Intelligence

Unified Vision–Language Modeling via Concept Space Alignment

Yifu Qiu, Holger Schwenk, Paul-Ambroise Duquenne

February 27, 2026

June 11, 2019

Computer Vision

ELF OpenGo: An Analysis and Open Reimplementation of AlphaZero | Facebook AI Research

Yuandong Tian, Jerry Ma, Qucheng Gong, Shubho Sengupta, Zhuoyuan Chen, James Pinkerton, Larry Zitnick

June 11, 2019

April 30, 2018

NLP

Computer Vision

Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent | Facebook AI Research

Zhilin Yang, Saizheng Zhang, Jack Urbanek, Will Feng, Alexander H. Miller, Arthur Szlam, Douwe Kiela, Jason Weston

April 30, 2018

October 10, 2016

Speech & Audio

Computer Vision

Polysemous Codes | Facebook AI Research

Matthijs Douze, Hervé Jégou, Florent Perronnin

October 10, 2016

June 18, 2018

Speech & Audio

Computer Vision

Low-shot learning with large-scale diffusion | Facebook AI Research

Matthijs Douze, Arthur Szlam, Bharath Hariharan, Hervé Jégou

June 18, 2018

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.