Computer Vision

ML Applications

Improving Optical Flow on a Pyramid Level

August 22, 2020

Abstract

In this work we review the coarse-to-fine spatial feature pyramid concept, which is used in state-of-the-art optical flow estimation networks to make exploration of the pixel flow search space computationally tractable and efficient. Within an individual pyramid level, we improve the cost volume construction process by departing from a warping- to a sampling-based strategy, which avoids ghosting and hence enables us to better preserve fine flow details. We further amplify the positive effects through a level-specific, loss max-pooling strategy that adaptively shifts the focus of the learning process on under-performing predictions. Our second contribution revises the gradient flow across pyramid levels. The typical operations performed at each pyramid level can lead to noisy, or even contradicting gradients across levels. We show and discuss how properly blocking some of these gradient components leads to improved convergence and ultimately better performance. Finally, we introduce a distillation concept to counteract the issue of catastrophic forgetting during finetuning and thus preserving knowledge over models sequentially trained on multiple datasets. Our findings are conceptually simple and easy to implement, yet result in compelling improvements on relevant error measures that we demonstrate via exhaustive ablations on datasets like Flying Chairs2, Flying Things, Sintel and KITTI. We establish new state-of-the-art results on the challenging Sintel and KITTI 2012 test datasets, and even show the portability of our findings to different optical flow and depth from stereo approaches.

Download the Paper

AUTHORS

Written by

Markus Hofinger

Samuel Rota Bulò

Lorenzo Porzi

Arno Knapitsch

Thomas Pock

Peter Kontschieder

Publisher

European Conference on Computer Vision (ECCV)

Related Publications

June 17, 2019

Computer Vision

DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition | Facebook AI Research

Zheng Shou, Xudong Lin, Yannis Kalantidis, Laura Sevilla-Lara, Marcus Rohrbach, Shih-Fu Chang, Zhicheng Yan

June 17, 2019

June 18, 2019

Computer Vision

Embodied Question Answering in Photorealistic Environments with Point Cloud Perception | Facebook AI Research

Erik Wijmans, Samyak Datta, Oleksandr Maksymets, Abhishek Das, Georgia Gkioxari, Stefan Lee, Irfan Essa, Devi Parikh, Dhruv Batra

June 18, 2019

July 28, 2019

Speech & Audio

Computer Vision

Learning to Optimize Halide with Tree Search and Random Programs | Facebook AI Research

Andrew Adams, Karima Ma, Luke Anderson, Riyadh Baghdadi, Tzu-Mao Li, Michaël Gharbi, Benoit Steiner, Steven Johnson, Kayvon Fatahalian, Frédo Durand, Jonathan Ragan-Kelley

July 28, 2019

June 17, 2019

Computer Vision

Graph-Based Global Reasoning Networks | Facebook AI Research

Yunpeng Chen, Marcus Rohrbach, Zhicheng Yan, Shuicheng Yan, Jiashi Feng, Yannis Kalantidis

June 17, 2019

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.