SPEECH & AUDIO

PerspectiveNet: A Scene-consistent Image Generator for New View Synthesis in Real Indoor Environments

November 13, 2019

Abstract

Given a set of reference RGBD views of an indoor environment and a new viewpoint, our goal is to predict the view from that location. Prior work on new-view generation has predominantly focused on significantly constrained scenarios, typically involving artificially rendered views of isolated CAD models. Here we tackle a much more challenging version of the problem. We devise an approach that exploits known geometric properties of the scene (per-frame camera extrinsics and depth) in order to warp reference views into the new ones. The defects in the generated views are handled by a novel RGBD inpainting network, PerspectiveNet, that is fine-tuned for a given scene in order to obtain images that are geometrically consistent with all the views in the scene camera system. Experiments conducted on the ScanNet and SceneNet datasets reveal performance superior to strong baselines.
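The warping step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy` (the abstract mentions only extrinsics and depth), a 4x4 rigid transform `T_ref_to_new` from the reference camera frame to the new one, and simple nearest-pixel splatting with a z-buffer. All function and parameter names are hypothetical. Pixels that receive no source point stay `None`; these holes stand in for the defects that an inpainting network would have to fill.

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Mat4 = List[List[float]]  # row-major 4x4 rigid transform (assumed convention)


def unproject(u: int, v: int, depth: float,
              fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Lift pixel (u, v) with its depth to a 3D point in the camera frame."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def transform(T: Mat4, p: Vec3) -> Vec3:
    """Apply the rigid transform T to the point p."""
    x, y, z = p
    return tuple(T[r][0] * x + T[r][1] * y + T[r][2] * z + T[r][3]
                 for r in range(3))


def project(p: Vec3, fx: float, fy: float,
            cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Project a 3D point to pixel coordinates; None if behind the camera."""
    x, y, z = p
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def warp_view(rgb, depth, T_ref_to_new: Mat4,
              fx: float, fy: float, cx: float, cy: float):
    """Forward-warp a reference RGBD view into the new camera.

    rgb and depth are row-major nested lists of equal size. Returns
    (warped_rgb, warped_depth); unfilled pixels are None.
    """
    h, w = len(depth), len(depth[0])
    out_rgb = [[None] * w for _ in range(h)]
    out_depth = [[None] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            d = depth[v][u]
            if d is None or d <= 0:
                continue  # no valid depth measurement at this pixel
            p_new = transform(T_ref_to_new,
                              unproject(u, v, d, fx, fy, cx, cy))
            uv = project(p_new, fx, fy, cx, cy)
            if uv is None:
                continue
            un, vn = round(uv[0]), round(uv[1])
            if 0 <= un < w and 0 <= vn < h:
                z = p_new[2]
                # z-buffer: keep the surface closest to the new camera
                if out_depth[vn][un] is None or z < out_depth[vn][un]:
                    out_depth[vn][un] = z
                    out_rgb[vn][un] = rgb[v][u]
    return out_rgb, out_depth
```

With an identity transform the output reproduces the input view. With a sideways translation the content shifts and a band of `None` pixels appears at the image border, which is the kind of gap the abstract assigns to the inpainting stage.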

AUTHORS

Written by

David Novotny

Benjamin Graham

Jeremy Reizenstein

Publisher

NeurIPS

Related Publications

August 01, 2024

SPEECH & AUDIO

NLP

Toward Joint Language Modeling for Speech Units and Text

Ju-Chieh Chou, Wei-Ning Hsu, Karen Livescu, Arun Babu, Alexis Conneau, Alexei Baevski, Michael Auli

July 23, 2024

HUMAN & MACHINE INTELLIGENCE

CONVERSATIONAL AI

The Llama 3 Herd of Models

Llama team

June 25, 2024

SPEECH & AUDIO

NLP

Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation

Min-Jae Hwang, Ilia Kulikov, Benjamin Peloquin, Hongyu Gong, Peng-Jen Chen, Ann Lee

June 05, 2024

SPEECH & AUDIO

Proactive Detection of Voice Cloning with Localized Watermarking

Robin San Roman, Pierre Fernandez, Hady Elsahar, Alexandre Défossez, Teddy Furon, Tuan Tran
