Research

Speech & Audio

Attention-Based WaveNet Autoencoder for Universal Voice Conversion

April 30, 2019

Abstract

We present a method for converting any voice to a target voice. The method is based on a WaveNet autoencoder, with the addition of a novel attention component that supports the modification of timing between the input and the output samples. Training the attention is done in an unsupervised way, by teaching the neural network to recover the original timing from an artificially modified one. Adding a generic voice robot, which we convert to the target voice, we present a robust Text To Speech pipeline that is able to train without any transcript. Our experiments show that the proposed method is able to recover the timing of the speaker and that the proposed pipeline provides a competitive Text To Speech method.

Download the Paper

Related Publications

February 06, 2025

Speech & Audio

Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound

Andros Tjandra, Yi-Chiao Wu, Baishan Guo, John Hoffman, Brian Ellis, Apoorv Vyas, Bowen Shi, Sanyuan Chen, Matt Le, Nick Zacharov, Carleigh Wood, Ann Lee, Wei-Ning Hsu

February 06, 2025

November 19, 2020

Speech & Audio

Generating Fact Checking Briefs

Angela Fan, Aleksandra Piktus, Antoine Bordes, Fabio Petroni, Guillaume Wenzek, Marzieh Saeidi, Sebastian Riedel, Andreas Vlachos

November 19, 2020

November 09, 2020

Speech & Audio

Multilingual AMR-to-Text Generation

Angela Fan

November 09, 2020

October 26, 2020

Speech & Audio

Deep Multilingual Transformer with Latent Depth

Xian Li, Asa Cooper Stickland, Xiang Kong, Yuqing Tang

October 26, 2020

December 11, 2019

Speech & Audio

Computer Vision

Hyper-Graph-Network Decoders for Block Codes | Facebook AI Research

Eliya Nachmani, Lior Wolf

December 11, 2019

April 30, 2018

Speech & Audio

VoiceLoop: Voice Fitting and Synthesis via a Phonolgoical Loop | Facebook AI Research

Yaniv Taigman, Lior Wolf, Adam Polyak, Eliya Nachmani

April 30, 2018

July 11, 2018

Speech & Audio

Fitting New Speakers Based on a Short Untranscribed Sample | Facebook AI Research

Eliya Nachmani, Adam Polyak, Yaniv Taigman, Lior Wolf

July 11, 2018

May 05, 2019

Speech & Audio

A Universal Music Translation Network | Facebook AI Research

Noam Mor, Lior Wolf, Adam Polyak, Yaniv Taigman

May 05, 2019

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.