SPEECH & AUDIO

NLP

Multi-task Learning for Front-end Text Processing in TTS

April 14, 2024

Abstract

We propose a multi-task learning (MTL) model for jointly performing three tasks that are commonly solved in a text-to-speech (TTS) front-end: text normalization (TN), part-of-speech (POS) tagging, and homograph disambiguation (HD). Our framework utilizes a tree-like structure with a trunk that learns shared representations, followed by separate task-specific heads. We further incorporate a pre-trained language model to utilize its built-in lexical and contextual knowledge, and study how to best use its embeddings so as to most effectively benefit our multi-task model. Through task-wise ablations, we show that our full model trained on all three tasks achieves the strongest overall performance compared to models trained on individual or sub-combinations of tasks, confirming the advantages of our MTL framework. Finally, we introduce a new HD dataset containing a balanced number of sentences in diverse contexts for a variety of homographs and their pronunciations. We demonstrate that incorporating this dataset into training significantly improves HD performance over only using a commonly used, but imbalanced, pre-existing dataset.

Download the Paper

AUTHORS

Written by

Yun Wang (Speech)

Arthur Hinsvark

Qing He

Shun Zhang

Wonjune Kang

Publisher

ICASSP

Related Publications

December 17, 2024

NLP

FLAME : Factuality-Aware Alignment for Large Language Models

Jack Lin, Luyu Gao, Barlas Oguz, Wenhan Xiong, Jimmy Lin, Scott Yih, Xilun Chen

December 17, 2024

December 12, 2024

NLP

CORE MACHINE LEARNING

Memory Layers at Scale

Vincent-Pierre Berges, Barlas Oguz

December 12, 2024

December 12, 2024

NLP

Byte Latent Transformer: Patches Scale Better Than Tokens

Artidoro Pagnoni, Ram Pasunuru, Pedro Rodriguez, John Nguyen, Benjamin Muller, Margaret Li, Chunting Zhou, Lili Yu, Jason Weston, Luke Zettlemoyer, Gargi Ghosh, Mike Lewis, Ari Holtzman, Srini Iyer

December 12, 2024

December 12, 2024

HUMAN & MACHINE INTELLIGENCE

NLP

Explore Theory-of-Mind: Program-Guided Adversarial Data Generation for Theory of Mind Reasoning

Melanie Sclar, Jane Yu, Maryam Fazel-Zarandi, Yulia Tsvetkov, Yonatan Bisk, Yejin Choi, Asli Celikyilmaz

December 12, 2024

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.