SPEECH & AUDIO

Correction Focused Language Model Training for Speech Recognition

March 01, 2024

Abstract

Language models (LMs) have been commonly adopted to boost the performance of automatic speech recognition (ASR) particularly in domain adaptation tasks. Conventional way of LM training treats all the words in corpora equally, resulting in suboptimal improvements in ASR performance. In this work, we introduce a novel correction focused LM training approach which aims to prioritize ASR fallible words. The word-level ASR fallibility score, representing the likeli- hood of ASR mis-recognition, is defined and shaped as a prior word distribution to guide the LM training. To enable correction focused training with text-only corpora, large language models (LLMs) are employed as fallibility score predictors and text generators through multi-task fine-tuning. Experimental results for domain adaptation tasks demonstrate the effectiveness of our proposed method. Com- pared with conventional LMs, correction focused training achieves up to relatively 5.5% word error rate (WER) reduction in sufficient text scenarios. In insufficient text scenarios, LM training with LLM- generated text achieves up to relatively 13% WER reduction, while correction focused training further obtains up to relatively 6% WER reduction.

Download the Paper

AUTHORS

Written by

Yingyi Ma

Zhe Liu

Ozlem Kalinli

Publisher

ICASSP

Research Topics

Speech & Audio

Related Publications

May 24, 2024

SPEECH & AUDIO

NLP

DOC-RAG: ASR Language Model Personalization with Domain-Distributed Co-occurrence Retrieval Augmentation

Zhe Liu

May 24, 2024

April 14, 2024

SPEECH & AUDIO

NLP

Multi-task Learning for Front-end Text Processing in TTS

Yun Wang (Speech), Arthur Hinsvark, Qing He, Shun Zhang, Wonjune Kang

April 14, 2024

April 14, 2024

SPEECH & AUDIO

NLP

CoLLD: Contrastive Layer-to-Layer Distillation for Compressing Multilingual Pre-Trained Speech Encoders

Heng-Jui Chang, Ning Dong (AI), Ruslan Mavlyutov, Sravya Popuri, Andy Chung

April 14, 2024

March 05, 2024

SPEECH & AUDIO

Generative Pre-training for Speech with Flow Matching

Alex Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu

March 05, 2024

Help Us Pioneer The Future of AI

We share our open source frameworks, tools, libraries, and models for everything from research exploration to large-scale production deployment.