NLP

Information-Theoretic Probing for Linguistic Structure

June 26, 2020

Abstract

The success of neural networks on a diverse set of NLP tasks has led researchers to question how much these networks actually "know" about natural language. Probes are a natural way of assessing this. When probing, a researcher chooses a linguistic task and trains a supervised model to predict annotations in that linguistic task from the network's learned representations. If the probe does well, the researcher may conclude that the representations encode knowledge related to the task. A commonly held belief is that using simpler models as probes is better; the logic is that simpler models will identify linguistic structure, but not learn the task itself. We propose an information-theoretic operationalization of probing as estimating mutual information that contradicts this received wisdom: one should always select the highest performing probe one can, even if it is more complex, since it will result in a tighter estimate, and thus reveal more of the linguistic information inherent in the representation. The experimental portion of our paper focuses on empirically estimating the mutual information between a linguistic property and BERT, comparing these estimates to several baselines. We evaluate on a set of ten typologically diverse languages often underrepresented in NLP research—plus English—totaling eleven languages.
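The argument for preferring the strongest probe can be made concrete. Writing T for the linguistic property and R for the representation, the mutual information is I(T; R) = H(T) - H(T | R). A probe's cross-entropy on held-out data is, in expectation, an upper bound on H(T | R), so H(T) minus that cross-entropy is a lower bound on I(T; R), and a lower cross-entropy gives a tighter bound. The sketch below illustrates this with toy numbers. The function names, the labels, and the probe probabilities are all hypothetical, and this is not the paper's estimation code.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Plug-in estimate of H(T), in nats, from empirical label frequencies."""
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def probe_cross_entropy(gold_probs):
    """Mean negative log-probability a probe assigns to the gold labels.

    In expectation this upper-bounds H(T | R), whatever the probe is.
    """
    return -sum(math.log(p) for p in gold_probs) / len(gold_probs)


def mi_lower_bound(labels, gold_probs):
    """Estimate of a lower bound on I(T; R): H(T) minus probe cross-entropy."""
    return label_entropy(labels) - probe_cross_entropy(gold_probs)


# Toy data (made up for illustration): gold tags for six tokens and the
# probability each hypothetical probe assigns to the correct tag.
labels = ["NOUN", "VERB", "NOUN", "ADJ", "VERB", "NOUN"]
weak_probe = [0.5, 0.4, 0.5, 0.3, 0.4, 0.5]
strong_probe = [0.9, 0.8, 0.9, 0.7, 0.8, 0.9]

weak_mi = mi_lower_bound(labels, weak_probe)
strong_mi = mi_lower_bound(labels, strong_probe)

print(f"H(T)               = {label_entropy(labels):.3f} nats")
print(f"weak probe bound   = {weak_mi:.3f} nats")
print(f"strong probe bound = {strong_mi:.3f} nats")
```

In this toy setting the stronger probe yields the larger estimate, and both stay below H(T), as expected of a lower bound on the mutual information. In practice the cross-entropy would be measured on held-out data so that overfitting does not inflate the estimate.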

AUTHORS

Written by

Adina Williams

Josef Valvoda

Ran Zmigrod

Rowan Hall Maudslay

Ryan Cotterell

Tiago Pimentel

Publisher

ACL
