Q&A with Marco Baroni, winner of the ACL 2020 Test of Time award with Alessandro Lenci

July 22, 2020

We are pleased to congratulate Facebook AI’s Marco Baroni and the University of Pisa’s Alessandro Lenci on receiving the ACL 2020 Test of Time award for their paper "Distributional Memory: A General Framework for Corpus-Based Semantics," which appeared in the journal Computational Linguistics in 2010.

Baroni joined Facebook AI in 2016, inspired by the opportunity to work and exchange ideas with research scientists like Tomas Mikolov, Armand Joulin, Jason Weston, and Antoine Bordes, all of whom bring a fresh perspective to the computational modeling of language.

Baroni’s work in the areas of multimodal and compositional distributed semantics has received widespread recognition, including an ERC Starting Grant and the IJCAI-JAIR Best Paper Prize. At Facebook AI, he focuses on training machines to interact with humans and with other machines through natural language. Currently, he’s working on better understanding the nature of the languages that evolve when artificial neural networks communicate with each other.

Baroni took a moment to share his thoughts about the impact of the 2010 paper.

Q: What was the research about?

A: In contrast to previous models that were tuned to solve one task at a time, our research proposed an alternative approach in which computational word representations are extracted once and for all from a corpus of text. These representations can then be used to tackle a range of tasks, such as automatically discovering synonyms or predicting the typical properties of a concept.
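The general corpus-based idea can be sketched in a few lines. This is a deliberately tiny illustration, not the paper's actual Distributional Memory framework (which is built on word-link-word tuples rather than a plain co-occurrence window): count how often each word appears near other words in a corpus, then compare the resulting count vectors with cosine similarity, so that words used in similar contexts come out as similar.

```python
from collections import Counter
from math import sqrt

# Toy corpus; real distributional models are trained on billions of words.
corpus = ("the cat chased the mouse . the dog chased the cat . "
          "the cat ate the fish . the dog ate the bone .").split()

WINDOW = 2  # context words considered on each side of a target word

def vector(target):
    """Count context words within WINDOW positions of each occurrence of target."""
    counts = Counter()
    for i, w in enumerate(corpus):
        if w == target:
            for j in range(max(0, i - WINDOW), min(len(corpus), i + WINDOW + 1)):
                if j != i:
                    counts[corpus[j]] += 1
    return counts

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = lambda c: sqrt(sum(n * n for n in c.values()))
    return dot / (norm(u) * norm(v)) if u and v else 0.0

# "cat" and "dog" occur in similar contexts (both chase and eat things),
# so their vectors are more similar than those of an unrelated pair.
sim_cat_dog = cosine(vector("cat"), vector("dog"))
sim_cat_bone = cosine(vector("cat"), vector("bone"))
```

The same extracted vectors serve multiple tasks: ranking neighbors by cosine similarity is a simple synonym-discovery heuristic, and the dimensions with the highest counts hint at a concept's typical contexts.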


Q: What happened when the paper was originally published, and how was it received in the community then?

A: The paper was well received, especially by a small community of researchers interested in bridging computational linguistics with theoretical linguistics and cognitive science. Its (relative) success was in part due to us making the representations we created available with the paper’s release. While this seems like an obvious step now, at the time it wasn’t that common.

Q: How has the work been built upon? Is there any impact of this work in products we see today?

A: This paper resonates with at least two topics that are very prominent today. One is the idea of developing general-purpose word and sentence representations through what is called a pretraining stage. This is at the core of many natural language processing applications, such as machine translation and automated question answering. Interestingly, researchers in machine learning (ML) arrived at general-purpose pretraining independently and more or less in parallel with us.

The other is the idea of creating a battery of standard tests to probe the linguistic abilities of a computational model. Nowadays, there is a very active community working on just this idea, often associated with the BlackboxNLP workshop series. There is now widespread consensus that simply measuring the performance of an NLP system on practical tasks is not sufficient to understand the model’s genuine linguistic abilities.

Q: Were there any surprises along the way?

A: At the end of 2010, at the EMNLP conference (Conference on Empirical Methods in Natural Language Processing), I heard people talking about “deep learning” and “representation learning” for the first time. It took me a while to understand what these terms meant, but eventually I discovered that many ML researchers were working on ideas similar to those being developed in the niche computational linguistics community I belonged to.

While we came from different disciplinary traditions and had different goals, we were reaching the same conclusions: for example, that you could capture many forms of knowledge with continuous, rather than discrete, representations, and that these representations should be automatically extracted from data rather than coded by hand.

I am a theoretical linguist by training, and it was very surprising to find out that what I was doing was very relevant to ML experts, and that, conversely, I could understand what they were doing and why.

Q: What is your current focus?

A: Humans can accomplish amazing things, thanks to our ability to communicate with each other through language. Can we similarly empower current AI systems with an ability to “talk” to each other? If we let them evolve a shared language to solve problems together, what kind of characteristics will the emergent language have? Is it as flexible as human language? If not, how can we make it so? And what can this teach us about both humans and machines?