April 28th, 2022
AI has made impressive strides in recent years, but it’s still far from learning language as efficiently as humans. For instance, children learn that “orange” can refer to both a fruit and a color from just a few examples, while modern AI systems need far more data to make the same connection. This has led many researchers to wonder: Can studying the human brain help us build AI systems that learn and reason the way people do?
Today, Meta AI is announcing a long-term research initiative to better understand how the human brain processes language. In collaboration with the neuroimaging center NeuroSpin (CEA) and INRIA, we’re comparing how AI language models and the brain respond to the same spoken or written sentences. We’ll use insights from this work to guide the development of AI that processes speech and text as efficiently as people do. Over the past two years, we’ve applied deep learning techniques to public neuroimaging data sets to analyze how the brain processes words and sentences.
The data sets were collected and shared by several academic institutions, including Max Planck Institute for Psycholinguistics and Princeton University. Each institution collected and shared the data sets with informed consent from the volunteers in accordance with legal policies as approved by their respective ethical committees, including the consent obtained from the study participants.
Our comparison between brains and language models has already led to valuable insights:
Language models that most closely resemble brain activity are those that best predict the next word from context (e.g., “once upon a… time”). Prediction based on partially observable inputs is at the core of self-supervised learning (SSL) in AI and may be key to how people learn language.
However, we discovered that specific regions in the brain anticipate words and ideas far ahead in time, whereas most language models today are trained to predict only the very next word. Unlocking this long-range forecasting capability could help improve modern AI language models.
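The self-supervised next-word objective mentioned above can be illustrated with a toy sketch. The tiny corpus and the count-based bigram “model” below are stand-ins of our own invention, not the large neural language models the research actually uses; the point is only that the training signal comes from the raw text itself, with no labels.

```python
from collections import Counter, defaultdict

# Toy corpus (an illustrative stand-in for large-scale text data).
corpus = (
    "once upon a time there was a fox . "
    "once upon a time there was a crow ."
).split()

# Self-supervised signal: count how often each word follows another.
transitions = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    transitions[w1][w2] += 1

def predict_next(word):
    """Return the most frequent continuation of `word` in the corpus."""
    return transitions[word].most_common(1)[0][0]

print(predict_next("a"))  # "time" is the most common continuation here
```

A neural language model replaces the counts with learned representations, but the objective, predicting what comes next from what came before, is the same.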
Of course, we’re only scratching the surface — there’s still a lot we don’t understand about how the brain functions, and our research is ongoing. Now, our collaborators at NeuroSpin are creating an original neuroimaging data set to expand this research. We’ll be open-sourcing the data set, deep learning models, code, and research papers resulting from this effort to help spur discoveries in both the AI and neuroscience communities. All of this work is part of Meta AI’s broader investments toward human-level AI that learns from limited to no supervision.
Our work is a part of the broader effort by the scientific community to use AI to better understand the brain. Neuroscientists have historically faced major limitations in analyzing brain signals — let alone comparing them with AI models. Studying neuronal activity and brain imaging is a time- and resource-intensive process: it requires heavy machinery, and the resulting signals are often opaque and noisy. Designing language experiments to measure brain responses in a controlled way can be painstaking too. For example, in classical language studies, sentences must match in complexity, and words must match in frequency or number of letters, to allow a meaningful comparison of brain responses.
The rise of deep learning, where multiple layers of neural networks work together to learn, is rapidly alleviating these issues. This approach highlights where and when perceptual representations of words and sentences are generated in the brain when a volunteer reads or listens to a story.
Deep learning systems require a lot of data to ensure accuracy. Functional magnetic resonance imaging (fMRI) studies capture only a few snapshots of brain activity, typically from a small sample size. To meet the demanding quantity of data required for deep learning, our team not only models thousands of brain scans recorded from public data sets using fMRI but also simultaneously models them using magnetoencephalography (MEG), a scanner that takes snapshots of brain activity every millisecond — faster than a blink of an eye. In combination, these neuroimaging devices provide the large amounts of neuroimaging data necessary to detect where and in what order the activations take place in the brain. This is key to parsing the algorithm of human cognition.
In several studies, we’ve discovered the brain is systematically organized in a hierarchy that’s strikingly similar to AI language models (here, here, and here). For example, linguists have long predicted that language processing is characterized by a sequence of sensory and lexical computations, before words can be combined into meaningful sentences. Our comparison between deep language models and the brain precisely validates this sequence of computations. When reading a word, the brain first produces representations in the early visual cortices that are similar to deep convolutional networks trained to recognize characters. These brain activations are then transformed along the visual hierarchy into lexical representations akin to word embeddings. Finally, a distributed cortical network generates neural representations that correlate with the middle and final layers of deep language models. Deep learning tools have made it possible to clarify the hierarchy of the brain in ways that weren’t possible before.
A systematic comparison between dozens of deep language models shows that the better they predict words from context, the more their representations correlate with the brain. We found this after analyzing the brain activations of 200 volunteers in a simple reading task. A similar discovery was made independently by a team at MIT just a week apart from ours, further validating this exciting direction. These parallel studies provide reassurance that the AI community is on the right path in using SSL toward human-level AI.
But finding similarities isn’t enough to grasp the principles of language understanding. The computational differences between biological and artificial neural networks are key to improving existing models and building new, more intelligent language models. Recently, we revealed evidence of long-range predictions in the brain, an ability that still challenges today’s language models. For instance, consider the phrase “Once upon a …” Most language models today would readily predict the next word, “time,” but their ability to anticipate complex ideas, plots, and narratives the way people do is still limited.
To explore this issue, together with INRIA, we compared a variety of language models to the brain responses of 345 volunteers, who listened to complex narratives while being recorded with fMRI. We then enhanced those models with long-range predictions to track forecasts in the brain.
Our results show that specific brain regions, such as the prefrontal and parietal cortices, are best accounted for by language models enhanced with deep representations of far-off future words. These results shed light on the computational organization of the human brain and its inherently predictive nature, and they pave the way toward improving current AI models.
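The logic of “enhancing” a model with forecast features can be illustrated with a toy sketch. Everything below is synthetic and of our own construction, not the study’s actual pipeline: we simulate a “brain” signal that partly reflects a word several steps ahead, and show that an encoding model whose features include that far-off word fits better than one built from the current word alone.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, dist = 300, 12, 8  # words, embedding dimension, forecast distance

# Stand-in word embeddings, and the embedding of the word `dist` ahead.
current = rng.normal(size=(n, d))
future = np.roll(current, -dist, axis=0)

# Simulated "brain" signal driven by both the current and the upcoming word.
w_now, w_future = rng.normal(size=d), rng.normal(size=d)
brain = current @ w_now + future @ w_future + 0.1 * rng.normal(size=n)

def held_out_r2(X, y):
    """R^2 of a least-squares fit: train on the first 200 words, test on the rest."""
    X_tr, X_te, y_tr, y_te = X[:200], X[200:], y[:200], y[200:]
    w, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)
    resid = y_te - X_te @ w
    return 1 - resid.var() / y_te.var()

print(held_out_r2(current, brain))                       # current word only
print(held_out_r2(np.hstack([current, future]), brain))  # + forecast features
```

In this simulation the forecast-enhanced model explains substantially more variance, mirroring the finding that some brain regions are better accounted for once representations of far-off words are included.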
Overall, these studies support an exciting possibility — there are, in fact, quantifiable similarities between brains and AI models. And these similarities can help generate new insights about how the brain functions. This opens new avenues, where neuroscience will guide the development of more intelligent AI, and where, in turn, AI will help uncover the wonders of the brain.