The Brain Tells a Story: Unveiling Distinct Representations of Semantic Content in Speech, Objects, and Stories in the Human Brain with Large Language Models



Abstract

In recent studies, researchers have used Large Language Models (LLMs) to investigate semantic representation in the brain. However, many of these studies examined different types of semantic content separately, such as speech content, objects in scenes, and background stories. To quantitatively evaluate the contribution of these semantic contents in the brain, we recorded brain activity using functional magnetic resonance imaging (fMRI) while participants watched a total of 8.3 hours of dramas and movies. Importantly, we densely annotated these videos at multiple semantic levels related to the video content, which allowed us to extract latent LLM representations for a range of semantic contents. We show that LLMs explain human brain activity more accurately than traditional language models, particularly for the high-level background story. Additionally, we show that distinct brain regions correspond to different semantic contents, underscoring the importance of simultaneously modeling multiple levels of semantic content. We will make our fMRI dataset publicly available for future research as a biological metric of the alignment between LLMs and humans.
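The standard analysis in this literature (and, as hedged here, presumably the kind used in this work, though the abstract does not give details) is a linearized encoding model: each voxel's fMRI response is predicted as a linear function of LLM embeddings of the stimulus annotations, fit with ridge regression, and evaluated by the correlation between predicted and held-out responses. A minimal sketch on simulated data, with all sizes and names hypothetical:

```python
import numpy as np

# Hypothetical sketch of a linearized encoding model (not the authors'
# actual code): predict voxel responses from LLM features of the
# annotations via ridge regression.
rng = np.random.default_rng(0)
n_train, n_test, n_dims, n_voxels = 200, 50, 64, 10

# Stand-ins for LLM latent features of the annotations (e.g. speech
# content, objects, background story) at each fMRI time point.
X_train = rng.standard_normal((n_train, n_dims))
X_test = rng.standard_normal((n_test, n_dims))

# Simulated voxel responses: a linear readout of the features plus noise.
true_w = rng.standard_normal((n_dims, n_voxels))
Y_train = X_train @ true_w + 0.1 * rng.standard_normal((n_train, n_voxels))
Y_test = X_test @ true_w + 0.1 * rng.standard_normal((n_test, n_voxels))

# Ridge regression, closed form: w = (X'X + alpha*I)^-1 X'Y
alpha = 1.0
w = np.linalg.solve(X_train.T @ X_train + alpha * np.eye(n_dims),
                    X_train.T @ Y_train)

# Per-voxel prediction accuracy on held-out data: Pearson correlation
# between predicted and measured responses (the usual "brain score").
Y_pred = X_test @ w
scores = np.array([np.corrcoef(Y_pred[:, v], Y_test[:, v])[0, 1]
                   for v in range(n_voxels)])
print(f"mean held-out correlation: {scores.mean():.3f}")
```

In practice, separate feature spaces (speech, objects, story) can be fit jointly or in competition to attribute variance in each brain region to a specific semantic level, which is how "distinct brain regions correspond to different semantic contents" is typically quantified.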

Competing Interest Statement

The authors have declared no competing interest.


