
Sam Nastase

@samnastase.bsky.social

postdoc/lecturer at Princeton丨he/him丨semiprofessional dungeon master丨https://snastase.github.io/

2,358 Followers  |  717 Following  |  60 Posts  |  Joined: 08.09.2023

Latest posts by samnastase.bsky.social on Bluesky

Title page of SCiL extended abstract titled: semantic-features: A User-Friendly Tool for Studying Contextual Word Embeddings in Interpretable Semantic Spaces

I will unfortunately have to skip SCiL this year, but I am thrilled to share that Jwalanthi will be presenting this work by her, @rjha.bsky.social, me, and @kmahowald.bsky.social on a tool that lets you project contextualized embeddings from LMs into interpretable semantic spaces!

17.07.2025 17:47 — 👍 15    🔁 4    💬 1    📌 1
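For intuition, here is a minimal sketch of the general idea (simulated data, not the semantic-features tool's actual API): learn a cross-validated ridge map from contextual embeddings to human-rated semantic feature norms, then project held-out embeddings into that interpretable space.

```python
# Minimal sketch of projecting contextual embeddings into an
# interpretable semantic space -- NOT the semantic-features API.
# We assume a matrix of contextual embeddings and feature-norm
# ratings (e.g., Binder et al.-style dimensions) for the same
# words; both are simulated here.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_words, emb_dim, n_features = 500, 768, 65  # 65 Binder-style dimensions

embeddings = rng.standard_normal((n_words, emb_dim))  # simulated LM embeddings
true_map = rng.standard_normal((emb_dim, n_features))
feature_norms = embeddings @ true_map + rng.standard_normal((n_words, n_features))

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, feature_norms, test_size=0.2, random_state=0)

# One ridge map from embedding space to the interpretable feature space;
# RidgeCV picks the regularization strength by cross-validation.
model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_train, y_train)

# Project held-out embeddings and check fit per interpretable dimension.
projected = model.predict(X_test)
r = [np.corrcoef(projected[:, j], y_test[:, j])[0, 1] for j in range(n_features)]
print(f"mean held-out correlation across dimensions: {np.mean(r):.2f}")
```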
Preview
Music-evoked reactivation during continuous perception is associated with enhanced subsequent recall of naturalistic events

Music is a potent cue for recalling personal experiences, yet the neural basis of music-evoked memory remains elusive. We address this question by using the full-length film Eternal Sunshine of the Spotless Mind to examine how repeated musical themes reactivate previously encoded events in cortex and shape next-day recall. Participants in an fMRI study viewed either the original film (with repeated musical themes) or a no-music version. By comparing neural activity patterns between these groups, we found that music-evoked reactivation of neural patterns linked to earlier scenes in the default mode network was associated with improved subsequent recall. This relationship was specific to the music condition and persisted when we controlled for a proxy measure of initial encoding strength (spatial intersubject correlation), suggesting that music-evoked reactivation may play a role in making event memories stick that is distinct from what happens at initial encoding.

Music is an incredibly powerful retrieval cue. What is the neural basis of music-evoked memory reactivation? And how does this reactivation relate to later memory for the retrieved events? In our new study, we used Eternal Sunshine of the Spotless Mind to find out. www.biorxiv.org/content/10.1...

08.07.2025 14:05 — 👍 52    🔁 21    💬 1    📌 5
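For readers curious about the mechanics, here is a back-of-the-envelope sketch of the reactivation logic on simulated data; it is not the paper's pipeline, just the core idea of correlating the spatial pattern at a musical theme's repeat with the pattern from the scene it originally accompanied, then relating that reactivation strength to recall.

```python
# Toy sketch of music-evoked reactivation (simulated data, not the
# paper's analysis pipeline): reactivation is the spatial correlation
# between the pattern at a theme's repeat and the pattern from the
# earlier scene it accompanied; we then ask whether stronger
# reactivation tracks better next-day recall.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_voxels, n_events = 1000, 40

# Pattern for each original scene, and the pattern at its theme repeat
scene_patterns = rng.standard_normal((n_events, n_voxels))
repeat_patterns = scene_patterns * 0.3 + rng.standard_normal((n_events, n_voxels))

# Reactivation = spatial correlation between repeat and original scene
reactivation = np.array([
    pearsonr(repeat_patterns[i], scene_patterns[i])[0] for i in range(n_events)])

# Simulated recall scores that partly track reactivation
recall = reactivation + 0.5 * rng.standard_normal(n_events)

r, p = pearsonr(reactivation, recall)
print(f"reactivation-recall correlation: r={r:.2f}, p={p:.3f}")
```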
Post image

Finally, we developed a set of interactive tutorials for preprocessing and running encoding models to get you started. Happy to hear any feedback or field any questions about the dataset! hassonlab.github.io/podcast-ecog...

07.07.2025 21:00 — 👍 7    🔁 2    💬 0    📌 0
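The interactive tutorials at the link above are the real starting point; as a taste, here is a minimal, self-contained encoding-model sketch on simulated data: a cross-validated ridge map from word embeddings to a single electrode's response, scored by held-out correlation.

```python
# Minimal encoding-model sketch in the spirit of the tutorials (the
# actual notebooks live at the link above; shapes and data here are
# simulated). We fit a ridge map from per-word embeddings to an
# electrode's high-gamma response and score held-out prediction.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n_words, emb_dim = 2000, 300

embeddings = rng.standard_normal((n_words, emb_dim))  # per-word features
weights = rng.standard_normal(emb_dim)
high_gamma = embeddings @ weights + 5 * rng.standard_normal(n_words)

# Five-fold cross-validation: fit on training words, correlate
# predictions with the held-out response.
scores = []
for train, test in KFold(n_splits=5).split(embeddings):
    model = Ridge(alpha=100.0).fit(embeddings[train], high_gamma[train])
    pred = model.predict(embeddings[test])
    scores.append(np.corrcoef(pred, high_gamma[test])[0, 1])

print(f"mean encoding performance (r): {np.mean(scores):.2f}")
```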
Post image

We validated both the data and stimulus features using encoding models, replicating previous findings showing an advantage for LLM embeddings.

07.07.2025 21:00 — 👍 1    🔁 0    💬 1    📌 0
Post image

We also provide word-level transcripts and stimulus features ranging from low-level acoustic features to large language model embeddings.

07.07.2025 21:00 — 👍 0    🔁 0    💬 1    📌 0
Post image

We recorded ECoG data in nine subjects while they listened to a 30-minute story. We provide a minimally preprocessed derivative of the raw data, ready to be used.

07.07.2025 21:00 — 👍 1    🔁 0    💬 1    📌 0
Post image

Check out Zaid's open "Podcast" ECoG dataset for natural language comprehension (w/ Hasson Lab). The paper is now out at Scientific Data (nature.com/articles/s41...) and the data are available on OpenNeuro (openneuro.org/datasets/ds0...).

07.07.2025 21:00 — 👍 35    🔁 17    💬 1    📌 0
Preview
Brains and language models converge on a shared conceptual space across different languages Human languages differ widely in their forms, each having distinct sounds, scripts, and syntax. Yet, they can all convey similar meaning. Do different languages converge on a shared neural substrate f...

We're really excited to share this work and happy to hear any comments or feedback!
Preprint: arxiv.org/abs/2506.20489
Code: github.com/zaidzada/cro...

30.06.2025 20:56 — 👍 1    🔁 0    💬 0    📌 0

These findings suggest that, despite the diversity of languages, shared meaning emerges from our interactions with one another and our shared world.

30.06.2025 20:56 — 👍 3    🔁 2    💬 1    📌 0

Our results suggest that the neural representations of meaning underlying different languages are shared across their speakers, and that LMs trained on different languages converge on this same shared meaning.

30.06.2025 20:56 — 👍 1    🔁 0    💬 1    📌 0
Post image

We then tested the extent to which each of these 58 languages can predict the brain activity of our participants. We found that the more similar a language is to the listener's native language, the better the prediction:

30.06.2025 20:56 — 👍 1    🔁 0    💬 1    📌 0
Post image

What about multilingual models? We translated the story from English to 57 other languages spanning 14 families, and extracted embeddings for each from multilingual BERT. We visualized the dissimilarity matrix using MDS and found clusters corresponding to language families.

30.06.2025 20:56 — 👍 1    🔁 0    💬 1    📌 0
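As a rough illustration of this step (with a single stand-in sentence rather than the full story, and not the paper's code), one can embed the same sentence in several languages with multilingual BERT, compute pairwise dissimilarities, and project them to 2D with MDS:

```python
# Sketch of the embedding-comparison step: embed one translated
# sentence per language with multilingual BERT, mean-pool the final
# hidden states, then run MDS on the pairwise correlation distances.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

# A single translated sentence as a stand-in for the full story
sentences = {
    "en": "The old man closed the door and walked into the rain.",
    "fr": "Le vieil homme ferma la porte et sortit sous la pluie.",
    "zh": "老人关上门，走进雨中。",
    "de": "Der alte Mann schloss die Tür und ging in den Regen.",
}

embs = []
with torch.no_grad():
    for text in sentences.values():
        inputs = tokenizer(text, return_tensors="pt")
        # Mean-pool the final hidden states into one vector per sentence
        hidden = model(**inputs).last_hidden_state
        embs.append(hidden.mean(dim=1).squeeze().numpy())

# Pairwise correlation distance, then 2D MDS on the dissimilarities
dissim = squareform(pdist(np.vstack(embs), metric="correlation"))
coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dissim)
for lang, (x, y) in zip(sentences, coords):
    print(f"{lang}: ({x:.2f}, {y:.2f})")
```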
Post image

We found that models trained to predict neural activity for one language generalize to different subjects listening to the same content in a different language, across high-level language and default-mode regions.

30.06.2025 20:56 — 👍 1    🔁 0    💬 1    📌 0

We then used the encoding models trained on one language to predict the neural activity in listeners of other languages.

30.06.2025 20:56 — 👍 0    🔁 0    💬 1    📌 0

We then asked whether a similar shared space exists in the brains of native speakers of the three different languages. We used voxelwise encoding models that align the LM embeddings with brain activity from one group of subjects listening to the story in their native language.

30.06.2025 20:56 — 👍 0    🔁 0    💬 1    📌 0
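A simplified sketch of this train-on-one-language, test-on-another logic, with simulated data standing in for the shared-space embeddings and BOLD responses (not the paper's pipeline):

```python
# Simplified sketch of the cross-language generalization test: fit a
# voxelwise encoding model from story embeddings to one group's brain
# activity, then evaluate it on a second group who heard the same
# story in another language, using that language's embeddings
# (assumed already aligned into a common space, e.g. via a rotation).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_trs, emb_dim, n_voxels = 600, 100, 50

# Shared-space embeddings for two translations of the same story
emb_en = rng.standard_normal((n_trs, emb_dim))
emb_zh = emb_en + 0.2 * rng.standard_normal((n_trs, emb_dim))

# Simulated voxel responses driven by the shared meaning
voxel_map = rng.standard_normal((emb_dim, n_voxels))
bold_en = emb_en @ voxel_map + 2 * rng.standard_normal((n_trs, n_voxels))
bold_zh = emb_zh @ voxel_map + 2 * rng.standard_normal((n_trs, n_voxels))

# Train on English listeners, test on Chinese listeners
half = n_trs // 2
model = Ridge(alpha=10.0).fit(emb_en[:half], bold_en[:half])
pred = model.predict(emb_zh[half:])

r = [np.corrcoef(pred[:, v], bold_zh[half:, v])[0, 1] for v in range(n_voxels)]
print(f"mean cross-language encoding performance (r): {np.mean(r):.2f}")
```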
Post image

We extracted embeddings from three monolingual BERT models (each trained on an entirely different language) and found that (with a rotation) they converge onto similar embeddings, especially in the middle layers:

30.06.2025 20:56 — 👍 0    🔁 0    💬 1    📌 0
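The rotation here can be illustrated with a toy orthogonal Procrustes alignment (simulated embeddings, not the paper's code): two "language models" embed the same words in spaces that differ by an unknown rotation, and a Procrustes fit on a subset of shared words recovers it.

```python
# Toy illustration of the rotation step: recover an unknown orthogonal
# rotation between two embedding spaces from a training set of shared
# words, then check alignment on held-out words.
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
n_words, dim = 300, 50

emb_a = rng.standard_normal((n_words, dim))           # model A's space
true_rot, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
emb_b = emb_a @ true_rot + 0.1 * rng.standard_normal((n_words, dim))

# Learn the rotation on the first 200 shared words
rotation, _ = orthogonal_procrustes(emb_a[:200], emb_b[:200])

# Apply it to held-out words and check alignment
aligned = emb_a[200:] @ rotation
r = np.corrcoef(aligned.ravel(), emb_b[200:].ravel())[0, 1]
print(f"held-out alignment after rotation: r={r:.2f}")
```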
Post image

We used naturalistic fMRI and language models (LMs) to identify neural representations of the shared conceptual meaning of the same story as heard by native speakers of three languages: English, Chinese, and French.

30.06.2025 20:56 — 👍 0    🔁 0    💬 1    📌 0
Post image

We hypothesized that the brains of native speakers of different languages would converge on the same supra-linguistic conceptual structures when listening to the same story in their respective languages:

30.06.2025 20:56 — 👍 0    🔁 0    💬 1    📌 0
Post image

Previous research has found that language models trained on different languages learn embedding spaces with similar geometry. This suggests that the internal geometries of different languages may converge on similar conceptual structures:

30.06.2025 20:56 — 👍 0    🔁 0    💬 1    📌 0

There are over 7,000 human languages in the world and they're remarkably diverse in their forms and patterns. Nonetheless, we often use different languages to convey similar ideas, and we can learn to translate from one language to another.

30.06.2025 20:56 — 👍 0    🔁 0    💬 1    📌 0
Post image

How do different languages converge on a shared neural substrate for conceptual meaning? Happy to share a new preprint led by Zaid Zada that specifically addresses this question:

30.06.2025 20:56 — 👍 34    🔁 5    💬 1    📌 0

We're very excited to share this work and happy to hear your feedback! If you're attending #OHBM2025, @ahmadsamara.bsky.social will be presenting a poster on this project (poster #804, June 25 and 26)—be sure to stop by and chat with him about it!
www.biorxiv.org/content/10.1...

24.06.2025 23:25 — 👍 1    🔁 1    💬 0    📌 0

Our findings suggest that different language areas are coupled via a mixture of linguistic features—this yields what we refer to as a "soft hierarchy" from lower-order to higher-order language areas, and may facilitate efficient, context-sensitive language processing.

24.06.2025 23:25 — 👍 0    🔁 0    💬 1    📌 0
Post image

Taking a slightly different approach, we assess how well specific model features capture larger-scale patterns of connectivity. We find that feature-specific model connectivity partly recapitulates stimulus-driven cortical network configuration.

24.06.2025 23:25 — 👍 0    🔁 0    💬 1    📌 0
Post image

We observe a clear progression of feature-specific connectivity from early auditory to lateral temporal areas, advancing from acoustic-driven connectivity to speech- and finally language-driven connectivity.

24.06.2025 23:25 — 👍 0    🔁 0    💬 1    📌 0
Post image

We show that early auditory areas are coupled to intermediate language areas via lower-level acoustic and speech features. In contrast, higher-order language and default-mode regions are predominantly coupled through more abstract language features.

24.06.2025 23:25 — 👍 0    🔁 0    💬 1    📌 0
Post image

We developed a model-based framework for quantifying stimulus-driven, feature-specific connectivity between regions. We used parcel-wise encoding models to align feature-specific embeddings to brain activity and then evaluated how well these models generalize to other parcels.

24.06.2025 23:25 — 👍 0    🔁 0    💬 1    📌 0
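On simulated data, the core logic looks roughly like this (a sketch, not the paper's code): fit an encoding model from a feature space to a source parcel, then score its predictions against a different target parcel; generalization implies the two parcels share stimulus information carried by that feature space.

```python
# Sketch of model-based, feature-specific connectivity: a model fit
# from one feature space to a source parcel is evaluated against a
# *different* target parcel. Both parcels are simulated to share a
# feature-driven signal plus independent noise.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_trs, n_feat = 600, 80

features = rng.standard_normal((n_trs, n_feat))     # e.g., speech features
shared_w = rng.standard_normal(n_feat)

# Two parcels partly driven by the same feature-weighted signal
parcel_a = features @ shared_w + 2 * rng.standard_normal(n_trs)
parcel_b = features @ shared_w + 2 * rng.standard_normal(n_trs)

# Fit on the source parcel's first half of the run
half = n_trs // 2
model = Ridge(alpha=10.0).fit(features[:half], parcel_a[:half])

# Evaluate the source-parcel model against the *target* parcel
pred = model.predict(features[half:])
r = np.corrcoef(pred, parcel_b[half:])[0, 1]
print(f"feature-specific model connectivity (r): {r:.2f}")
```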

Put differently, ISFC can tell us *where* and *how much* connectivity is driven by the stimulus, but not *what* stimulus features are driving the connectivity. How can we begin to unravel what linguistic features are shared across different language regions?

24.06.2025 23:25 — 👍 0    🔁 0    💬 1    📌 0
Post image

Following the logic of intersubject correlation (ISC) analysis, intersubject functional connectivity (ISFC) isolates stimulus-driven connectivity between regions (e.g., in response to naturalistic stimuli)—but is agnostic to the content of the stimulus shared between regions.

24.06.2025 23:25 — 👍 0    🔁 0    💬 1    📌 0
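A toy version of the ISFC logic on simulated data (not the original implementation): correlating region time series across subjects rather than within a subject means intrinsic fluctuations, which are uncorrelated between people, drop out, leaving only stimulus-locked covariance.

```python
# Toy leave-one-out ISFC: correlate one subject's region time series
# with the average of everyone else's. Each subject's data mix a
# shared stimulus-driven signal with subject-specific intrinsic noise.
import numpy as np

rng = np.random.default_rng(0)
n_trs, n_regions, n_subjects = 300, 10, 8

stimulus_drive = rng.standard_normal((n_trs, n_regions))
data = np.stack([
    stimulus_drive + rng.standard_normal((n_trs, n_regions))  # + intrinsic noise
    for _ in range(n_subjects)])

# Leave-one-out ISFC: one subject's regions vs. the others' average
isfc = np.zeros((n_regions, n_regions))
for s in range(n_subjects):
    others = data[np.arange(n_subjects) != s].mean(axis=0)
    for i in range(n_regions):
        for j in range(n_regions):
            isfc[i, j] += np.corrcoef(data[s, :, i], others[:, j])[0, 1]
isfc /= n_subjects

print(f"mean off-diagonal ISFC: {isfc[~np.eye(n_regions, dtype=bool)].mean():.2f}")
```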

Traditional within-subject functional connectivity metrics cannot distinguish between intrinsic and extrinsic (i.e., stimulus-driven) co-fluctuations between brain regions.

24.06.2025 23:25 — 👍 0    🔁 0    💬 1    📌 0
