The BLiMP-NL dataset consists of 84 Dutch minimal pair paradigms covering 22 syntactic phenomena, and comes with graded human acceptability ratings & self-paced reading times.
An example minimal pair:
A. Ik bekijk de foto van mezelf in de kamer (I watch the photograph of myself in the room; grammatical)
B. Wij bekijken de foto van mezelf in de kamer (We watch the photograph of myself in the room; ungrammatical)
Differences in human acceptability ratings between sentences correlate with differences in model syntactic log-odds ratio scores.
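Minimal-pair scoring like this is often implemented with a length- and frequency-normalized sentence score such as SLOR (syntactic log-odds ratio). A minimal sketch with made-up numbers standing in for model outputs (the function name and every value below are illustrative, not the BLiMP-NL pipeline):

```python
def slor(model_logprob, unigram_logprobs):
    """Syntactic log-odds ratio: the sentence's log-probability under the
    model minus its unigram log-probability, normalized by length."""
    return (model_logprob - sum(unigram_logprobs)) / len(unigram_logprobs)

# Toy numbers for a 9-word minimal pair (not real model outputs):
good = slor(model_logprob=-20.0, unigram_logprobs=[-3.0] * 9)
bad = slor(model_logprob=-26.0, unigram_logprobs=[-3.0] * 9)
assert good > bad  # the model prefers the grammatical member of the pair
```

Normalizing out length and word frequency matters because raw sentence log-probabilities systematically penalize longer sentences and rarer words.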
Next week I'll be in Vienna for my first *ACL conference!
I will present our new BLiMP-NL dataset for evaluating language models on Dutch syntactic minimal pairs and human acceptability judgments.
Tuesday, July 29th, 16:00-17:30, Hall X4 / X5 (Austria Center Vienna)
24.07.2025 15:30
I'm sharing a Colab notebook on using large language models for cognitive science! GitHub repo: github.com/MarcoCiappar...
It's geared toward psychologists & linguists and covers extracting embeddings, computing predictability measures, and comparing models across languages & modalities (vision). See examples in the thread.
18.07.2025 13:39
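One of the predictability measures such notebooks typically compute is surprisal, the negative log probability of a word in context. A self-contained sketch (the probabilities below are made up; in practice they come from a language model):

```python
import math

def surprisal(p):
    """Surprisal in bits: -log2 of a word's in-context probability."""
    return -math.log2(p)

# Toy conditional probabilities standing in for LM outputs:
p_predictable = 0.5   # e.g. "coffee" after "a cup of"
p_surprising = 0.01   # e.g. "gravel" after "a cup of"
assert surprisal(p_surprising) > surprisal(p_predictable)
```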
Many LM applications can be formulated as text generation conditioned on some (Boolean) constraint.
Generate a…
- Python program that passes a test suite.
- PDDL plan that satisfies a goal.
- CoT trajectory that yields a positive reward.
The list goes on…
How can we efficiently satisfy these constraints?
13.05.2025 14:22
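The naive baseline for all of these is rejection sampling: generate unconditionally and keep the first sample that passes the Boolean check. It is correct but arbitrarily wasteful when valid outputs are rare, which is exactly the inefficiency the thread is asking about. A toy sketch (all names and the toy constraint are illustrative):

```python
import random

def constrained_generate(sample, check, max_tries=1000):
    """Rejection sampling: draw unconditionally, return the first
    candidate that satisfies the Boolean constraint."""
    for _ in range(max_tries):
        candidate = sample()
        if check(candidate):
            return candidate
    raise RuntimeError("no valid sample found")

# Toy instance: "generate" a number, constraint = divisible by 7.
random.seed(0)
x = constrained_generate(lambda: random.randint(0, 99), lambda n: n % 7 == 0)
assert x % 7 == 0
```

With a 1-in-7 constraint this terminates quickly; with a one-in-a-million constraint (a program passing a test suite, say), it does not, which motivates smarter ways of conditioning the generator itself.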
New paper! **The cerebellar components of the human language network**
with: @hsmall.bsky.social @moshepoliak.bsky.social @gretatuckute.bsky.social @benlipkin.bsky.social @awolna.bsky.social @aniladmello.bsky.social and @evfedorenko.bsky.social
www.biorxiv.org/content/10.1...
1/n
21.04.2025 15:19
PINEAPPLE, LIGHT, HAPPY, AVALANCHE, BURDEN
Some of these words are consistently remembered better than others. Why is that?
In our paper, just published in J. Exp. Psychol., we provide a simple Bayesian account and show that it explains >80% of variance in word memorability: tinyurl.com/yf3md5aj
10.04.2025 14:38
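The paper's actual model is not reproduced here, but one common way a "simple Bayesian account" of memorability is formalized treats recognition as posterior inference: a word is remembered well when its memory trace has few strong competitors. A toy sketch under that assumption (the function name, mechanism, and all numbers are illustrative, not the paper's):

```python
def recall_posterior(likelihood, prior, competitor_evidence):
    """Bayesian sketch: posterior probability that a retrieved trace
    belongs to the target word rather than to competing words."""
    target = likelihood * prior
    return target / (target + competitor_evidence)

# A distinctive word (weak competitors) vs. a more generic one:
distinctive = recall_posterior(likelihood=0.9, prior=0.01, competitor_evidence=0.001)
common = recall_posterior(likelihood=0.9, prior=0.01, competitor_evidence=0.05)
assert distinctive > common  # fewer competitors -> higher memorability
```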
New preprint w/ @jennhu.bsky.social @kmahowald.bsky.social : Can LLMs introspect about their knowledge of language?
Across models and domains, we did not find evidence that LLMs have privileged access to their own predictions. (1/8)
12.03.2025 14:31
In conclusion, our results show that (1) LMs are broadly applicable models of the human language system across languages, and (2) there is a shared component in the processing of different languages (14/14)
04.02.2025 18:03
Languages vary greatly in form, but there is massive overlap in the concepts they can express. We speculate that this shared meaning space is responsible for successful encoding transfer, but we'll look into this more in future work (13/)
04.02.2025 18:03
What supports the transfer of the encoding models? Form and meaning are two promising candidates. However, form-based (phonological, phonetic, syntactic) language similarity does not predict transfer performance (12/)
04.02.2025 18:03
Encoding models trained on existing fMRI datasets successfully predicted responses in new languages, generalizing across stimulus types and modalities (11/)
04.02.2025 18:03
In Study II, we tested transfer in a more stringent condition: training encoding models on existing fMRI datasets (sentence reading: Pereira2018, Tuckute2024; passage listening: Study I data, NatStories) and testing them on newly collected fMRI data in 9 new languages (10/)
04.02.2025 18:03
In the "across" condition, performance improves for models with stronger cross-lingual semantic alignment (where translations cluster together in the embedding space) (9/)
04.02.2025 18:03
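Cross-lingual semantic alignment of the kind described above is commonly quantified by checking that translation pairs sit close together in embedding space, e.g. via cosine similarity. A minimal sketch with hand-made vectors (these are not real embeddings, just an illustration of the measure):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# A translation pair that clusters together vs. one that does not:
en_dog, nl_hond = [1.0, 0.2, 0.0], [0.9, 0.3, 0.1]      # well aligned
en_dog2, nl_hond2 = [1.0, 0.2, 0.0], [0.0, 1.0, 0.8]    # poorly aligned
assert cosine(en_dog, nl_hond) > cosine(en_dog2, nl_hond2)
```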
But which model properties influence LM-to-brain alignment across languages?
In the "within" condition, encoding performance is highest for models with good next-word prediction abilities (8/)
04.02.2025 18:03
We also replicated in a cross-lingual setting the finding that the best fit to brain responses is obtained in intermediate-to-deep layers (in each subplot pair, the left panel is "within", the right "across") (7/)
04.02.2025 18:03
We evaluated 20 multilingual LMs with different architectures and training objectives. All of them predicted brain responses in the various languages ("within") and, critically, generalized zero-shot to unseen languages ("across") (6/)
04.02.2025 18:03
Critically, we fit two kinds of encoding models:
1. "within" encoding models, trained and tested on data from a single language with cross-validation
2. "across" encoding models, trained on N-1 languages and tested on the left-out language (5/)
04.02.2025 18:03
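The two conditions above can be sketched with ordinary ridge regression, using random matrices as stand-ins for word embeddings and brain responses (everything here, including the simulated shared mapping, is an illustrative assumption, not the paper's pipeline):

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X'X + lam*I)^-1 X'y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
w_true = rng.normal(size=5)  # shared embedding-to-brain mapping
# Toy "languages": each has its own stimuli but the same mapping.
data = {}
for lang in ["nl", "en", "fi"]:
    X = rng.normal(size=(200, 5))                 # stand-in embeddings
    y = X @ w_true + 0.1 * rng.normal(size=200)   # stand-in brain responses
    data[lang] = (X, y)

# "within": train and test inside one language (simple split here).
Xw, yw = data["nl"]
w_within = ridge_fit(Xw[:150], yw[:150])

# "across": train on N-1 languages, test on the held-out one.
X_tr = np.vstack([data[l][0] for l in ["en", "fi"]])
y_tr = np.concatenate([data[l][1] for l in ["en", "fi"]])
w_across = ridge_fit(X_tr, y_tr)

# Both conditions predict held-out "nl" responses well.
r_within = np.corrcoef(Xw[150:] @ w_within, yw[150:])[0, 1]
r_across = np.corrcoef(Xw[150:] @ w_across, yw[150:])[0, 1]
assert r_within > 0.5 and r_across > 0.5
```

The "across" fit only transfers here because the simulated mapping is shared across languages, which is precisely the hypothesis the zero-shot condition tests.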
In Study I, we:
1. Present participants with auditory passages and record their brain responses in the language network
2. Extract contextualized word embeddings from multilingual LMs
3. Fit encoding models predicting brain activity from the embeddings (4/)
04.02.2025 18:03
We address these questions through two studies, combining existing (12 languages, 24 participants) and newly collected fMRI data (9 languages, 27 participants). (3/)
04.02.2025 18:03
We ask two core questions:
1. Does the LM-brain alignment generalize to typologically diverse languages?
2. Are brain representations similar across languages? (2/)
04.02.2025 18:03
MSc student @ University of Copenhagen
PhD Candidate @UCIrvine
Computational (Psycho)linguistics & Cognitive Science
https://shiupadhye.github.io/
Headed by Lucia Melloni @ae.mpg.de @ruhr-uni-bochum.de @nyu.edu.
We care about the brain, consciousness, cognition, & work actively towards a better science culture. This account is jointly run by lab members.
CS PhD Candidate at Stanford NeuroAI Lab
Developmental computational cognitive neuroscientist at Trinity College Dublin. We scan infants to understand the emergence of cognition, and how it is disrupted by brain injury. Director of the Trinity College Institute of Neuroscience.
Assemblymember. Democratic Nominee for Mayor of NYC. Running to freeze the rent, make buses fast + free, and deliver universal childcare. Democratic Socialist. zohranfornyc.com
PhD student in Interpretable Machine Learning at TU Berlin & BIFOLD
Postdoc @rhulpsychology.bsky.social. Interested in language and reading development across different writing systems.
NLP Researcher at ADAPT Centre | PhD
Machine Translation, Speech, LLMs
NLP/Computational Linguistics. Auteur driven films.
Postdoctoral Fellow at Harvard Kempner Institute. Trying to bring natural structure to artificial neural representations. Prev: PhD at UvA. Intern @ Apple MLR, Work @ Intel Nervana
Linguist, cognitive scientist at University of Stuttgart. I study language and how we understand it one word at a time.
cs phd student and kempner institute graduate fellow at harvard.
interested in language, cognition, and ai
soniamurthy.com
research tech @ relcog lab, uci | ombbhatt.github.io | that blue pi in 3b1b videos is my spirit animal
We are neuroscientists and psychologists at MIT who love to learn about kids' brains and help kids learn about their own brains!
PI: Rebecca Saxe
Lab: https://saxelab.mit.edu/
PhD student in the Object Vision Group at CIMeC, University of Trento. He/him
https://davidecortinovis-droid.github.io/
I post mainly about Neuroscience, Machine Learning, Complex Systems, or Stats papers.
Working on neural learning w/ @auksz.bsky.social @ccnberlin.bsky.social /BCCN/Free Univ Berlin.
I also play bass in a pop punk band:
https://linktr.ee/goodviewsbadnews
1st year PhD Student at @gronlp.bsky.social - University of Groningen
Language Acquisition - NLP
PhD Student in Cognitive Neuroscience @doellerlab.bsky.social
Professor at UW-Madison, parent, studies how babies learn, loves food, dogs, and politics.