#CoreCognition #LLM #multimodal #GrowAI We spent 3 years curating 1,503 classic experiments spanning 12 core concepts in human cognitive development, then evaluated 230 MLLMs with 11 different prompts, 5 times each, collecting over 3.8 million inference data points.
A thread (1/n) - #ICML2025 ✅
30.06.2025 06:07 — 👍 13 🔁 9 💬 1 📌 0
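For a sense of scale, a minimal sketch of how such an evaluation grid multiplies out (the `run_inference` helper is hypothetical, and reading the headline count as experiments × models × prompts is our assumption, not a claim from the thread):

```python
from itertools import product

N_EXPERIMENTS, N_MODELS, N_PROMPTS, N_REPEATS = 1503, 230, 11, 5

# Unique (experiment, model, prompt) cells alone:
# 1503 * 230 * 11 = 3,802,590 -- "over 3.8 million".
print(N_EXPERIMENTS * N_MODELS * N_PROMPTS)

def run_inference(experiment, model, prompt, repeat):
    """Hypothetical stand-in for a single MLLM call."""
    ...

# The full evaluation loop, with 5 repeats per cell:
for cell in product(range(N_EXPERIMENTS), range(N_MODELS),
                    range(N_PROMPTS), range(N_REPEATS)):
    run_inference(*cell)
```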
Beautiful to see this initiative from a group of like-minded PhD students working together! 🚀
11.06.2025 23:49 — 👍 9 🔁 4 💬 1 📌 0
GrowAI Team: @growai.bsky.social
12.06.2025 17:04 — 👍 1 🔁 0 💬 0 📌 0
New Paper Alert ‼️ Current VLMs completely fail at human gaze understanding 🙀 and scaling does NOT help ‼️
Humans, however, are sensitive to other people's gaze 🙄 👀 from an extremely early age 🧒
No mentors, no labs, only pre-doc students, 111 VLMs, and we did it 😎
11.06.2025 23:21 — 👍 6 🔁 5 💬 1 📌 1
With the amazing GrowAI team: Pinyuan Feng (equal contribution), Bingyang Wang, Tianwei Zhao, Suyang Yu, Qingying Gao, @hokin.bsky.social , Ziqiao Ma, Yijiang Li, & Dezhi Luo.
🧵11/11 🎉
12.06.2025 17:03 — 👍 2 🔁 0 💬 1 📌 0
Beyond understanding VLMs, this explanation suggests that VLM training should include more embodied social interaction, so that natural human-AI interaction can emerge from next-token/frame-prediction training. We also recommend better learning-curriculum design 📚.
🧵9/11
12.06.2025 17:03 — 👍 2 🔁 0 💬 1 📌 0
We leave this explanation open for further investigation. More broadly, this work shows how controlled studies can complement benchmarking: they supply findings that any explanation must account for, constraining the hypothesis space for understanding VLMs 🌟.
🧵8/11
12.06.2025 17:03 — 👍 2 🔁 0 💬 1 📌 0
Surprisingly, VLM accuracy does not differ between front views and side views, whereas human accuracy does (p<0.001). VLMs may rely on 👺 head orientation rather than 👀 eye gaze direction, making them "robust" to side views, which increase the geometric ambiguity of eye direction.
🧵7/11
12.06.2025 17:03 — 👍 2 🔁 0 💬 1 📌 0
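As a sketch of the kind of test behind a p-value like that, a chi-square on correct/incorrect counts by camera view (the counts below are made up for illustration):

```python
from scipy.stats import chi2_contingency

# Hypothetical [correct, incorrect] counts for human responses by view.
front = [950, 50]
side = [800, 200]

chi2, p, dof, expected = chi2_contingency([front, side])
print(f"chi2={chi2:.1f}, p={p:.2g}")  # small p => accuracy depends on view
```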
On the other hand, the performance of Gemini 1.5 Pro, GPT-4o, InternLM, Qwen2.5, and GLM approaches chance level as difficulty increases (with increasing proximity and more objects). They likely rely on heuristics that break down under difficult conditions.
🧵6/11
12.06.2025 17:03 — 👍 2 🔁 0 💬 1 📌 0
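A minimal sketch of the kind of breakdown behind this claim, comparing per-cell accuracy against the per-cell chance level (data and column names are hypothetical):

```python
import pandas as pd

# Hypothetical per-trial records for one VLM.
df = pd.DataFrame({
    "proximity": [1, 1, 2, 2, 3, 3],
    "n_objects": [2, 4, 2, 4, 2, 4],
    "correct":   [1, 1, 1, 0, 0, 0],
})

# Accuracy per difficulty cell vs. chance (1 / number of options).
grouped = df.groupby(["proximity", "n_objects"])
summary = pd.DataFrame({
    "accuracy": grouped["correct"].mean(),
    "chance": 1 / grouped["n_objects"].first(),
})
print(summary)  # accuracy drifting toward chance as cells get harder
```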
Before that, we need to establish baselines. 65 human participants answered multiple-choice questions like the one below. Their performance degrades 📉 with increasing proximity, an increasing number of objects, and when the camera view switches from front to side.
🧵5/11
12.06.2025 17:03 — 👍 2 🔁 0 💬 1 📌 0
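One standard way to quantify that degradation is a logistic regression of per-trial correctness on the manipulated factors; a sketch with made-up data (the column names are ours, not the paper's):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical per-trial human responses (1 = correct).
df = pd.DataFrame({
    "correct":   [1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0],
    "proximity": [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3],
    "n_objects": [2, 3, 4, 4, 2, 3, 3, 4, 2, 2, 3, 4],
    "side_view": [0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1],
})

# Negative coefficients would indicate accuracy degrading with proximity,
# object count, and side (vs. front) views.
fit = smf.logit("correct ~ proximity + n_objects + side_view", data=df).fit()
print(fit.summary())
```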
Beyond the chance-level accuracy, VLMs produced every possible answer almost equally often. Are they random guessers? 🤡 Spoiler: top-tier VLMs are not, as our follow-up analysis of how their performance varies with the controlled variables shows. 🤗
🧵4/11
12.06.2025 17:03 — 👍 2 🔁 0 💬 1 📌 0
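The near-uniform answer distribution can be checked with a chi-square goodness-of-fit test against uniform; a sketch with invented counts:

```python
from scipy.stats import chisquare

# Hypothetical answer counts from one VLM over 900 four-option trials.
counts = [232, 221, 229, 218]  # options A, B, C, D

stat, p = chisquare(counts)  # null hypothesis: all options equally likely
print(f"chi2={stat:.2f}, p={p:.2f}")  # large p => indistinguishable from uniform
```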
We found that humans excel at gaze inference (~91% accuracy), but 94 of 111 VLMs performed about as well as if they had guessed randomly without looking at the images (~42%) 😲. Even the best, like GPT-4o, hit only ~50%. Bigger (or newer) VLMs are not better. 🫤
🧵3/11
12.06.2025 17:03 — 👍 2 🔁 0 💬 1 📌 0
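Since trials have 2-4 answer options, the ~42% chance baseline is a mix-weighted average of 1/n rather than a single 1/k; a sketch under an assumed option-count mix (the true mix is in the paper):

```python
# Chance accuracy = sum over trial types of P(type) * 1/(options in type).
# This mix is assumed for illustration, not taken from the paper.
mix = {2: 0.5, 3: 0.3, 4: 0.2}  # fraction of trials with n options

chance = sum(p / n for n, p in mix.items())
print(f"{chance:.1%}")  # 40.0% under this mix; the paper reports ~42%
```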
We systematically manipulated variables across 900 evaluation stimuli: View (left/right/front), Proximity (1-3 scale), Number of objects (2-4), etc., and tested 65 human participants (45 stimuli each) and 111 VLMs on them.
🧵2/11
12.06.2025 17:03 — 👍 2 🔁 0 💬 1 📌 0
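Crossing the named factors gives the base condition grid; a quick sketch (the factors hidden behind "etc." are omitted):

```python
from itertools import product

views = ["left", "right", "front"]
proximities = [1, 2, 3]
n_objects = [2, 3, 4]

conditions = list(product(views, proximities, n_objects))
print(len(conditions))        # 27 base cells
print(900 / len(conditions))  # ~33.3 stimuli per cell, before the extra factors
```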
👁️ 𝐂𝐚𝐧 𝐕𝐢𝐬𝐢𝐨𝐧 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞 𝐌𝐨𝐝𝐞𝐥𝐬 (𝐕𝐋𝐌𝐬) 𝐈𝐧𝐟𝐞𝐫 𝐇𝐮𝐦𝐚𝐧 𝐆𝐚𝐳𝐞 𝐃𝐢𝐫𝐞𝐜𝐭𝐢𝐨𝐧?
Knowing where someone looks is key to a Theory of Mind. We test 111 VLMs and 65 humans to compare their inferences.
Project page: grow-ai-like-a-child.github.io/gaze/
🧵1/11
12.06.2025 17:03 — 👍 3 🔁 0 💬 1 📌 1
Sam is 100% correct on this. Indeed, human babies have essential cognitive priors such as object permanence, continuity, and boundaries, a 3D Euclidean understanding of space, etc.
We spent 2 years systematically examining MLLMs and showing that they lack these priors: arxiv.org/abs/2410.10855
24.05.2025 05:55 — 👍 21 🔁 5 💬 0 📌 0