Yingtian (David) Tang @davidtyt

UPDATE
project page: yingtiandt.github.io/dynamic-visi...

04.08.2025 03:27 — 👍 0 🔁 0 💬 0 📌 0

8/🖼️ Big Picture 
Optimizing to model world dynamics leads to brain-like representations. 

🧠 The visual system isn't a patchwork of modules — it’s a unified system built on shared core principles.

30.07.2025 13:39 — 👍 2 🔁 0 💬 1 📌 0

7/🧠 Finding 4
We introduce task-based functional localization. 
It:
1. Recovers many prior neuroscience results in a unified way
2. Reveals new structure in action understanding pathways

A novel, scalable approach to functional brain mapping.

30.07.2025 13:37 — 👍 1 🔁 0 💬 1 📌 0

6/🌀 Finding 3
Putting observations together: 
• Single-objective models align with all regions and behaviors
• Cortex shows hybrid, smooth representation transitions

💡 A new perspective: the brain may implement a shared feature backbone — reused for diverse tasks, just like a “foundation model”.

30.07.2025 13:37 — 👍 1 🔁 0 💬 1 📌 0

5/🌐 Finding 2.2
These two aren’t isolated — they’re:
• Blended across ventral & dorsal streams 
• Smoothly mapped across the cortex

So, the visual system isn’t modular — it’s highly distributed, and the classic stream separation theory appears oversimplified.

30.07.2025 13:36 — 👍 1 🔁 0 💬 1 📌 0

4/🌐 Finding 2.1 
So, what does the brain actually compute during dynamic vision?

Across 10 cognitive tasks (e.g., pose, social cues, action), just two suffice to explain brain-like representations:
• Object form
• Appearance-free motion

30.07.2025 13:35 — 👍 2 🔁 0 💬 1 📌 0

3/📊 Finding 1 
✅ Dynamic models > static image models > classic vision models 
✅ Across both dorsal & ventral regions 
✅ Across neural & behavioral alignment

Best match to brain: V-JEPA.
In general, learning world dynamics yields alignment with the whole visual system.

30.07.2025 13:34 — 👍 1 🔁 0 💬 1 📌 0

2/🧪 Approach 
We benchmarked diverse video models, each with a different pretraining objective. 

Then: tested how well they predict human fMRI responses to natural movies. 
🧠 ~10,000 voxels, whole visual system.
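
For readers new to this kind of benchmark, here is a minimal sketch of how "predicting fMRI responses" from model features is commonly scored. It is an illustration under assumptions, not the pipeline from the preprint: the thread does not name the regression method, so ridge regression and per-voxel Pearson correlation are stand-ins, the data are synthetic, and the names `fit_ridge`, `voxelwise_correlation`, and `encoding_score` are hypothetical.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def voxelwise_correlation(Y_true, Y_pred):
    """Pearson correlation between true and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


def encoding_score(features, voxels, n_train, alpha=1.0):
    """Fit on the first n_train time points, score on the held-out rest."""
    X_train, X_test = features[:n_train], features[n_train:]
    Y_train, Y_test = voxels[:n_train], voxels[n_train:]
    W = fit_ridge(X_train, Y_train, alpha)
    return voxelwise_correlation(Y_test, X_test @ W)


# Synthetic stand-ins (assumption): 200 time points, 16 model features, 50 voxels.
rng = np.random.default_rng(0)
features = rng.standard_normal((200, 16))
true_W = rng.standard_normal((16, 50))
voxels = features @ true_W + 0.1 * rng.standard_normal((200, 50))

scores = encoding_score(features, voxels, n_train=150)
print(scores.shape, float(scores.mean()))
```

In a real comparison, `features` would come from a video model's activations on the movie stimuli, and the per-voxel scores would be compared across models and brain regions.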

30.07.2025 13:34 — 👍 1 🔁 0 💬 1 📌 0

1/🔍 Motivation 
The brain is thought to process vision through two streams: 
🖼 Ventral — objects, form, identity 
🧭 Dorsal — motion, spatial layout, actions

Image models explain ventral well. But: what about dorsal? Can one model do both?

30.07.2025 13:33 — 👍 2 🔁 0 💬 1 📌 0

🚨 New research: Can the brain's complex visual system — ventral & dorsal processing streams — arise from a single goal?

We study dynamic vision and reveal how object and motion recognition — long thought to be separate — could emerge from the same underlying goal.

30.07.2025 13:32 — 👍 1 🔁 0 💬 1 📌 0

Many-Two-One: Diverse Representations Across Visual Pathways Emerge from A Single Objective
How the human brain supports diverse behaviours has been debated for decades. The canonical view divides visual processing into distinct "what" and "where/how" streams – however, their origin and inde...

🧠 NEW PREPRINT
Many-Two-One: Diverse Representations Across Visual Pathways Emerge from A Single Objective
www.biorxiv.org/content/10.1...

30.07.2025 13:31 — 👍 27 🔁 11 💬 2 📌 5
