Ken Shirakawa

@kencan7749.bsky.social

Ph.D. candidate at Kyoto University and ATR / Brain decoding / fMRI / neuroAI / neuroscience

22 Followers  |  43 Following  |  16 Posts  |  Joined: 16.11.2024

Posts by Ken Shirakawa (@kencan7749.bsky.social)

NSD-synthetic, the out-of-distribution companion dataset of NSD consisting of 7T fMRI responses to 284 artificial images, is now published.

#NeuroAI #CompNeuro #neuroscience #AI

doi.org/10.1038/s414...

12.02.2026 14:46 | 👍 24    🔁 14    💬 0    📌 0

This thread aligns closely with the core claim of our paper.

While naturalistic stimuli are highly valuable, large-scale natural data can yield spurious successes due to unintended shortcuts in complex analysis pipelines.

doi.org/10.1016/j.ne...

10.02.2026 02:20 | 👍 0    🔁 0    💬 0    📌 0

Our paper is now accepted at Neural Networks!

This work builds on our previous threads on X, updated with deeper analyses.

We revisit brain-to-image reconstruction using NSD + diffusion models and ask: do they really reconstruct what we perceive?

Paper: doi.org/10.1016/j.ne...
🧵1/12

13.06.2025 09:17 | 👍 3    🔁 3    💬 1    📌 1

And here's an experimental podcast-style summary of the paper, generated with NotebookLM under my direction!
Link: notebooklm.google.com/notebook/9c8...

13.06.2025 10:16 | 👍 0    🔁 0    💬 0    📌 0

This project wouldn't have been possible without the support of all our lab members.
Huge thanks to our co-authors, and especially to Prof. Kamitani (@ykamit.bsky.social), for their invaluable support throughout this work!

13.06.2025 09:23 | 👍 1    🔁 0    💬 0    📌 0

Our paper goes further with formal analyses, including mathematical analysis, simulations, analysis of AI model representations, evaluation pitfalls, and meta-level insights into "realistic" reconstruction.

If this thread sparked your interest, please take a look at our paper!

13.06.2025 09:22 | 👍 0    🔁 0    💬 1    📌 0
A 7T fMRI dataset of synthetic images for out-of-distribution modeling of vision
Large-scale visual neural datasets such as the Natural Scenes Dataset (NSD) are boosting NeuroAI research by enabling computational models of the brain with performances beyond what was possible just ...

NSD's image diversity is smaller than expected, but this doesn't diminish its value. New datasets like NSD-synthetic (arxiv.org/abs/2503.06286) and NSD-imagery (www.arxiv.org/abs/2506.06898) will also be valuable. Still, we should choose data splits that align with our research goals.

13.06.2025 09:22 | 👍 0    🔁 0    💬 1    📌 0

So, how should we interpret these reconstruction methods? We argue they're better understood as visualizations of decoded content, not true reconstructions.
Visualization itself also has value, but it's crucial to recognize the huge gap between visualization and reconstruction.

13.06.2025 09:21 | 👍 0    🔁 0    💬 1    📌 0

Taken together, our results suggest recent diffusion-based reconstructions are a mix of classification into trained categories and hallucination by generative AIs.
This deviates fundamentally from genuine visual reconstruction, which aims to recover arbitrary visual experiences.

13.06.2025 09:21 | 👍 0    🔁 0    💬 1    📌 0

What about the Generator (diffusion model)?
We fed it true image features instead of predicted ones.
The outputs were semantically similar but perceptually quite different.
It seems the Generator relies mainly on semantic features, with less focus on perceptual fidelity.

13.06.2025 09:21 | 👍 0    🔁 0    💬 1    📌 0

Given the overlap between training/test sets, can the Translator predict test stimuli effectively?

Careful identification analyses revealed a fundamental limitation in generalizing beyond the training distribution.

The Translator, though a regressor, behaves more like a classifier.

13.06.2025 09:21 | 👍 0    🔁 0    💬 1    📌 0
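
An identification analysis like the one mentioned in the post above can be sketched as follows. This is a minimal pure-Python illustration under assumptions, not the paper's exact procedure: the function names and the toy vectors are hypothetical, and Pearson correlation is assumed as the similarity measure. Each predicted feature vector is compared with the true features of the correct stimulus and of every other candidate, and the score is the fraction of pairwise comparisons the correct stimulus wins.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def pairwise_identification_accuracy(predicted, true):
    """Fraction of pairwise comparisons won by the correct candidate.

    predicted[i] is the decoded estimate of true[i].
    """
    wins = 0
    comparisons = 0
    for i, prediction in enumerate(predicted):
        r_correct = pearson(prediction, true[i])
        for j, distractor in enumerate(true):
            if j == i:
                continue
            # The correct candidate wins if it correlates more strongly
            # with the prediction than the distractor does.
            if r_correct > pearson(prediction, distractor):
                wins += 1
            comparisons += 1
    return wins / comparisons


# Toy data (hypothetical): three "stimuli" with 4-dimensional features.
true_features = [
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 1.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.5],
]
# Noisy but informative predictions of those features.
predictions = [
    [0.9, 0.1, 0.0, 0.4],
    [0.1, 0.8, 0.1, 0.5],
    [0.0, 0.2, 0.9, 0.6],
]
accuracy = pairwise_identification_accuracy(predictions, true_features)
```

Chance level for this score is 0.5. Running it separately on candidates drawn from inside and outside the training distribution is one way to probe the kind of generalization limit described above.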

We first examined the Latent features. A UMAP visualization of NSD's CLIP features (A) revealed:

- distinct clusters (~40)
- substantial overlap between training and test sets

NSD test images were also perceptually similar to training images (B), unlike in the carefully curated Deeprecon dataset (C).

13.06.2025 09:20 | 👍 0    🔁 0    💬 1    📌 0
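
The post above relies on a UMAP visualization. A simpler numeric check of train/test overlap, swapped in here purely for illustration, is the nearest-neighbour cosine similarity between each test feature vector and the training set. The function names and toy vectors below are hypothetical; real use would pass CLIP embeddings of the training and test images.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_train_similarity(test_features, train_features):
    """For each test vector, the highest cosine similarity to any training vector."""
    return [
        max(cosine_similarity(test_vec, train_vec) for train_vec in train_features)
        for test_vec in test_features
    ]


# Toy feature vectors (hypothetical stand-ins for image embeddings).
train = [[1.0, 0.0], [0.0, 1.0]]
overlapping_test = [[0.9, 0.1]]  # close to a training item
novel_test = [[-1.0, -1.0]]      # unlike any training item

overlap_score = nearest_train_similarity(overlapping_test, train)[0]
novel_score = nearest_train_similarity(novel_test, train)[0]
```

A test set whose scores cluster near 1 has close neighbours in the training set for almost every item, which is the overlap pattern the post describes for NSD.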

To better understand what was happening, we decomposed these methods into a Translator-Generator pipeline.

The Translator maps brain activity to the Latent features, and the Generator converts those features into images.

We analyzed each component in detail.

13.06.2025 09:19 | 👍 0    🔁 0    💬 1    📌 0
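
The two-stage decomposition described above can be sketched in a few lines. This is an illustration under assumptions, not the pipeline of any specific method: the Translator is assumed to be a closed-form ridge regression, the Generator is a stub standing in for a diffusion model, and all function names, array shapes, and the synthetic data are hypothetical.

```python
import numpy as np


def fit_translator(brain, latents, alpha=1.0):
    """Ridge regression weights mapping brain activity to latent features."""
    n_voxels = brain.shape[1]
    gram = brain.T @ brain + alpha * np.eye(n_voxels)
    return np.linalg.solve(gram, brain.T @ latents)  # shape: (voxels, latent_dim)


def translate(brain, weights):
    """Translator: predict latent features from brain activity."""
    return brain @ weights


def generate(latent):
    """Generator stub: a real pipeline would condition a diffusion model here."""
    return {"conditioning": latent}  # stands in for a generated image


# Synthetic data (hypothetical): 200 training samples, 20 voxels, 5 latent dims.
rng = np.random.default_rng(0)
true_weights = rng.normal(size=(20, 5))
train_brain = rng.normal(size=(200, 20))
train_latents = train_brain @ true_weights + 0.01 * rng.normal(size=(200, 5))

weights = fit_translator(train_brain, train_latents, alpha=1e-3)

test_brain = rng.normal(size=(10, 20))
predicted_latents = translate(test_brain, weights)
images = [generate(latent) for latent in predicted_latents]
```

Keeping the two stages as separate functions makes it possible to test each in isolation, for example by feeding the Generator true latent features instead of predicted ones.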

We tested whether these methods generalize beyond NSD.
They worked well on NSD (A), but performance severely dropped on Deeprecon (B).
The latest MindEye2 even generated training-set categories unrelated to test stimuli.
So what's behind this generalization failure?

13.06.2025 09:18 | 👍 0    🔁 0    💬 1    📌 0

"Reconstruction" is often seen as recovering any instance from a space of interest.

Prior works (e.g., Miyawaki+ 2008, Shen+ 2019) pursued this goal.

Recent studies report realistic reconstructions from NSD using CLIP + diffusion models.

But do they truly achieve this goal?

13.06.2025 09:18 | 👍 0    🔁 0    💬 1    📌 0

Yukiyasu Kamitani, Misato Tanaka, Ken Shirakawa
Visual Image Reconstruction from Brain Activity via Latent Representation
https://arxiv.org/abs/2505.08429

14.05.2025 07:10 | 👍 4    🔁 3    💬 0    📌 0

One big issue with some of the previous claims is that NSD, the massive 7T fMRI dataset of 1000s of images, might not be the right dataset to test these hypotheses. The reason is that it is built on MS COCO and the similarity between its training and test sets is too high. arxiv.org/abs/2405.10078 16/n

11.12.2024 22:18 | 👍 6    🔁 1    💬 1    📌 0

I'm currently concerned about what the brain's encoding model predicts. Given that the target brain states are collected under naturalistic conditions and the inputs of the encoding model are derived from a deep neural network, I am not sure what the predictions actually represent.

16.11.2024 14:08 | 👍 0    🔁 0    💬 0    📌 0
Spurious reconstruction from brain activity
Advances in brain decoding, particularly visual image reconstruction, have sparked discussions about the societal implications and ethical considerations of neurotechnology. As these methods aim to re...

arxiv.org/abs/2405.10078

16.11.2024 13:42 | 👍 1    🔁 0    💬 0    📌 0