Ole Goltermann

@olegolt.bsky.social

Doctoral Researcher @isnlab.bsky.social | part of Max Planck School of Cognition | previously @MPI_CBS, @MPI_NL & @univienna https://cognition.maxplanckschools.org/en/doctoral-candidates/ole-goltermann

179 Followers  |  230 Following  |  28 Posts  |  Joined: 09.11.2023

Latest posts by olegolt.bsky.social on Bluesky

What do representations tell us about a system? Image of a mouse with a scope showing a vector of activity patterns, and a neural network with a vector of unit activity patterns
Common analyses of neural representations: Encoding models (relating activity to task features) drawing of an arrow from a trace saying [on_____on____] to a neuron and spike train. Comparing models via neural predictivity: comparing two neural networks by their R^2 to mouse brain activity. RSA: assessing brain-brain or model-brain correspondence using representational dissimilarity matrices

In neuroscience, we often try to understand systems by analyzing their representations, using tools like regression or RSA. But are these analyses biased towards discovering a subset of what a system represents? If you're interested in this question, check out our new commentary! Thread:

05.08.2025 14:36 - 👍 84    🔁 30    💬 5    📌 0

Integrating and interpreting brain maps | doi.org/10.1016/j.ti...

Imaging and recording technologies make it possible to map multiple biological features of the brain. How can these features be conceptually integrated into a coherent understanding of brain structure and function? ⬇️

04.08.2025 14:26 - 👍 53    🔁 32    💬 1    📌 0
Addressing artifactual bias in large, automated MRI analyses of brain development - Nature Neuroscience As large-scale neurodevelopmental MRI studies gain prominence, the authors identify tradeoffs between sample size and quality control that can dramatically affect results, and they evaluate a range of...

Analysis of >11,000 pediatric MRI scans suggests that suboptimal image quality may introduce bias in cortical thickness and area estimates in over half the cases 🤔🧐

www.nature.com/articles/s41...

04.08.2025 08:01 - 👍 22    🔁 7    💬 0    📌 1
No evidence for a link between mental health symptoms and pain thresholds Previous studies have suggested associations between pain perception and psychological factors such as mood, distress, fatigue, and quality of life. However, these factors and their relationship to...

Does sensitivity to acute pain correlate with mental health?
In our new paper led by @rebeccaboe.bsky.social & @francescafardo.bsky.social we analyzed thermal pain thresholds in 565 adults and found no link to mental health symptoms.
👉 doi.org/10.1080/1061...

04.08.2025 07:20 - 👍 52    🔁 9    💬 4    📌 0
Spatial specificity of the functional gradient echo and spin echo BOLD signal across cortical depth at 7 T Functional magnetic resonance imaging (fMRI) at high magnetic field strengths (≥ 7 T) is a promising technique to study the functioning of the human brain at the spatial scale of cortical columns and ...

I’m happy to share that our manuscript on the cortical depth-dependency of the GE- and SE-BOLD point spread function at 7 Tesla is now available on bioRxiv!

www.biorxiv.org/content/10.1...

01.08.2025 13:38 - 👍 11    🔁 7    💬 1    📌 1

🚨 Fresh preprint w/ @helenblank.bsky.social!

How does the brain acquire expectations about a conversational partner, and how are priors integrated w/ sensory inputs?

Current evidence diverges. Is it prediction error? Sharpening?

Spoiler: It's both.👀

🧵1/16

www.biorxiv.org/content/10.1...

01.08.2025 11:24 - 👍 13    🔁 6    💬 1    📌 1

This emphasizes that the reported test set is unrepresentative. Remarkably, even when their ensemble is built from the 10% worst-performing models based on training (average AUC = 0.48), rather than the 10% best, it still achieves an AUC of 0.88 on the test set.

02.08.2025 08:39 - 👍 1    🔁 0    💬 0    📌 0

Regarding our second concern and their additional analyses: What they actually demonstrate is that, regardless of how poorly the model performs during training, it consistently achieves an AUC of 0.88 on the test set.

02.08.2025 08:39 - 👍 0    🔁 0    💬 1    📌 0

Unfortunately, their reply does not address our first concern. It fails to explain why they report an AUC of 1.0 for a validation set, which by definition is not a proper validation set, and why a random seed of 23 was chosen for this analysis.

02.08.2025 08:39 - 👍 2    🔁 0    💬 1    📌 0

Correlation is not cognition.[1] Stop with the nonsense.

Every day we slip further into the abyss. I often regret reading emails from other academics.

[1] Guest & @andreaeyleen.bsky.social (2023). On Logical Inference over Brains, Behaviour, and Artificial Neural Networks. doi.org/10.1007/s421...

31.07.2025 07:34 - 👍 71    🔁 15    💬 3    📌 1

Key modulatory regions such as DLPFC, pons & insula showed interactions between the two types of expectancy. The only region that showed common effects was the rostral anterior cingulate cortex, a region that has long been implicated in placebo and expectancy. 3/4

31.07.2025 00:28 - 👍 2    🔁 1    💬 1    📌 0

Almost missed that this is out! Former postdoc Liz Necka led this long overdue FMRI study formally comparing two types of pain modulation: Placebo analgesia & predictive cues. TLDR: these are NOT the same! Placebo analgesia reduced cue effects, & brain mechanisms were nearly all dissociable. 1/4

31.07.2025 00:28 - 👍 26    🔁 10    💬 1    📌 0
Contrasting photographs of the night-time skylines of Manhattan (left) and Nijmegen (right), with matching genome-wide association plots underneath each.

Not sure who came up with "Manhattan Plot", but in 2014 I coined the alternative term "Nijmegen Plot" (inspired by the Dutch town where I live) to describe underwhelming results from our earliest genome-wide association scans of language/reading traits.

28.07.2025 16:41 - 👍 79    🔁 16    💬 1    📌 1
Large-scale genome-wide analyses of stuttering - Nature Genetics Genome-wide analyses in over one million self-reported cases and controls identify genetic variants associated with stuttering and find genetic correlations with autism, depression and impaired musica...

Exciting paper on genetic influences on speech fluency, out now in @natgenet.nature.com. In a big step forward for the field, genome scans of almost 100,000 people with self-reported stuttering, & 1 million controls, identify 57 associated loci. Great work by @piperbelow.bsky.social & her team.🗣️🧬🧪

28.07.2025 11:39 - 👍 56    🔁 21    💬 2    📌 0
Assessing the predictive value of peak alpha frequency for... : PAIN Peak alpha frequency (PAF) is, thus, discussed as a potential biomarker and novel target for neuromodulatory treatments of pain. Here, we scrutinized the generalizability of the relation between PAF an...

If you're interested in the predictive value of peak alpha frequency (PAF) for pain, I can recommend a paper by May et al. (2025, PAIN) from the @ploner.bsky.social lab. They assessed the prediction performance of PAF using a multiverse analysis approach.

shorturl.at/uR9GB

28.07.2025 12:45 - 👍 2    🔁 0    💬 0    📌 0
Brain Surfaces of 70 primate species

To predict the behaviour of a primate, would you rather base your guess on a closely related species or one with a similar brain shape? We looked at brains & behaviours of 70 species, you’ll be surprised!

🧵Thread on our new preprint with @r3rt0.bsky.social, doi.org/10.1101/2025...

27.07.2025 17:26 - 👍 447    🔁 206    💬 13    📌 23
Can AI in medicine save lives, Simon Hofmann? Neuroscientist Simon Hofmann investigates how artificial intelligence can be used in medicine.

Can artificial intelligence help with #dementia diagnosis? In the new #AchMensch podcast, Simon Hofmann @mpicbs.bsky.social explains how doctors could be supported by #AI in the future and what has so far held back its use detektor.fm/wissen/ach-m... #medicine #brain @detektorfm.bsky.social

09.07.2025 13:26 - 👍 9    🔁 3    💬 0    📌 0

Any time a paper reports an AUC of 1.0 on test data, my hackles are raised. This commentary by @olegolt.bsky.social @tspisak.bsky.social @christianbuchel.bsky.social nicely dissects problems with a recently reported "biomarker" for pain.

22.07.2025 16:05 - 👍 21    🔁 4    💬 1    📌 0
Post image 22.07.2025 16:55 - 👍 0    🔁 0    💬 0    📌 0
GitHub - olegolt/PAF_reanalysis Contribute to olegolt/PAF_reanalysis development by creating an account on GitHub.

All our code, including further comments, is available here:

github.com/olegolt/PAF_...

(13/13)

22.07.2025 15:24 - 👍 16    🔁 1    💬 1    📌 0

In conclusion, we simply note that the joint probability of observing both the reported validation and test set AUCs is extremely low ~0.004% (1 in 25,000). We therefore would like to encourage everyone to read the paper, examine our reanalysis, and let us know what they think.

👇 12/13

22.07.2025 15:24 - 👍 15    🔁 0    💬 1    📌 0
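
The ~0.004% figure in the post above can be reproduced from two numbers quoted elsewhere in this thread: 99.6% of training subsets gave an AUC below 1.0, and only 10 of 1,000 random splits reached a test AUC of 0.88. The sketch below multiplies the two tail probabilities. Treating the two events as independent is an assumption made here for illustration; the thread does not say how the authors combined them, and the function name is hypothetical.

```python
def joint_tail_probability(share_below_validation, hits_test, n_splits):
    """Joint probability of two rare outcomes, assuming independence."""
    # Chance of a validation-subset AUC as high as the reported 1.0.
    p_validation = 1.0 - share_below_validation
    # Chance of a test-set AUC as high as the reported 0.88.
    p_test = hits_test / n_splits
    # ASSUMPTION: the two events are independent, so probabilities multiply.
    return p_validation * p_test


joint = joint_tail_probability(0.996, hits_test=10, n_splits=1000)
print(f"joint probability: {joint:.4%}")
print(f"about 1 in {round(1 / joint):,}")
```

Under that independence assumption, 0.4% times 1% gives the 1-in-25,000 figure stated in the post.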

99% of the 1000 random splits yielded an AUC below 0.88. Only 10 reached that value - marking it clearly as an outlier, not a robust result. The chance of observing an AUC as low as 0.59 was about the same as hitting 0.88.

👇 11/13

22.07.2025 15:24 - 👍 10    🔁 0    💬 1    📌 0

To demonstrate this, we repeated the train-test split 1,000 times, keeping all other analysis steps identical to those used by the original authors. Average AUC dropped to 0.74 (accuracy to 0.68), revealing that the reported AUC is not a robust measure of the model's performance (Figure 2).

👇 10/13

22.07.2025 15:24 - 👍 10    🔁 0    💬 2    📌 0
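
A minimal sketch of the repeated-split idea described in the post above, not the code from the linked repository. It assumes synthetic data (118 subjects, two made-up features), a small gradient-descent logistic regression written in NumPy, and stratified splits with 38 test subjects. All function names are hypothetical, and the demo uses 200 splits instead of 1,000 to keep it fast.

```python
import numpy as np


def auc(labels, scores):
    """Rank-based AUC (Mann-Whitney U); tied scores share their mean rank."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    for value in np.unique(scores):
        tied = scores == value
        ranks[tied] = ranks[tied].mean()
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def fit_logistic(X, y, steps=200, lr=0.1):
    """Plain gradient-descent logistic regression with an intercept."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w


def repeated_split_aucs(X, y, n_test=38, n_splits=1000, seed=0):
    """Test-set AUC for many random stratified train-test splits."""
    rng = np.random.default_rng(seed)
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    n_test_pos = round(n_test * len(pos) / len(y))
    aucs = np.empty(n_splits)
    for i in range(n_splits):
        test = np.concatenate([
            rng.choice(pos, n_test_pos, replace=False),
            rng.choice(neg, n_test - n_test_pos, replace=False),
        ])
        train = np.setdiff1d(np.arange(len(y)), test)
        w = fit_logistic(X[train], y[train])
        test_scores = np.column_stack([np.ones(len(test)), X[test]]) @ w
        aucs[i] = auc(y[test], test_scores)
    return aucs


# Synthetic stand-in data; NOT the data analysed in the thread.
data_rng = np.random.default_rng(1)
y = np.repeat([0, 1], 59)
X = data_rng.normal(size=(118, 2)) + 0.8 * y[:, None]

aucs = repeated_split_aucs(X, y, n_splits=200)
print(f"mean AUC {aucs.mean():.2f}, range {aucs.min():.2f} to {aucs.max():.2f}")
```

The point of looking at the whole distribution is that a single split of a small sample can land far from the mean, so one reported test AUC says little without the spread around it.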

(2) AUC OF 0.88 FOR THE TEST SET IS AN OUTLIER

The reported test set AUC of 0.88 comes from a single random train-test split, based on 38 subjects in the test set. We are aware that this was preregistered, but as our additional analyses show, this split does not reflect the data very well.

👇 9/13

22.07.2025 15:24 - 👍 9    🔁 0    💬 2    📌 0

Instead of reporting those, the authors added this additional step of drawing 16 people again from the training set with a fixed seed of 23 (nowhere else used in their code) and report this metric as “validation” set AUC. This is both incorrect and misleading.

👇 8/13

22.07.2025 15:24 - 👍 11    🔁 0    💬 1    📌 2

Our reanalysis showed that 99.6% of all possible subsets yield lower AUCs than the reported 1.0 (see Figure 1). A bit surprising to us, appropriate performance metrics were already available in their own code: average cross-validation AUC (0.65) and locked model AUC (0.73).

👇 7/13

22.07.2025 15:24 - 👍 25    🔁 2    💬 1    📌 1
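
To illustrate why the AUC of a 16-person subset depends so strongly on who is drawn, here is a hedged sketch with synthetic scores, not the reanalysis code. It assumes 80 training subjects with fixed model scores, samples random subsets of 16 (rather than enumerating all possible subsets), and redraws any subset that contains only one class. The function names and the simulated effect size are illustrative assumptions.

```python
import numpy as np


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size


def subset_auc_distribution(labels, scores, subset_size=16, n_draws=10000, seed=0):
    """AUCs of random subsets drawn from data the model has already seen."""
    rng = np.random.default_rng(seed)
    aucs = []
    while len(aucs) < n_draws:
        idx = rng.choice(len(labels), subset_size, replace=False)
        # AUC is undefined when a subset holds a single class, so redraw.
        if 0 < labels[idx].sum() < subset_size:
            aucs.append(auc(labels[idx], scores[idx]))
    return np.array(aucs)


# Synthetic model scores for 80 "training" subjects; NOT the original data.
data_rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 40)
scores = data_rng.normal(size=80) + 0.87 * labels

full_auc = auc(labels, scores)
sub = subset_auc_distribution(labels, scores, n_draws=2000)
print(f"full-set AUC: {full_auc:.2f}")
print(f"subset AUC range: {sub.min():.2f} to {sub.max():.2f}")
print(f"share of subsets below 1.0: {(sub < 1.0).mean():.3f}")
```

On average the subset AUC matches the full-set AUC, but individual draws scatter widely, which is why a value obtained from one fixed seed can be unrepresentative.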

In principle, AUC on this subset should match the training set’s AUC of the locked model (0.73). But due to the small size, AUC varies widely depending on who’s included. Their AUC of 1.0 results from using a fixed random seed (23), which produced an unrepresentative AUC.

👇 6/13

22.07.2025 15:24 - 👍 13    🔁 0    💬 1    📌 0

(1) THE AUC FOR THE VALIDATION SET IS INCORRECT

The so-called validation set (n=16) for the reported AUC of 1.0 was drawn from the training data (n=80) after model training - meaning it wasn’t independent and it does not provide a valid performance estimate of a “validation” set.

👇 5/13

22.07.2025 15:24 - 👍 19    🔁 2    💬 2    📌 0

Motivated by this observation, we reviewed their code and reanalyzed the data, uncovering some issues that undermine the authors’ conclusions:

👇 4/13

22.07.2025 15:24 - 👍 12    🔁 1    💬 1    📌 0

We were impressed by the results but puzzled by the performance metrics: the winning model (logistic regression) shows an AUC of 0.65 on the training set, yet 1.0 on the validation and 0.88 on the test set (Figure 2B in their paper). Why does it perform worse on the data it was trained on?

👇 3/13

22.07.2025 15:24 - 👍 12    🔁 1    💬 1    📌 2

@olegolt is following 20 prominent accounts