3 of the researchers were participants... I know I wouldn't have been able to resist, either! Suit me up with my custom-printed barn-owl ears, please!
12.02.2026 18:41 · @thenewstats.bsky.social
Why did they do this? To find out if the human auditory system is flexible enough to learn to localize sounds vertically, the way barn owls do. The answer: partly! But honestly, I'm just excited that they *did* this.
12.02.2026 18:41
[Image alt text: figure from a research paper showing custom-printed barn-owl molds fitted to a human research participant. Don't worry if you can't see them -- they are not nearly as cool as the phrase 'barn-owl ears' would have you expect.]
Scientists created prosthetic barn-owl ears and got human participants to wear them for 3-59 days.
Yes, people walked around with barn-owl ears for up to 2 months *for science*.
This is the type of news I need today to restore a bit of my faith in humanity.
www.biorxiv.org/content/10.6...
Because traditional NHST doesn't allow the skeptic to win, we've maligned the very need for the skeptic to win: null results are uninteresting, just mistakes, you can't prove a negative, etc. But this is just rationalizing our way into accepting the limitations of the tool. Fruitful science must be able to rule in and rule out effects!
12.02.2026 18:33
"Near 0" findings have been just as essential for science as "not 0" findings. Pasteur, for example, showed that the quantity of bacteria formed under sterile conditions is near 0, disproving the theory of spontaneous generation.
12.02.2026 18:33
So, just tests with results that we like?
I think you are confusing "effect is near 0" with uninteresting. Not true! Near 0 is interesting (and vital to publish) when a leading theory predicted an effect. It's interesting (and vital) for producing parsimonious models.
Personally, I think we need all valid studies published. What we need (I think) is to think more critically about quality/validity: it is not about the p value obtained or even CI width; it is evaluated from construct validity, positive and negative controls, design features, overlapping evidence, etc.
12.02.2026 14:44
Can you clarify this a bit? Meaning, you don't think wide CIs (unclear) should be published but narrow ones (clear) should be? Or something else?
12.02.2026 14:40
A valid testing regime has the possible outcomes: yes, no, unclear. It makes no sense for a scientific literature to have only "yes" results. We prop up this regime with jaundiced ideas about null results that, on reflection, are poor epistemology.
12.02.2026 13:32
I really don't understand the jaundiced view of validly estimating no effect ("null results"). Don't detectives work by ruling out suspects? Isn't the greater part of learning weakening unhelpful associations? Don't we accept Popper's view that science advances through falsification?
12.02.2026 13:26
You run a study because you believe z is an important factor; your results estimate it is negligible. That is new and important knowledge. Theory A and Theory B make contrasting predictions (large effect, no effect); data estimating a negligible difference help arbitrate between them.
12.02.2026 13:20
Judgement of quality/validity should not be based on whether we like the results. Crap methods/ideas can diminish effects, leading to p > .05, or inflate effects, producing p < .05. Many studies do belong in the file drawer, but the p value generated is not a valid guide.
12.02.2026 13:14
Sterling, 1959, updated. We still aren't sharing science in a scientific manner. After 65 years of knowing the problem, how are we still struggling to address it?
11.02.2026 17:45
I haven't! Seems like a cool idea, though likely a good bit harder than p, df, and test-stat correspondence... but it would be cool!
29.01.2026 15:34
For R: github.com/ACCLAB/dabestr
For Python: github.com/ACCLAB/DABES...
So excited to see DABEST 2.0 is out: get bootstrapped estimation statistics for simple through complex designs, all with beautiful visualization, available in R and Python.
Check it out!
Pre-print describing new features for complex designs: www.biorxiv.org/content/10.6...
#stats
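The core quantity behind estimation graphics like DABEST's is a bootstrapped confidence interval on the effect size. Here's a minimal base-R sketch of that idea with made-up data (this is not the DABEST API, just the underlying concept; see the package docs for the real interface):

  # Percentile-bootstrap 95% CI for a mean difference
  set.seed(1)
  control <- rnorm(30, mean = 10, sd = 2)   # made-up data
  treated <- rnorm(30, mean = 11, sd = 2)
  boot_diffs <- replicate(5000,
    mean(sample(treated, replace = TRUE)) - mean(sample(control, replace = TRUE)))
  quantile(boot_diffs, c(0.025, 0.975))  # percentile version; DABEST uses a bias-corrected variant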
Looking forward to reading this retrospective on the promise and perils of organizing an RRR with in-person data collection.
From @aggieerin.bsky.social, @psforscher.bsky.social, and others.
#stats
doi.org/10.1111/spc3...
Comparing registrations to published papers is essential to research integrity - and almost no one does it routinely because it's slow, messy, and time-consuming.
RegCheck was built to help make this process easier.
Today, we launch RegCheck V2.
🧵
regcheck.app
esci 1.0.9 is now on CRAN.
No big changes, just a couple of bug fixes and some compatibility changes for statpsych 1.9. Still a great, easy way to get effect sizes and fair tests for many common designs.
cran.r-project.org/web/packages...
#stats
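For a quick taste, here's a minimal sketch of a two-group analysis with esci. The function name estimate_mdiff_two and its argument names are from my reading of the package docs and may differ slightly across versions, so check the help pages:

  library(esci)
  # Made-up example data: two independent groups
  set.seed(2)
  dat <- data.frame(
    condition = rep(c("control", "treatment"), each = 20),
    score = c(rnorm(20, 50, 10), rnorm(20, 55, 10))
  )
  # Mean difference with effect size and 95% CI
  res <- estimate_mdiff_two(data = dat,
                            outcome_variable = score,
                            grouping_variable = condition)
  res
  plot_mdiff(res)  # estimation plot of the difference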
A statistics textbook for the AI era
thenewstatistics.com/itns/2026/01...
16.01.2026 22:40
The Faculty for Undergraduate Teaching Workshop is back this summer, July 16-19, right outside of Chicago. A great conference with great people. Registration and abstract submission are open... it is going to be great!
#neuroscience
www.funfaculty.org/conference_2...
Proud to have played a small part in this as a data-collection site. And so proud of @luis-a-gomez.bsky.social, who organized this project at our university as an undergrad and has launched to great success in a clinical psych PhD program at Purdue.
14.01.2026 14:31
The stereotype-threat effect among women was virtually null and not significant (N = 1275, g = 0.04, SE = 0.08, 95% CI [-0.11; 0.20]), and considerably smaller than in the original study (N = 45, d = -0.82; 95% CI [-1.45; -0.18]).
Another threat to stereotype threat?
An RRR (in press, AMPPS) shows the ~0 impact of stereotype threat on female math performance.
Fantastic, arduous work from leads Andrea Stoevenbelt, Paulette C. Flore, and Jelte Wicherts (and DU alum @luis-a-gomez.bsky.social) #stats
osf.io/preprints/ps...
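As a quick arithmetic check, the reported interval follows from the usual normal approximation g ± 1.96 × SE (a sketch; the paper's exact method may differ):

  g  <- 0.04
  se <- 0.08
  g + c(-1, 1) * qnorm(0.975) * se
  # -0.117 to 0.197, matching the reported [-0.11; 0.20] up to rounding of g and SE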
And, of course, each barrier raised to junk will have some false positives or harms, so the more content-based protections we add, the worse off good-faith scientists become at the margin.
Maybe we need some type of reputational scoring system for individuals, institutions, and journals.
Of course, journals are trying: www.science.org/content/arti...
But this is whack-a-mole - this policy will bar one specific type of junk, and only after hundreds of junk papers have been indexed into the literature. I don't think journals are equipped to respond quickly to this kind of threat.
Some frank talk about paper mills from @sfnjournals.bsky.social (maybe @cbmetznancy.bsky.social).
Really hard to see how the integrity of scientific publishing will hold up against the incentives and the tools to cheat.
#neuroscience
www.eneuro.org/content/13/1...
Too many significance tests!!
Made this little graphic for my #stats class, showing the various kinds of (N)HST and how interpreting confidence intervals can replace all of them.
Made with #rstats #ggplot (duh)
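The idea in a few lines of base R: a single interval answers every one of those test-style questions at once (made-up sample):

  set.seed(42)
  x <- rnorm(25, mean = 0.4)   # made-up sample
  ci <- t.test(x)$conf.int     # 95% CI for the mean
  ci
  # Two-sided NHST: reject mu = 0 iff 0 lies outside the CI.
  # Directional claim: CI entirely above 0 -> effect is positive.
  # Equivalence / minimal-effect logic: check whether the CI sits
  # inside (or outside) a region of negligible effect you specify.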
Most neuroscience data is *not* normally distributed... so what should you do about it?
Here's a great tutorial from @mikemalekahmadi.bsky.social focused on regression techniques for non-normal data: www.eneuro.org/content/13/1...
#stats
#neuroscience
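One member of that family of techniques, as an illustration (my example, not necessarily the tutorial's): a generalized linear model, which replaces the normality assumption with a distribution suited to the outcome, e.g. a Gamma GLM for positive, right-skewed measures:

  # Made-up positive, right-skewed outcome (e.g., a latency)
  set.seed(7)
  x <- runif(100, 0, 2)
  y <- rgamma(100, shape = 2, rate = 2 / exp(0.5 + 0.8 * x))  # mean = exp(0.5 + 0.8x)
  fit <- glm(y ~ x, family = Gamma(link = "log"))
  summary(fit)
  confint(fit)  # profile-likelihood CIs for the coefficients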
Another paper I want to read, this one on the interpretation of effect sizes, with data on actual use in psychology.
Seems to be another contribution to the sentiment that we need to get over Cohen's d -- raw-score effect sizes are usually best!
doi.org/10.1057/s415...
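The contrast is easy to demonstrate: the raw difference keeps the outcome's original units, while d rescales by a (noisily estimated) pooled SD. A small sketch with made-up data:

  set.seed(3)
  a <- rnorm(40, 100, 15)   # made-up scores in original units
  b <- rnorm(40, 106, 15)
  t.test(b, a)$conf.int     # CI for the raw mean difference, in raw units
  sp <- sqrt(((length(a) - 1) * var(a) + (length(b) - 1) * var(b)) /
             (length(a) + length(b) - 2))   # pooled SD
  (mean(b) - mean(a)) / sp  # Cohen's d: same effect, rescaled to SD units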