Bob C-J and Geoff Cumming

@thenewstats.bsky.social

Open science, estimation statistics, and random thoughts from Bob Calin-Jageman and Geoff Cumming. https://thenewstatistics.com/itns/

333 Followers  |  194 Following  |  906 Posts  |  Joined: 09.11.2023  |  2.0375

Latest posts by thenewstats.bsky.social on Bluesky

3 of the researchers were participants... I know I wouldn't have been able to resist, either! Suit me up with my custom-printed barn-owl ears, please!

12.02.2026 18:41 - 👍 1    🔁 0    💬 0    📌 0

Why did they do this? To find out if the human auditory system is flexible enough to learn to localize sounds vertically, the way barn owls do. The answer: partly! But honestly, I'm just excited that they *did* this.

12.02.2026 18:41 - 👍 1    🔁 0    💬 1    📌 0
Figure from a research paper showing custom-printed barn-owl molds fitted to a human research participant. Don't worry if you can't see them -- they are not nearly as cool as the phrase 'barn-owl ears' would have you expecting.

Scientists created prosthetic barn-owl ears and got human participants to wear them for 3-59 days.

Yes, people walked around with barn-owl ears for up to 2 months *for science*.

This is the type of news I need today to restore a bit of my faith in humanity.

www.biorxiv.org/content/10.6...

12.02.2026 18:41 - 👍 2    🔁 0    💬 1    📌 0

Because traditional NHST doesn't allow the skeptic to win, we've maligned the need to do so: uninteresting, just mistakes, can't prove a negative, etc. But this is just rationalizing our way into accepting the limitations of the tool. Fruitful science must be able to rule in and rule out effects!

12.02.2026 18:33 - 👍 0    🔁 0    💬 0    📌 0

"Near 0" findings have been just as essential for science as "not 0" findings. Pasteur, for example, showed that the quantity of bacteria formed under sterile conditions is near 0, disproving the theory of spontaneous generation.

12.02.2026 18:33 - 👍 1    🔁 0    💬 1    📌 0

So, just tests with results that we like?

I think you are confusing "effect is near 0" with uninteresting. Not true! Near 0 is interesting (and vital to publish) when a leading theory predicted an effect. It's also interesting (and vital) for producing parsimonious models.

12.02.2026 18:33 - 👍 1    🔁 0    💬 1    📌 0

Personally, I think we need all valid studies published. What we need (I think) is to think more critically about quality/validity. It is not about the p value obtained or even CI width; it is evaluated from construct validity, positive and negative controls, design features, overlapping evidence, etc.

12.02.2026 14:44 - 👍 0    🔁 0    💬 1    📌 0

Can you clarify this a bit? Meaning, you don’t think wide CIs (unclear) should be published but narrow ones (clear) should be? Or something else?

12.02.2026 14:40 - 👍 0    🔁 0    💬 1    📌 0

A valid testing regime has the possible outcomes: yes, no, unclear. It makes no sense for a scientific literature to have only “yes” results. We support this regime with jaundiced ideas about null results that on reflection are poor epistemology.

12.02.2026 13:32 - 👍 0    🔁 0    💬 1    📌 0
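
The three-outcome regime described in this post can be sketched in code. This is only an illustration under stated assumptions: the function name, the `negligible` threshold (a hypothetical smallest effect size of interest), and the decision rule of comparing a confidence interval with a region of negligible effects are not taken from the post.

```python
def classify_effect(ci_low, ci_high, negligible=0.1):
    """Classify a confidence interval as 'yes', 'no', or 'unclear'.

    Assumption: effects within [-negligible, +negligible] count as
    practically zero. The default of 0.1 is purely illustrative.
    """
    if ci_low > ci_high:
        raise ValueError("ci_low must not exceed ci_high")
    if -negligible <= ci_low and ci_high <= negligible:
        return "no"       # whole interval lies inside the negligible region
    if ci_low > negligible or ci_high < -negligible:
        return "yes"      # whole interval lies outside the negligible region
    return "unclear"      # interval straddles a boundary of the region


# Made-up intervals, one per outcome.
print(classify_effect(0.35, 0.80))
print(classify_effect(-0.05, 0.06))
print(classify_effect(-0.20, 0.45))
```

Under this rule, a wide interval tends to land in "unclear" whatever its point estimate, which is one way to separate "no effect" from "not enough information".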

I really don’t understand the jaundiced view of validly estimating no effect (“null results”). Don’t detectives work by ruling out suspects? Isn’t the greater part of learning weakening unhelpful associations? Don’t we accept Popper’s view that science advances through falsification?

12.02.2026 13:26 - 👍 0    🔁 0    💬 1    📌 0

You run a study because you believe z is an important factor; your results estimate it is negligible... that is new and important knowledge. Theory A and Theory B make contrasting predictions (large effect, no effect); data estimating a negligible difference helps arbitrate between these.

12.02.2026 13:20 - 👍 0    🔁 0    💬 1    📌 0

Judgement of quality/validity should not be based on whether we like the results. Crap methods/ideas can diminish effects, leading to p > .05, or inflate effects, producing p < .05. Many studies do belong in the file drawer, but the p value generated is not a valid guide.

12.02.2026 13:14 - 👍 0    🔁 0    💬 0    📌 0

Sterling, 1959 updated. We still aren’t sharing science in a scientific manner. With 65 years of knowing the problem, how are we still struggling to address it?

11.02.2026 17:45 - 👍 3    🔁 0    💬 0    📌 0

I haven’t! Seems like a cool idea, though likely a good bit harder than p, df, and test stat correspondence... but would be cool!

29.01.2026 15:34 - 👍 1    🔁 1    💬 0    📌 0
GitHub - ACCLAB/dabestr: Data Analysis with Bootstrap Estimation in R

For R: github.com/ACCLAB/dabestr

For Python: github.com/ACCLAB/DABES...

29.01.2026 15:08 - 👍 0    🔁 0    💬 0    📌 0
Getting over ANOVA: Estimation graphics for multi-group comparisons. Data analysis in experimental science mainly relies on null-hypothesis significance testing, despite its well-known limitations. A powerful alternative is estimation statistics, which focuses on effect-size quantification. However, current estimation tools struggle with the complex, multi-group comparisons common in biological research. Here we introduce DABEST 2.0, an estimation framework for complex experimental designs, including shared-control, repeated-measures, two-way factorial experiments, and meta-analysis of replicates.

So excited to see DaBest 2.0 is out: get bootstrapped estimation statistics for simple through complex designs, all with beautiful visualization, available in R and Python.

Check it out!

Pre-print describing new features for complex designs: www.biorxiv.org/content/10.6...

#stats

29.01.2026 15:08 - 👍 4    🔁 1    💬 2    📌 0
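
For readers new to bootstrapped estimation, the core idea can be sketched with the Python standard library alone. This is a simple percentile bootstrap for a two-group mean difference. It does not use the DABEST API and may differ from the interval method DABEST itself computes. The function name, parameters, and data are all hypothetical.

```python
import random
from statistics import mean


def bootstrap_mean_diff(control, treatment, n_boot=5000, level=0.95, seed=1):
    """Return (mean difference, CI low, CI high) via a percentile bootstrap."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping group sizes fixed.
        c = rng.choices(control, k=len(control))
        t = rng.choices(treatment, k=len(treatment))
        diffs.append(mean(t) - mean(c))
    diffs.sort()
    alpha = 1 - level
    lo_idx = max(0, round(alpha / 2 * n_boot))
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return mean(treatment) - mean(control), diffs[lo_idx], diffs[hi_idx]


# Made-up data for illustration only.
control = [4.1, 5.0, 4.7, 5.3, 4.9, 5.6, 4.4, 5.1]
treatment = [6.2, 6.9, 7.4, 6.6, 7.1, 6.8, 7.7, 6.4]
diff, ci_low, ci_high = bootstrap_mean_diff(control, treatment)
print(f"mean difference = {diff:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

The output is an effect size with an interval rather than a p value, which is the kind of result an estimation plot displays.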
The Challenges and Solutions of In-Person Big-Team Science. Big-Team Science (BTS) offers a powerful framework for advancing psychological research through large-scale collaboration, yet the unique challenges of conducting in-person BTS studies remain under-e....

Looking forward to reading this retrospective on the promise and perils of organizing an RRR with in-person data collection.

From: @aggieerin.bsky.social , @psforscher.bsky.social , and others.

#stats

doi.org/10.1111/spc3...

26.01.2026 01:42 - 👍 3    🔁 0    💬 0    📌 0
RegCheck: RegCheck is an AI tool to compare preregistrations with papers instantly.

Comparing registrations to published papers is essential to research integrity - and almost no one does it routinely because it's slow, messy, and time-demanding.

RegCheck was built to help make this process easier.

Today, we launch RegCheck V2.

🧡

regcheck.app

22.01.2026 11:05 - 👍 174    🔁 90    💬 8    📌 6
esci: Estimation Statistics with Confidence Intervals. A collection of functions and 'jamovi' module for the estimation approach to inferential statistics, the approach which emphasizes effect sizes, interval estimates, and meta-analysis. Nearly all funct...

esci 1.0.9 is now on CRAN.

No big changes, just a couple of bug fixes and some compatibility changes for statpsych 1.9. Still a great, easy way to get effect sizes and fair tests for many common designs.

cran.r-project.org/web/packages...

#stats

16.01.2026 23:58 - 👍 5    🔁 2    💬 0    📌 0

thenewstatistics.com/itns/2026/01...
A statistics textbook for the AI era

16.01.2026 22:44 - 👍 1    🔁 0    💬 0    📌 0

thenewstatistics.com/itns/2026/01...

16.01.2026 22:40 - 👍 2    🔁 0    💬 0    📌 1

The Faculty for Undergraduate Teaching Workshop is back this summer, July 16-19, right outside of Chicago. A great conference with great people. Registration and abstract submission are open... it is going to be great!

#neuroscience

www.funfaculty.org/conference_2...

15.01.2026 21:50 - 👍 1    🔁 0    💬 0    📌 0

Proud to have played a small part in this as a data-collection site. And so proud of @luis-a-gomez.bsky.social , who organized this project at our university as an undergrad and has launched to great success in a clinical psych PhD program at Purdue.

14.01.2026 14:31 - 👍 1    🔁 0    💬 0    📌 0
The stereotype-threat effect among women was virtually null and not significant (N = 1275, g = 0.04, SE = 0.08, 95% CI [-0.11; 0.20]), and considerably smaller than the original study (N = 45, d = -0.82; 95% CI [-1.45; -0.18]).

Another threat to stereotype threat?

An RRR (in press, AMPPS) shows the ~0 impact of stereotype threat on female math performance.

Fantastic, arduous work from leads Andrea Stoevenbelt, Paulette C. Flore, and Jelte Wicherts (and DU alum @luis-a-gomez.bsky.social) #stats

osf.io/preprints/ps...

14.01.2026 14:31 - 👍 2    🔁 0    💬 1    📌 0
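
As a rough check on the quoted replication result, the interval can be approximated from the reported point estimate and standard error. This sketch assumes a normal-approximation 95% interval (z = 1.96). The authors' actual method is not stated here and the inputs are rounded, so the result should be close to the reported interval without necessarily matching it exactly. The function name is hypothetical.

```python
def normal_ci(estimate, se, z=1.96):
    """Normal-approximation confidence interval: estimate +/- z * se."""
    half_width = z * se
    return estimate - half_width, estimate + half_width


# Values quoted from the RRR figure: g = 0.04, SE = 0.08.
low, high = normal_ci(0.04, 0.08)
print(f"approximate 95% CI: [{low:.3f}, {high:.3f}]")
```

The whole interval sits near 0, which is what licenses reading the result as an estimate of a negligible effect rather than as merely "not significant".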

And, of course, each barrier raised to junk will have some false positives or harms, so the more content-based protections added, the marginally worse off the good-faith scientists.

Maybe we need some type of reputational scoring system for individuals, institutions, and journals.

14.01.2026 14:13 - 👍 0    🔁 0    💬 0    📌 0
Journals and publishers crack down on research from open health data sets. PLOS, Frontiers, and others announce policies trying to stem the tide of suspect research

Of course, journals are trying: www.science.org/content/arti...

But this is whack-a-mole - this policy will bar one specific type of junk, and only after hundreds of junk papers have been indexed into the literature. I don't think journals are equipped to respond quickly to this kind of threat.

14.01.2026 14:13 - 👍 0    🔁 0    💬 1    📌 0
AI-Generated Scientific Papers: Crisis? What Crisis? Picture a man in a deckchair, umbrella overhead, relaxing with a drink in hand - while surrounded by industrial wasteland and decay. This was the iconic 1975 album cover for Supertramp's Crisis? What Cr...

Some frank talk about paper mills from @sfnjournals.bsky.social .
(maybe @cbmetznancy.bsky.social ).

Really hard to see how the integrity of scientific publishing will hold up against the incentives and the tools to cheat.

#neuroscience

www.eneuro.org/content/13/1...

14.01.2026 14:13 - 👍 1    🔁 0    💬 1    📌 0

Too many significance tests!!

Made this little graphic for my #stats class, showing the various kinds of (N)HST and how interpreting confidence intervals can replace all of them.

Made with #rstats #ggplot (duh)

12.01.2026 20:54 - 👍 106    🔁 28    💬 6    📌 3
Most Neuroscience Data Is Not Normally Distributed: Analyzing Your Data in a Non-normal World. While the most common statistical tests assume that the error of the dependent variable follows a normal distribution, dependent variables in translational neuroscience studies often fail to meet this...

Most neuroscience data is *not* normally distributed... so what should you do about it?

Here's a great tutorial from @mikemalekahmadi.bsky.social focused on regression techniques for non-normal data: www.eneuro.org/content/13/1...

#stats
#neuroscience

09.01.2026 16:02 - 👍 12    🔁 0    💬 0    📌 0
The use and interpretation of unstandardized and standardized effect sizes in psychology: current practices and challenges - Humanities and Social Sciences Communications

Another paper I want to read, this one on the interpretation of effect sizes, with data on actual use in psychology.

Seems to be another contribution to the sentiment that we need to get over Cohen's d -- raw score effect sizes are usually best!

doi.org/10.1057/s415...

30.12.2025 23:11 - 👍 2    🔁 1    💬 0    📌 0