Donald Szlosek

@dszlosek.bsky.social

Biostatistician @IDEXX formerly at harvardmed, @BIDMChealth, @nasa. Big data, clinical trials, and medical diagnostics. Mainer. Opinions are my own. he/him

1,055 Followers  |  4,741 Following  |  246 Posts  |  Joined: 13.11.2024  |  3.339

Latest posts by dszlosek.bsky.social on Bluesky

This whole interaction is exactly why I am on this platform: experts discussing and debating, sharing resources. More of this please.

27.11.2025 16:18 - 👍 1    🔁 0    💬 0    📌 0

Ghost time bias

27.11.2025 05:18 - 👍 0    🔁 0    💬 0    📌 0

Woah talk about a blast from the past. I've totally forgotten about this show. High quality animation for TV at the time too

27.11.2025 05:15 - 👍 1    🔁 0    💬 0    📌 0

One thing I have sticky-noted on my desktop monitor is "know the difference between your dream job and your dream title" and I think you nailed it - you can publish, write blog posts, and present at conferences, all without being a doctor.

26.11.2025 17:59 - 👍 3    🔁 0    💬 0    📌 0

Because @laurewynants.bsky.social and @benvancalster.bsky.social do not get on here much #DataScience #MachineLearning #MissingData #Imputation #Rstats #Python #AI journals.plos.org/plosone/arti...

23.11.2025 14:08 - 👍 6    🔁 1    💬 0    📌 0

Oh Kutner 😭

23.11.2025 11:38 - 👍 3    🔁 0    💬 0    📌 0

When grades stop meaning anything: The UC San Diego math scandal is a warning

One of the most upsetting articles I've read in a long time www.theargumentmag.com/p/when-grade...

UCSD report senate.ucsd.edu/media/740347...

We are failing a generation of kids.

19.11.2025 02:05 - 👍 226    🔁 83    💬 25    📌 26

@f2harrell.bsky.social you were right. Coverage for TMLE xgboost grid search size 5 appeared better than 20 but still the coverage is disappointing. I’ve updated this examination for my learning. Thanks again for the guidance. www.kenkoonwong.com/blog/tmle/

22.11.2025 23:52 - 👍 3    🔁 1    💬 1    📌 0

This is excellent!

23.11.2025 02:00 - 👍 1    🔁 0    💬 0    📌 0

😀

22.11.2025 01:46 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I 100% agree on the different hypothesis; I almost wrote a second reply to myself mentioning this (unless very strict assumptions apply). As for confidence intervals, check out the Hodges-Lehmann estimator.

21.11.2025 15:07 - 👍 1    🔁 0    💬 1    📌 0
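
For readers unfamiliar with the Hodges-Lehmann estimator mentioned in the post above: in the two-sample setting it is the median of all pairwise differences between the groups, and it is the location-shift estimate usually paired with the Wilcoxon rank-sum test. A minimal sketch follows. The function name and toy data are illustrative assumptions, not something from the thread, and the confidence interval (which is built from order statistics of the same pairwise differences) is not implemented here.

```python
from statistics import median

def hodges_lehmann_shift(x, y):
    """Two-sample Hodges-Lehmann estimate: median of all differences x_i - y_j."""
    diffs = [xi - yj for xi in x for yj in y]
    return median(diffs)

# Toy data (hypothetical), for illustration only.
treated = [5, 7, 9]
control = [1, 2, 3]
print(hodges_lehmann_shift(treated, control))  # prints 5
```

Unlike a difference in means, this estimate is barely moved by a single extreme observation, which is one reason it is attractive alongside rank-based tests.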
When to use the Wilcoxon rank-sum test instead of the unpaired t-test? This is a followup question to what Frank Harrell wrote here: In my experience the required sample size for the t distribution to be accurate is often larger than the sample size at hand. The

stats.stackexchange.com/questions/19...

21.11.2025 13:22 - 👍 0    🔁 0    💬 0    📌 0

I was thinking about the efficiency of nonparametric tests and remembered this SO post about the Wilcoxon test being 96% as efficient as the t-test, even in small samples. For those who have a copy, Lehmann & Romano did a thorough job in <5 pages detailing the situation. #statssky #statistics #rstats #AcademicSky

21.11.2025 13:22 - 👍 7    🔁 2    💬 3    📌 0
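
The efficiency figure in the post above can be checked numerically. For a location shift, the asymptotic (Pitman) relative efficiency of the Wilcoxon rank-sum test versus the t-test is 12σ²(∫f(x)²dx)², which equals 3/π ≈ 0.955 for normal data. The sketch below evaluates that formula with a plain trapezoidal rule. The function names, integration limits, and step count are assumptions chosen for illustration, and the result is an asymptotic quantity, so it says nothing by itself about small samples.

```python
import math

def wilcoxon_vs_t_are(pdf, variance, lo, hi, steps=200_000):
    """Pitman ARE of the Wilcoxon rank-sum test relative to the t-test for a
    location shift: 12 * variance * (integral of pdf(x)^2 dx)^2.
    The integral is approximated with the trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (pdf(lo) ** 2 + pdf(hi) ** 2)
    for i in range(1, steps):
        total += pdf(lo + i * h) ** 2
    integral = total * h
    return 12.0 * variance * integral ** 2

def normal_pdf(x):
    # Standard normal density (variance 1).
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def laplace_pdf(x):
    # Laplace density with scale 1 (variance 2), a heavier-tailed comparison.
    return 0.5 * math.exp(-abs(x))

are_normal = wilcoxon_vs_t_are(normal_pdf, 1.0, -12.0, 12.0)
are_laplace = wilcoxon_vs_t_are(laplace_pdf, 2.0, -40.0, 40.0)
print(round(are_normal, 4), round(3.0 / math.pi, 4))  # both 0.9549
print(round(are_laplace, 3))  # 1.5: Wilcoxon beats the t-test here
```

The heavier-tailed Laplace case illustrates the usual argument: the rank test gives up about 5% efficiency under normality and can gain considerably elsewhere.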
A table showing profit margins of major publishers. A snippet of text related to this table is below.

1. The four-fold drain
1.1 Money
Currently, academic publishing is dominated by profit-oriented, multinational companies for
whom scientific knowledge is a commodity to be sold back to the academic community who
created it. The dominant four are Elsevier, Springer Nature, Wiley and Taylor & Francis,
which collectively generated over US$7.1 billion in revenue from journal publishing in 2024
alone, and over US$12 billion in profits between 2019 and 2024 (Table 1A). Their profit
margins have always been over 30% in the last five years, and for the largest publisher
(Elsevier) always over 37%.
Against many comparators, across many sectors, scientific publishing is one of the most
consistently profitable industries (Table S1). These financial arrangements make a substantial
difference to science budgets. In 2024, 46% of Elsevier revenues and 53% of Taylor &
Francis revenues were generated in North America, meaning that North American
researchers were charged over US$2.27 billion by just two for-profit publishers. The
Canadian research councils and the US National Science Foundation were allocated US$9.3
billion in that year.

A figure detailing the drain on researcher time.

1. The four-fold drain

1.2 Time
The number of papers published each year is growing faster than the scientific workforce,
with the number of papers per researcher almost doubling between 1996 and 2022 (Figure
1A). This reflects the fact that publishers’ commercial desire to publish (sell) more material
has aligned well with the competitive prestige culture in which publications help secure jobs,
grants, promotions, and awards. To the extent that this growth is driven by a pressure for
profit, rather than scholarly imperatives, it distorts the way researchers spend their time.
The publishing system depends on unpaid reviewer labour, estimated to be over 130 million
unpaid hours annually in 2020 alone (9). Researchers have complained about the demands of
peer-review for decades, but the scale of the problem is now worse, with editors reporting
widespread difficulties recruiting reviewers. The growth in publications involves not only the
authors’ time, but that of academic editors and reviewers who are dealing with so many
review demands.
Even more seriously, the imperative to produce ever more articles reshapes the nature of
scientific inquiry. Evidence across multiple fields shows that more papers result in
‘ossification’, not new ideas (10). It may seem paradoxical that more papers can slow
progress until one considers how it affects researchers’ time. While rewards remain tied to
volume, prestige, and impact of publications, researchers will be nudged away from riskier,
local, interdisciplinary, and long-term work. The result is a treadmill of constant activity with
limited progress whereas core scholarly practices – such as reading, reflecting and engaging
with others’ contributions – is de-prioritized. What looks like productivity often masks
intellectual exhaustion built on a demoralizing, narrowing scientific vision.

A table of profit margins across industries. The section of text related to this table is below:

1. The four-fold drain
1.1 Money
Currently, academic publishing is dominated by profit-oriented, multinational companies for
whom scientific knowledge is a commodity to be sold back to the academic community who
created it. The dominant four are Elsevier, Springer Nature, Wiley and Taylor & Francis,
which collectively generated over US$7.1 billion in revenue from journal publishing in 2024
alone, and over US$12 billion in profits between 2019 and 2024 (Table 1A). Their profit
margins have always been over 30% in the last five years, and for the largest publisher
(Elsevier) always over 37%.
Against many comparators, across many sectors, scientific publishing is one of the most
consistently profitable industries (Table S1). These financial arrangements make a substantial
difference to science budgets. In 2024, 46% of Elsevier revenues and 53% of Taylor &
Francis revenues were generated in North America, meaning that North American
researchers were charged over US$2.27 billion by just two for-profit publishers. The
Canadian research councils and the US National Science Foundation were allocated US$9.3
billion in that year.

The costs of inaction are plain: wasted public funds, lost researcher time, compromised
scientific integrity and eroded public trust. Today, the system rewards commercial publishers
first, and science second. Without bold action from the funders we risk continuing to pour
resources into a system that prioritizes profit over the advancement of scientific knowledge.

We wrote the Strain on scientific publishing to highlight the problems of time & trust. With a fantastic group of co-authors, we present The Drain of Scientific Publishing:

a 🧡 1/n

Drain: arxiv.org/abs/2511.04820
Strain: direct.mit.edu/qss/article/...
Oligopoly: direct.mit.edu/qss/article/...

11.11.2025 11:52 - 👍 599    🔁 428    💬 8    📌 60

I have been a reviewer on 5 papers this year for AI/ML validation, and each time they used R or R-squared. I had to request a Brier score or Emax; even calibration slope and intercept would be better. It would be nice if ICI were more popular too.

18.11.2025 18:02 - 👍 2    🔁 0    💬 0    📌 0
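
To make the metrics named in the post above concrete, here is a minimal sketch of two of them for a binary outcome: the Brier score, and the intercept and slope from a logistic recalibration model fitted on the logit of the predicted probabilities. The function names and toy data are assumptions for illustration. The intercept here comes from the joint two-parameter fit, whereas calibration-in-the-large is often estimated with the slope fixed at 1. Emax and ICI need a smoothed calibration curve and are not shown.

```python
import math

def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def logit(p, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
    return math.log(p / (1.0 - p))

def calibration_intercept_slope(y_true, p_pred, max_iter=100, tol=1e-10):
    """Fit logit(P(y = 1)) = a + b * logit(p_pred) by Newton-Raphson.
    Returns (a, b); a near 0 and b near 1 suggest good calibration."""
    x = [logit(p) for p in p_pred]
    a, b = 0.0, 1.0
    for _ in range(max_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, y_true):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g_a += yi - mu            # score for the intercept
            g_b += (yi - mu) * xi     # score for the slope
            h_aa += w                 # information matrix entries
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b

# Toy data (hypothetical): an overconfident model predicts 0.1 and 0.9,
# but the observed event rates in those groups are 0.2 and 0.8.
y = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2
p = [0.1] * 10 + [0.9] * 10
print(brier_score(y, p))
print(calibration_intercept_slope(y, p))  # slope below 1 flags overconfidence
```

A correlation or R-squared between predictions and outcomes would not reveal this kind of miscalibration, since it is unchanged by monotone rescaling problems that the slope picks up directly.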

Also if you can wait 12+ hours on moderate sample sizes

17.11.2025 20:04 - 👍 1    🔁 0    💬 1    📌 0
KHstats - An Illustrated Guide to TMLE, Part I: Introduction and Motivation

I'm sure you have seen this but Katherine Hoffman wrote an excellent blog post on TMLE: www.khstats.com/blog/tmle/tu...

17.11.2025 04:11 - 👍 4    🔁 0    💬 1    📌 0

Interesting read! In spirit it reminds me of the Vibration of Effects work by Patel (2015) and colleagues, although their work was solely focused on analytical choices.

16.11.2025 15:11 - 👍 1    🔁 0    💬 1    📌 0

My knee-jerk reaction is to almost always go with Clopper-Pearson (exact), mostly because the coverage is so good near 1 and 0 (or at least that is what I remember reading in a paper many years ago).

16.11.2025 14:48 - 👍 0    🔁 0    💬 0    📌 0
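
The coverage claim in the post above can be checked directly, because the exact coverage of a binomial interval at a given true proportion is a finite sum. The sketch below builds the Clopper-Pearson interval by inverting the binomial tail probabilities with bisection, using only the standard library, and then computes exact coverage. Function names, the sample size, and the tolerance are assumptions for illustration. The interval is conservative by construction (coverage is at least the nominal level), which is why it holds up near 0 and 1, at the cost of wider intervals.

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def _root_increasing(g, target, tol=1e-12):
    """Bisection on [0, 1] for g(p) = target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided interval for a binomial proportion."""
    # Lower limit solves P(X >= x | p) = alpha / 2 (0 when x == 0).
    lower = 0.0 if x == 0 else _root_increasing(
        lambda p: binom_sf(x, n, p), alpha / 2)
    # Upper limit solves P(X <= x | p) = alpha / 2, i.e. P(X >= x + 1 | p) = 1 - alpha / 2.
    upper = 1.0 if x == n else _root_increasing(
        lambda p: binom_sf(x + 1, n, p), 1 - alpha / 2)
    return lower, upper

def exact_coverage(p, n, alpha=0.05):
    """Probability that the interval contains the true p, summed over all outcomes."""
    cover = 0.0
    for x in range(n + 1):
        lo, hi = clopper_pearson(x, n, alpha)
        if lo <= p <= hi:
            cover += binom_pmf(x, n, p)
    return cover

for p_true in (0.01, 0.05, 0.5):
    print(p_true, round(exact_coverage(p_true, 20), 4))
```

Running the same coverage loop with another interval method swapped in is a quick way to compare methods near the boundaries.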
Survey Statistics: weights and MRP for voters | Statistical Modeling, Causal Inference, and Social Science

Survey Statistics: weights and MRP for voters
statmodeling.stat.columbia.edu/2025/11/11/s...

12.11.2025 22:05 - 👍 4    🔁 2    💬 0    📌 1

It’s fascinating how cohorts of statisticians have parallel pockets of hyper-specific knowledge due to textbooks, like the collective familiarity with blue fiddler crab mating patterns, courtesy of Agresti. Palmer penguins, iris data, etc. Any others I'm missing? #statssky #statistics #rstats

12.11.2025 17:09 - 👍 6    🔁 0    💬 3    📌 0

I was recently working with a distraught med student who told me one of his classmates had 75 (!!) publications!

12.11.2025 11:00 - 👍 4    🔁 0    💬 0    📌 0

I have wondered about this exact thought! Superb. Also love the Genstat output

11.11.2025 13:43 - 👍 2    🔁 0    💬 0    📌 0

my chest hurts reading this.

11.11.2025 13:20 - 👍 1    🔁 0    💬 1    📌 0
Final Budget Reduction Plan | Budget Process | Nebraska

Well, it's official. #UNL Chancellor Bennett submitted his final proposal and it includes eliminating the #statistics department. budgetprocess.unl.edu/final-budget...

Guess it's time to go look for jobs. Anyone looking for a couple of very talented #datavis researchers?

10.11.2025 22:09 - 👍 3    🔁 4    💬 2    📌 1

Quick thread on the BBC and the political and societal significance of recent developments:

One of the main reasons the UK has historically been so much less polarised than the US, is that Britain has a shared source of information, consumed and trusted by most people regardless of their politics.

10.11.2025 13:43 - 👍 1262    🔁 512    💬 40    📌 59

I would be very curious to hear @maartenvsmeden.bsky.social's thoughts on handling the large volume in prognostic model comparisons. Single database? Just a smaller time window? I think I remember some justification of this in a PROSPERO doc of yours.

08.11.2025 19:24 - 👍 0    🔁 0    💬 0    📌 0

One thing I have noticed a lot in meta-analyses (thinking of those involving comparison of prognostic models) is that most focus on Embase and Medline, with no mention of Compendex (think IEEE), WoS, or Scopus. Granted, the sheer volume for data extraction would be astronomical.

08.11.2025 19:21 - 👍 1    🔁 0    💬 1    📌 0

Does anyone else automatically reach for the PDF symbol when opening a new article on a journal site? It just seems like I can read them faster. #AcademicSky

08.11.2025 14:31 - 👍 5    🔁 0    💬 3    📌 0

Exactly my thought re: multi-state model and msm package I mentioned earlier.

08.11.2025 14:27 - 👍 1    🔁 0    💬 0    📌 0