Frank Harrell

@f2harrell.bsky.social

Professor of Biostatistics Vanderbilt University School of Medicine Expert Biostatistics Advisor FDA Center for Drug Evaluation and Research https://hbiostat.org https://fharrell.com

7,684 Followers  |  137 Following  |  1,192 Posts  |  Joined: 12.10.2023

Posts by Frank Harrell (@f2harrell.bsky.social)

Researcher ‘honestly shocked’ to discover name on paper, editor claims misunderstanding While reviewing her Google Scholar profile to prepare a list of her publications, psychologist Maryam Farhang came across a paper she didn’t recognize.  The article, in the Journal of Research…

While reviewing her Google Scholar profile to prepare a list of her publications, psychologist Maryam Farhang came across a paper she didn’t recognize.

The article included her name and affiliation, but she hadn’t written or contributed to the paper in any way.

27.02.2026 21:13 — 👍 52    🔁 14    💬 0    📌 1

The next time they tell you there’s no money for healthcare, remember there was money to start a war with Iran.

The next time they tell you there’s no money for housing or social supports, remember there was money to bomb a girls’ elementary school.

There’s always money for war.

28.02.2026 23:52 — 👍 462    🔁 166    💬 10    📌 5

In many observational studies it is a push to even call the sample a 'cohort'. For example in electronic health record-based studies we seldom know what makes a patient enter our health system, and anything about the patient that occurred while in a previous system is unknown.

01.03.2026 13:21 — 👍 0    🔁 0    💬 0    📌 0
7  Nonparametric Statistical Tests – Biostatistics for Biomedical Research

There might be something gained by framing this as a rank difference test, e.g., stacking the data and using an ordinal model with random intercepts for persons as exemplified in hbiostat.org/bbr/nonpar#s...

01.03.2026 13:19 — 👍 2    🔁 0    💬 1    📌 0

If by name you are referring to TTE I strongly disagree. Emulation means to copy, and blinding, randomization, simple time zero, and prospective follow-up are not being copied.

01.03.2026 13:15 — 👍 3    🔁 0    💬 1    📌 0

How does that help control for unmeasured confounders? How do most observational data collections even think about confounders?

28.02.2026 21:28 — 👍 3    🔁 0    💬 1    📌 0

What's new is valuable but has nothing to do with accounting for confounding.

28.02.2026 21:27 — 👍 2    🔁 0    💬 1    📌 0
Practical elements to consider when emulating a target trial When carefully designed and analyzed, target trial emulation, i.e., the analysis of observational data through the explicit emulation of the design components of randomized controlled trials (RCT), ca...

How is it that a paper that claims to show how to do target trial emulation does not address confounding by indication and its ramifications for data collection? www.jclinepi.com/article/S089... #EpiSky #StatsSky

28.02.2026 16:17 — 👍 18    🔁 6    💬 4    📌 1
13  Manipulation of Longitudinal Data – R Workflow

Much better, since ultra-fast data.table has a concise algebra for almost everything you need, with no package dependencies: Use LLM as a helper in constructing data.table commands as in this advanced example: hbiostat.org/rflow/long#s... #RStats

28.02.2026 13:59 — 👍 6    🔁 0    💬 1    📌 0

Let me just ask for an example like the one I posed, where the new method is used to draw inference about the unknown treatment effect.

27.02.2026 18:53 — 👍 0    🔁 0    💬 0    📌 0

Just took a quick look. Hard to see what the actual inference is, and how it would apply to longitudinal ANCOVA with a treatment x time interaction in the model + baseline covariates.

27.02.2026 17:35 — 👍 1    🔁 0    💬 1    📌 0

This sounds too good. Wouldn't we Americans miss our traffic jams, rampant consumerism and resulting trash and recycling challenges, power grids being hurt by AI data farms, and all the interesting compounds we have in our food, water, and air?

27.02.2026 17:26 — 👍 2    🔁 0    💬 0    📌 0

Isn't that about interpreting the process that generates confidence intervals, but doesn't help with interpreting the single computed CI in front of you? #StatsSky #Statistics

27.02.2026 17:19 — 👍 5    🔁 0    💬 1    📌 0
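
A minimal simulation sketch of the distinction raised in the post above: the 95% figure describes the long-run behavior of the interval-generating process, while any single computed interval either contains the true value or does not. The function names, the known-variance normal model, and the settings (n = 25, 10,000 replications) are illustrative assumptions, not taken from the post.

```python
import random
import statistics


def ci_covers(true_mean, sigma, n, rng, z=1.96):
    """Draw one sample and report whether its z-interval contains true_mean."""
    sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
    mean = statistics.fmean(sample)
    half_width = z * sigma / n ** 0.5  # sigma is assumed known
    return mean - half_width <= true_mean <= mean + half_width


def coverage(reps=10_000, seed=1):
    """Fraction of repeated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = sum(ci_covers(0.0, 1.0, 25, rng) for _ in range(reps))
    return hits / reps


if __name__ == "__main__":
    # Each call to ci_covers returns only True or False for that one interval;
    # the coverage rate appears only when the process is repeated many times.
    print(coverage())
```

The simulation can compute coverage only because the true mean is known to it; with real data the analyst holds one interval and no such knowledge, which is the interpretive gap the post points to.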

They cannot reliably estimate sample avg Tx effects (single differences) since they don't address confounding by indication, so why would they allow estimation of differential treatment effects (double diffs)? Contrast with Bayesian models admitting uncertainty: fharrell.com/post/hxcontrol

26.02.2026 17:51 — 👍 2    🔁 0    💬 0    📌 0

I am completely convinced that it is detrimental to teach tidyverse to students before they learn base R and perhaps data.table. I have looked at students' homeworks when they haven't followed that advice. Phew! #RStats

26.02.2026 17:20 — 👍 4    🔁 0    💬 1    📌 0

US men's Olympic hockey team: cowards when it counts

26.02.2026 14:07 — 👍 7    🔁 0    💬 1    📌 0
My effort to reproduce this paper began as part of the Institute for Replication’s ongoing project to systematically examine the reproducibility and robustness of papers in Nature Human Behaviour3; my participation in this endeavour was approved by the Ethical Review Board of Vrije Universiteit Amsterdam’s School of Business and Economics. Inspecting the paper’s first two figures revealed a mathematical impossibility. There are nine EU countries that experienced zero terror attacks during the study’s time frame. However, the paper reports that the inverse hyperbolic sine of these countries’ per capita attack rates are positive, and increase or decrease over time. This is impossible; the inverse hyperbolic sine of zero is zero4. The main outcome variable displayed in the paper’s second figure is hard-coded in the replication data as ‘DVSin’. Figure 1’s top row of plots shows that DVSin is negatively correlated with both terrorist attack rates (r = −0.107, two-sided P = 0.024) and their inverse hyperbolic sine (r = −0.108, two-sided P = 0.022). These plots also show that in the 305/420 country-year observations after 2006 experiencing zero terror attacks (72.6%), DVSin takes on 292 different positive values. This implies that the paper’s main outcome variable cannot possibly be constructed as described in the paper.

"the paper’s main outcome variable cannot possibly be constructed as described in the paper."

Retraction of 2023 paper that did not use the reported variables. The replication report is astonishing www.nature.com/articles/s41...

Pre-publication peer review remains undefeated in laundering bullshit

26.02.2026 07:47 — 👍 50    🔁 14    💬 3    📌 4
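
The impossibility quoted from the replication report in the post above can be checked numerically. This is a small sketch using the standard identity asinh(x) = ln(x + sqrt(x^2 + 1)); the list of zero attack rates is made-up illustrative input, not the paper's replication data.

```python
import math


def asinh_manual(x):
    """Inverse hyperbolic sine via its logarithmic form."""
    return math.log(x + math.sqrt(x * x + 1.0))


# Hypothetical per capita attack rates for country-years with zero attacks.
zero_rates = [0.0, 0.0, 0.0, 0.0]
transformed = [math.asinh(r) for r in zero_rates]

if __name__ == "__main__":
    print(asinh_manual(0.0), math.asinh(0.0))
    # All zero inputs map to one value (zero), so the transform cannot
    # produce many different positive values from zero attack rates.
    print(set(transformed))
```

Because the transform is a fixed function of its input, identical inputs of zero can yield only one output value, which is why hundreds of distinct positive values for zero-attack observations cannot arise from it.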

@thelancet.com continues to fail in its role as a gatekeeper of good science, not only by a 2 week rule but by limiting letters to 250 words. How do you expose a severe but very technical flaw in a study's design or analysis in 250 words? #StatsSky

26.02.2026 13:54 — 👍 12    🔁 4    💬 2    📌 1

Brilliantly written with one small caveat: Towards the end the article implies that observational data may be useful for studying heterogeneity in treatment effects. Don't know why that would be the case. But great work Darren! #StatsSky #EpiSky

26.02.2026 13:48 — 👍 13    🔁 2    💬 1    📌 0

@thelancet.com continues to make a mockery of peer review at times.

26.02.2026 13:38 — 👍 16    🔁 7    💬 1    📌 0

I tried to tell y'all.

25.02.2026 10:52 — 👍 72    🔁 17    💬 5    📌 2

To clarify, you have to show excellent Bayesian or frequentist power at the max affordable N. But that N in no way should be considered "the" sample size.

24.02.2026 23:08 — 👍 3    🔁 0    💬 0    📌 0

hbiostat.org/bayes/bet/de...

24.02.2026 23:06 — 👍 0    🔁 0    💬 0    📌 0

Sorry - should be hbiostat.org/bayes/bet/de...

24.02.2026 14:33 — 👍 1    🔁 0    💬 0    📌 0

An alternate way to read this is to say that sample size calculations should be abandoned in favor of sequential analysis once the maximum affordable N is determined. hbiostat.org/bayes/beta/d...

24.02.2026 13:42 — 👍 3    🔁 0    💬 3    📌 0
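
One way to picture sequential analysis up to a maximum affordable N, as described in the post above, is the sketch below. It is an illustration under stated assumptions, not the design from the linked material: outcomes are normal with known unit variance, the prior on the treatment effect is normal with mean 0, looks occur every 5 observations, and the trial stops for efficacy when the posterior probability of a positive effect reaches 0.95. All names and settings are hypothetical.

```python
import random
from statistics import NormalDist


def sequential_trial(true_effect, max_n, look_every=5, threshold=0.95,
                     prior_sd=1.0, seed=0):
    """Simulate one trial; return (n at stopping, posterior P(effect > 0))."""
    rng = random.Random(seed)
    total = 0.0
    for n in range(1, max_n + 1):
        total += rng.gauss(true_effect, 1.0)  # one new observation
        if n % look_every == 0 or n == max_n:
            # Normal-normal conjugate update with data variance 1, prior mean 0.
            post_precision = n + 1.0 / prior_sd ** 2
            post_mean = total / post_precision
            post_sd = post_precision ** -0.5
            p_benefit = 1.0 - NormalDist(post_mean, post_sd).cdf(0.0)
            if p_benefit >= threshold:
                return n, p_benefit  # stop early for efficacy
    return max_n, p_benefit  # reached the maximum affordable N


if __name__ == "__main__":
    print(sequential_trial(true_effect=1.0, max_n=200))
```

In this sketch `max_n` is only a ceiling: the sample size actually used is an outcome of the accumulating data rather than a number fixed in advance. Only efficacy stopping is shown; other stopping rules are omitted.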
The Republican Party Has a Nazi Problem How did the GOP become a haven for slogans and ideas straight out of the Third Reich?

Terrific piece by former Republican, Tom Nichols
"The Republicans have a Nazi problem, yes. But this means that the United States also has a Nazi problem. . . ."
www.theatlantic.com/magazine/202...

24.02.2026 10:22 — 👍 5    🔁 3    💬 0    📌 0
Justice Department withheld and removed some Epstein files related to Trump An NPR investigation finds the public database of Epstein files is missing dozens of pages related to sexual abuse accusations against President Trump.

NPR - DoJ withheld some Epstein files related to allegations that President Trump sexually abused a minor. www.npr.org/2026/02/24/nx-s1-...

24.02.2026 12:03 — 👍 36    🔁 23    💬 0    📌 3

This reminds me of an article I recently read about how liberal arts colleges in the US that do not have graduate students are giving their students better education by relying on nothing but real professors to teach. Not being "research universities" is a plus. Contrast with rampant use of TAs.

24.02.2026 13:33 — 👍 2    🔁 0    💬 1    📌 0

Goal-Driven Flexible Bayesian Design presentation updated w/comparison of performance of frequentist group sequential designs: hbiostat.org/bayes/design . Frequentist approach takes far too long to make a decision by controlling something that is NOT an error prob. #Statistics #StatsSky

23.02.2026 13:53 — 👍 7    🔁 0    💬 0    📌 0
Tips for Biostatisticians Collaborating with Non-Biostatistician Medical Researchers – Statistical Thinking In this talk I contrast consultation with collaboration and discuss various ways to make collaborations most effective. Some key components of effective collaboration are mutual respect, proper divisi...

It is vitally important that human statisticians stand up to collaborators who don't understand statistical principles or who do bad science (e.g., hope to learn a lot when p >> N). But those are precisely the skills NOT being taught in #Statistics programs. www.fharrell.com/talk/collab #StatsSky

23.02.2026 12:46 — 👍 16    🔁 3    💬 1    📌 0