pubmed.ncbi.nlm.nih.gov/29404569/
They start with two premises I don't support: (1) you can assess heterogeneity by eyeballing the trials and deciding whether they seem similar enough, and (2) if not, random effects is the solution. Random effects does not deal with the problem of heterogeneity!!!
I know, shocking, right?
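To spell out why: here is the textbook random-effects model (standard notation; my sketch, not from the original thread):

```latex
% Standard random-effects meta-analysis model
% \hat{\theta}_i : observed effect in trial i;  s_i^2 : its sampling variance
\hat{\theta}_i = \theta_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, s_i^2),
\qquad \theta_i \sim N(\mu, \tau^2)
```

The model assumes the true effects θ_i genuinely differ from trial to trial; it then reports μ, the mean of that distribution of different effects, with τ² simply widening the confidence interval. Heterogeneity is averaged over, not dealt with.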
If only AI / ML had been around when I was training, I wouldn’t have had to learn about things like causal inference, how to evaluate prediction models or even, say, the importance of data quality. What a waste of time all that was!
Any suggestions as to great resources (lectures on YouTube, short didactic papers, etc.) to teach novice researchers about RCTs? Design, endpoints, consent, eligibility criteria, IRB, etc. Any and all of it.
Is this a great argument for most statisticians to use Stata?
See pubmed.ncbi.nlm.nih.gov/40914655/. In brief, a random-effects meta-analysis requires that we estimate the random-effects variance, and we can't do that if we have only a few trials. Imagine you had measured a marker on 4 patients and someone asked you what the standard deviation was.
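A minimal simulation sketch of that analogy (hypothetical Python, showing how little an n=4 sample tells you about a variance):

```python
import numpy as np

rng = np.random.default_rng(42)
true_sd = 1.0

# Re-estimate the SD from samples of just 4 "patients", many times over
estimates = [np.std(rng.normal(0.0, true_sd, size=4), ddof=1)
             for _ in range(10_000)]

lo, hi = np.percentile(estimates, [2.5, 97.5])
print(f"true SD: {true_sd}; 95% of estimates fall in {lo:.2f} to {hi:.2f}")
# Roughly 0.27 to 1.77: with n=4 you learn almost nothing about the
# variance. The same problem afflicts the between-trial variance
# (tau^2) in a random-effects meta-analysis of 3 or 4 trials.
```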
Revman and the idea of standardizing meta-analysis was brilliant in the 1990s. That is no longer the case.
Yes, we mention that here: https://pubmed.ncbi.nlm.nih.gov/40914655/. The post was really directed at the 99% of meta-analyses that use standard software and don't even include a statistician.
Email me!
COULD EVERYONE PLEASE STOP USING RANDOM EFFECTS META-ANALYSIS WHEN COMBINING 3 OR 4 TRIALS? AND COULD REVIEWERS STOP DEMANDING IT?
Great take! calibration is critical for decision making (see Ben Van Calster on this point)
www.auajournals.org/doi/10.1097/... Pelvic lymph node dissection (PLND) during radical prostatectomy: controversy about RCTs, complication rates. Decision analysis puts numerical estimates on benefit, harm, and uncertainty. Expected utility of PLND was higher vs. no PLND across a broad range of scenarios.
We have shown several times that once you know PSA, PRS is non-predictive (I can send references if you like)
Yet again, PRS do not distinguish aggressive from indolent cancer. & the BARCODE RCT showed poor results compared to MRI etc. Yet the authors give a thumbs up to genomics in prostate cancer screening. When is the PRS fever going to break? www.nature.com/articles/s43...
Too many meta-analyses have findings equivalent to: “If you average the cost of a loaf of bread, car insurance for a year and a movie ticket, you get $752.36”
Agree! Here is another: the ratio of the number of p-values reported to the number of patients in the study. I have seen several cases where this is > 1.
You forgot the bit: "Descriptive statistics were calculated as frequency and percentage for binary variables and mean (SD) for continuous variables, unless these were not normally distributed, in which case median and quartiles were reported"
1) A covariate that is predictive of outcome should be in the model even if unpredictive of assignment (eg matched pairs design).
2) A covariate that is not predictive of outcome should not be in the model, even if predictive of assignment (a simulation sketch of points 1 and 2 follows below).
3) The propensity score is stupid.
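A minimal simulation sketch of points 1 and 2 (hypothetical Python; in a randomized trial the covariates are independent of assignment by design):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2000
treat = rng.integers(0, 2, n)    # randomized assignment
x_prog = rng.normal(size=n)      # prognostic covariate (predicts outcome)
x_noise = rng.normal(size=n)     # covariate unrelated to outcome
y = 0.3 * treat + 1.0 * x_prog + rng.normal(size=n)

def treat_se(covariates):
    """Standard error of the treatment effect with a given adjustment set."""
    X = sm.add_constant(np.column_stack([treat] + covariates))
    return sm.OLS(y, X).fit().bse[1]

print("unadjusted:        ", treat_se([]))
print("adjust prognostic: ", treat_se([x_prog]))   # smaller SE: more power
print("adjust noise:      ", treat_se([x_noise]))  # no gain
```

The prognostic covariate shrinks the standard error of the treatment effect even though it is (by randomization) unpredictive of assignment; the noise covariate buys nothing.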
When discussing PSA screening policy, we often contrast two options as opportunistic vs. population-based PSA screening. Would suggest a name change to disorganized vs. organized PSA screening.
Number of papers on PubMed using the term "real world data" in 2000: 6. Number in 2025: ~5000. Number of papers for which "real world data" would be a meaningful scientific term: 0.
@amit_sud
PRS-based prostate cancer screening has worse properties than contemporary approaches: "BARCODE1 biopsied more men, diagnosed more low-grade PCs & detected fewer high-grade PCs versus Göteborg-2 and ProScreen." authors.elsevier.com/sd/article/S...
Right. Under Biden I said "I'm sorry for being white" at least five times a day (e.g. at bagel store, when I got in a cab) and often the guy at the bagel store / cab driver would say "I'm sorry for being white too". And then when Trump came in, I didn't have to say that any more. Such a timesaver!
No point in calculating, say, a p-value unless you understand what it is. Or a mean vs. median unless you understand when you should report each. rkbookreviews.wordpress.com/2012/05/27/w...
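To illustrate the mean-vs-median point, a toy sketch with made-up, right-skewed data:

```python
import numpy as np

# Hypothetical hospital length of stay in days: most patients go home
# quickly, a couple stay a very long time (right-skewed)
los = np.array([1, 2, 2, 3, 3, 4, 5, 6, 45, 90])

print("mean:  ", los.mean())      # 16.1, dragged up by two long stays
print("median:", np.median(los))  # 3.5, the typical patient
# For skewed data the median describes the typical value; the mean
# alone would mislead. Hence "median and quartiles were reported".
```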
ProQuant collaboration: >100 urologists, radiation oncologists, pathologists, radiologists, biostatisticians, & ML experts from 34 institutions worldwide evaluating whether & how tumor quantification offers superior risk stratification to Gleason score. www.sciencedirect.com/science/arti...
Lysenko is a bit of a bogeyman in science. But I have to say, rereading his story, hard not to draw parallels with Prasad, Makary, Bhattacharya and Hoeg. en.wikipedia.org/wiki/Trofim_...
Our guidance regarding performance measures for medical AI models is finally out!
- Stop bashing AUROC, though it does not settle things on its own
- Calibration and clinical utility are key
- Show risk distributions
- Classification statistics (e.g. F1) are improper
www.thelancet.com/journals/lan...
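A minimal sketch of those points on hypothetical data (standard scikit-learn functions; the simulated "model" is my assumption, not from the paper):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, f1_score
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(1)
n = 5000
risk = rng.beta(2, 8, n)     # model-predicted risks
y = rng.binomial(1, risk)    # outcomes drawn at the predicted risks

# Discrimination
print("AUROC:", round(roc_auc_score(y, risk), 2))

# Calibration: observed event rate vs. mean predicted risk, per bin
obs, pred = calibration_curve(y, risk, n_bins=5)
print("calibration (pred -> obs):", list(zip(pred.round(2), obs.round(2))))

# Risk distribution: show it, don't hide it
counts, _ = np.histogram(risk, bins=10, range=(0, 1))
print("risk histogram:", counts)

# F1 is not a property of the model: it moves with the arbitrary
# classification threshold (one reason it is improper)
for t in (0.1, 0.2, 0.5):
    print(f"F1 at threshold {t}: {f1_score(y, risk >= t):.2f}")
```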
Completely agree with your analysis. With respect to AUPRC, the standard recommendation is to report discrimination, calibration, and clinical utility (e.g. a decision curve). AUPRC is a form of discrimination, so I guess you could report it instead of AUC. But no one has ever explained why you should.
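A hypothetical sketch of why AUPRC is just a prevalence-dependent form of discrimination: hold the case and control score distributions fixed and vary only prevalence. AUROC stays put; AUPRC (sklearn's average_precision_score) moves, even though the model hasn't changed.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(2)

def simulate(prevalence, n=20_000):
    """Same score distributions for cases/controls; only prevalence varies."""
    y = rng.binomial(1, prevalence, n)
    score = rng.normal(loc=y, scale=1.0)   # cases score higher on average
    return roc_auc_score(y, score), average_precision_score(y, score)

for p in (0.5, 0.1, 0.01):
    auroc, auprc = simulate(p)
    print(f"prevalence {p:4}: AUROC {auroc:.2f}, AUPRC {auprc:.2f}")
# AUROC is stable (~0.76) across prevalence; AUPRC falls as the
# outcome gets rarer, with no change to the underlying model.
```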