Right; you create a circular multiple imputation, nothing wrong with that.
04.03.2026 23:06
@f2harrell.bsky.social
Professor of Biostatistics, Vanderbilt University School of Medicine
Expert Biostatistics Advisor, FDA Center for Drug Evaluation and Research
https://hbiostat.org https://fharrell.com
That's my first choice, because it fully informs you of the importance of the value that is missing for the person being predicted.
04.03.2026 16:26
John Fox was an incredible educator, proponent of excellent statistical practices, writer, R developer, and person. We will always miss him. His effect on us will never disappear.
04.03.2026 12:28
Very nice interactive demonstration of maximum likelihood estimation. I added this link to hbiostat.org/rmsc/mle #Statistics #StatsSky
04.03.2026 12:24
I joined the boycott. Consider joining yourself. All AI is dangerous but ChatGPT's developers are truly devoid of conscience.
04.03.2026 12:16
Right, and the more single imputation resembles single-value fill-in methods, the more wrong it gets, because such methods don't preserve the correlation structure of the predictors. This will ruin regression coefficients.
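A minimal simulation sketch of the point about correlation structure, assuming two standard-normal predictors with a true correlation of 0.8 and 40% of `x2` missing completely at random. All names and settings here are illustrative choices, not taken from any cited source. Filling every missing `x2` with the observed mean visibly weakens the correlation between the predictors.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation computed from sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_fill_demo(n=5000, rho=0.8, frac_missing=0.4, seed=0):
    """Return (correlation in full data, correlation after mean fill-in)."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    # x2 is built to have correlation rho with x1 (assumed data-generating model)
    x2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in x1]
    # Missing completely at random, an assumption made for simplicity
    missing = [rng.random() < frac_missing for _ in range(n)]
    observed = [b for b, m in zip(x2, missing) if not m]
    fill = sum(observed) / len(observed)
    # Single-value fill-in: every missing x2 gets the same observed mean
    x2_filled = [fill if m else b for b, m in zip(x2, missing)]
    return pearson(x1, x2), pearson(x1, x2_filled)


corr_full, corr_filled = mean_fill_demo()
print(f"correlation, full data:      {corr_full:.3f}")
print(f"correlation, mean-filled x2: {corr_filled:.3f}")
```

The filled-in values carry no information about `x1`, so the predictor correlation shrinks, and any regression that relies on that correlation is distorted accordingly.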
04.03.2026 12:01
There are many ways, as described in our 2009 Clinical Chemistry article. Multiple imputation is attractive because it quantifies the cost of not collecting certain variables, by giving a range of predictions over multiple imputations before averaging the predictions for an overall estimate.
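A rough sketch of the "range of predictions, then average" idea, under loudly stated assumptions: the prediction model coefficients (`B0`, `B1`, `B2`) and the imputation model for the missing predictor (`A0`, `A1`, `RESID_SD`) are made-up numbers, and the imputation model parameters are held fixed rather than redrawn for each imputation as a fuller multiple imputation procedure would do. It is not the method from the 2009 article.

```python
import random
import statistics

# Hypothetical fitted prediction model: y_hat = B0 + B1*x1 + B2*x2
B0, B1, B2 = -1.0, 0.5, 0.8
# Hypothetical imputation model for the uncollected x2, using x1 only (no outcome)
A0, A1, RESID_SD = 0.2, 0.6, 1.0


def predict(x1, x2):
    """Linear predictor from the hypothetical fitted model."""
    return B0 + B1 * x1 + B2 * x2


def predictions_over_imputations(x1, n_imputations=20, seed=0):
    """Draw x2 repeatedly from its assumed conditional distribution and predict each time."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_imputations):
        x2_draw = A0 + A1 * x1 + rng.gauss(0, RESID_SD)
        preds.append(predict(x1, x2_draw))
    return preds


preds = predictions_over_imputations(x1=1.5)
summary = {
    "low": min(preds),
    "high": max(preds),
    "mean": statistics.fmean(preds),  # overall estimate: average over imputations
}
print(summary)
```

A wide gap between `low` and `high` signals that the uncollected variable matters a lot for this person's prediction.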
04.03.2026 12:00
Recursive partitioning (classification and regression trees) uses surrogate splits to handle missing data. I used to think that was a good idea but research has shown otherwise. Coupled with the 10-fold higher sample size needed for trees, I don't think this is a good choice.
04.03.2026 11:58
I was referring to multiple imputation. Single imputation is seldom a good idea because it completely screws up standard errors of regression coefficient estimates. I do use single imputation when the fraction of missing values is tiny. You're right re: omitting Y for single. hbiostat.org/rmsc/missing
03.03.2026 23:11
No, it fails because it then under-imputes missing X, biasing regression coefficients towards zero. Full Bayesian models don't impute but rather treat missing data as unknown parameters.
03.03.2026 19:16
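A small simulation sketch of the bias towards zero when random-draw imputation of X ignores Y. Assumptions, all illustrative: one predictor, true slope 1, half of X missing completely at random, and a single stochastic draw standing in for one of the multiple imputations. The function names are hypothetical.

```python
import math
import random


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def impute_demo(n=4000, beta=1.0, frac_missing=0.5, seed=1):
    """Return (slope when imputation ignores Y, slope when imputation uses Y)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [beta * xi + rng.gauss(0, 1) for xi in x]
    missing = [rng.random() < frac_missing for _ in range(n)]
    obs_x = [xi for xi, m in zip(x, missing) if not m]
    obs_y = [yi for yi, m in zip(y, missing) if not m]

    # Imputation model X ~ Y, fitted on the complete cases
    b = ols_slope(obs_y, obs_x)
    a = sum(obs_x) / len(obs_x) - b * sum(obs_y) / len(obs_y)
    resid_sd = math.sqrt(
        sum((xi - a - b * yi) ** 2 for xi, yi in zip(obs_x, obs_y)) / (len(obs_x) - 2)
    )

    # Ignoring Y: draw missing X from the observed X values, unrelated to Y
    x_no_y = [rng.choice(obs_x) if m else xi for xi, m in zip(x, missing)]
    # Using Y: predicted X given Y plus a random residual draw
    x_with_y = [
        a + b * yi + rng.gauss(0, resid_sd) if m else xi
        for xi, yi, m in zip(x, y, missing)
    ]
    return ols_slope(x_no_y, y), ols_slope(x_with_y, y)


slope_no_y, slope_with_y = impute_demo()
print(f"slope, imputation ignores Y: {slope_no_y:.2f}")
print(f"slope, imputation uses Y:    {slope_with_y:.2f}")
```

Imputed values drawn without reference to Y are unrelated to the outcome, which dilutes the estimated slope. Drawing them conditional on Y keeps the X-Y relationship intact.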
I hope our 2009 paper gets remembered: academic.oup.com/clinchem/art...
Main advantages of Bayes: gets correct coefficients without using Y as an imputer; exact inference; respects order of data collection flow.
Richard - hope you touch on full Bayesian models which have a major advantage in this context - not using the outcome variable as an imputer, unlike multiple imputation, which requires Y to be used to impute X, making prospective prediction tricky.
03.03.2026 13:04
good morning everyone project your personal imposter syndrome onto this gif ur welcome
03.03.2026 08:08
Bayesians say that you should always have parameters for things you know you don't know.
03.03.2026 13:00
Unmeasured potential confounders are easy to find in health research, e.g., insurance coverage, health-seeking behavior, diet, family cash on hand, ...
03.03.2026 12:59
There are dramatic differences between Bayesian and frequentist approaches in sequential designs. Here's an example where the sample size at which a conclusion is reached is dramatically smaller for Bayes: hbiostat.org/bayes/design - see the simulation of the Bayesian multi-goal design. #Statistics #StatsSky
03.03.2026 12:55
I hope readers of this have their bullshit detectors set to maximum ...
03.03.2026 12:52
I really like this chart: terminology for cause vs describe vs predict.
03.03.2026 12:45
Screenshot of the "Does that use a lot of energy?" online app
Hannah Ritchie has built a fun little tool where you can compare energy usage of various products and activities.
This is super helpful imho, because it's so hard to develop intuitions even just about the scales involved here.
hannahritchie.substack.com/p/does-that-...
Thanks very much for this reference. The wish to find simpler formulas, which flies in the face of maximum likelihood estimation which is almost always iterative, is quite surprising. Wilson's CI is so easy to program (and has been in the R Hmisc package since 1991).
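To illustrate how little code the Wilson interval needs, here is a minimal sketch in Python using only the standard library. It is not the Hmisc implementation; the function name `wilson_ci` and its arguments are hypothetical, and no pseudocounts or continuity correction are applied.

```python
import math
from statistics import NormalDist


def wilson_ci(successes, n, conf_level=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    # Two-sided normal critical value, e.g. about 1.96 for 95%
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


lower, upper = wilson_ci(5, 10)
print(f"Wilson 95% CI for 5/10: ({lower:.4f}, {upper:.4f})")
```

The interval is closed-form, stays within [0, 1], and behaves sensibly even when the observed proportion is 0 or 1.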
03.03.2026 12:40
Amazingly thought-provoking from @ruxandrabio.bsky.social: substack.com/home/post/p-...
03.03.2026 12:37
Nice paper to know about. Related to my blog article, the inference from observational data with no randomized samples will come almost entirely from the prior distribution you put on observational study bias.
02.03.2026 15:27
My impression was that Wilson confidence intervals worked great without adding pseudocounts. No?
02.03.2026 15:20
Nice study. Had they evaluated how many of the 200 studies designed data collection post consideration of confounding, the results would have been far worse.
02.03.2026 15:18
The problem with that is that the authors will then label it exploratory but try to reach the same conclusions anyway.
02.03.2026 15:11
Yes, teaching design is always first priority. Observational studies are being published all the time now without even having a design phase. I'm going to blog about this before long.
02.03.2026 12:49
Even though it can be impossible, to even attempt to do so on a study that had no pre-data-collection design phase is a major problem.
02.03.2026 12:47
While reviewing her Google Scholar profile to prepare a list of her publications, psychologist Maryam Farhang came across a paper she didn't recognize.
The article included her name and affiliation, but she hadn't written or contributed to the paper in any way.
The next time they tell you there's no money for healthcare, remember there was money to start a war with Iran.
The next time they tell you there's no money for housing or social supports, remember there was money to bomb a girls' elementary school.
There's always money for war.
In many observational studies it is a push to even call the sample a 'cohort'. For example in electronic health record-based studies we seldom know what makes a patient enter our health system, and anything about the patient that occurred while in a previous system is unknown.
01.03.2026 13:21