MICE

@amices.bsky.social

Home of the #RStats imputation package {mice}. Amices is a place for people interested in solving missing data problems. See https://amices.org & https://github.com/amices. Posts by @oberman.bsky.social

8 Followers  |  4 Following  |  1 Posts  |  Joined: 06.01.2025

Latest posts by amices.bsky.social on Bluesky


Image of R code. To reproduce:

library(ggplot2)
library(dplyr)
library(mice, warn.conflicts = FALSE)

# Impute nhanes; the last 5 rows are ignored when building the imputation models
imp <- mice(nhanes, m = 5, maxit = 5, seed = 1,
            ignore = rep(c(FALSE, TRUE), c(20, 5)),
            print = FALSE)

# One completed data set per imputation
impdats <- complete(imp, "all")

# Split each completed data set into training rows and the ignored (test) rows
train <- lapply(impdats, function(dat) subset(dat, !imp$ignore))
test <- lapply(impdats, function(dat) subset(dat, imp$ignore))

# Fit the same linear model on every training set, then pool the predictions
fits <- lapply(train, function(dat) lm(age ~ bmi + hyp + chl, data = dat))
preds <- predict_mi(object = fits, newdata = test, pool = TRUE, interval = "prediction")

preds

# Plot the pooled predictions with their intervals, coloured by whether
# the test row originally contained missing values
preds %>%
  as.data.frame() %>%
  mutate(case = seq_len(nrow(preds)),
         y = test[[1]]$age) %>%
  ggplot(aes(x = fit, y = case, col = rowSums(is.na(nhanes[imp$ignore, ])) > 0)) +
  geom_point() +
  geom_errorbar(aes(xmin = lwr, xmax = upr)) +
  theme_minimal() +
  scale_color_manual(values = mice::mdc(1:2), labels = c("observed", "missing")) +
  theme(legend.title = element_blank(),
        legend.position = "bottom") +
  labs(x = "prediction",
       title = "Pooled prediction intervals")

Cool stuff!

Florian van Leeuwen and I implemented a prediction function in the #mice package that allows the incorporation of missing data uncertainty in a prediction interval.

The `predict_mi()` function is available in the current development version: github.com/amices/mice

#Rstats #statsky

29.09.2025 13:32 | 👍 18 | 🔁 3 | 💬 2 | 📌 0

Is mean imputation effective?
Yes - at filling in the missings; but also
Yes - at biasing any subsequent analyses.
stefvanbuuren.name/fimd

03.02.2025 18:39 | 👍 2 | 🔁 1 | 💬 0 | 📌 0
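
The point about mean imputation can be checked with a small simulation. The sketch below is illustrative only and not part of the original post: the data are simulated, the variable names (`x`, `y`, `y_mis`, `y_imp`) are made up for the example, and the 40% missing-completely-at-random setup is an assumption. It uses base R only.

```r
# Illustrative simulation (assumed setup, not from the original post)
set.seed(123)
n <- 1000
x <- rnorm(n)
y <- x + rnorm(n)                      # y depends on x

# Make 40% of y missing completely at random
y_mis <- y
y_mis[sample(n, 400)] <- NA

# Mean imputation: replace every missing value with the observed mean
y_imp <- ifelse(is.na(y_mis), mean(y_mis, na.rm = TRUE), y_mis)

# Compare the variance of y before and after mean imputation
var_obs <- var(y_mis, na.rm = TRUE)
var_imp <- var(y_imp)
c(observed = var_obs, mean_imputed = var_imp)

# Compare the correlation with x before and after mean imputation
cor_obs <- cor(x, y_mis, use = "complete.obs")
cor_imp <- cor(x, y_imp)
c(observed = cor_obs, mean_imputed = cor_imp)
```

Every missing value is filled in, but the imputed variable has a smaller variance and a weaker correlation with `x` than the observed cases do, so analyses that rely on either quantity are biased.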
Advanced Techniques for Handling Missing Data in analysis and prediction workflows | Utrecht Summer School
This 4-day course provides cutting-edge techniques for addressing missing data problems, focusing on the intersection of statistical theory and modern machine learning workflows.

📊 Struggling with missing data in your analyses? Join our 4-day course 'Advanced Techniques for Handling Missing Data' at @utrechtuniversity.bsky.social!

📅 24–27 Mar 2025
📍 Utrecht, NL
💶 €730 | 1.5 ECTS

Learn cutting-edge imputation with {mice} in R. Apply by March 10th!

#RStats #DataScience

06.01.2025 14:51 | 👍 7 | 🔁 2 | 💬 0 | 📌 0
