Are your working directory and/or your files on OneDrive? Strangely, this can cause these kinds of problems (the solution being to store things on a local drive instead).
No, but I'll check it out. Thanks!
train |> ... |> fit(train) gives my soul a papercut
Thanks, Di. I too am hoping that these issues will be fixed. Until then I'm sticking to caret in my teaching, as it also does a good job of coordinating machine learning software. I'm reluctant to tell students to use the tidymodels ecosystem because of the issues mentioned in my post.
Yeah, I think that's an important difference between tidymodels and ggplot2!
Kind of awkward to have to add data at two different steps. But definitely an improvement on the flow recommended in the documentation!
Absolutely. But in all the examples you mention, you'd start with data. And starting a pipeline with the data is the de facto standard in R. Creating a separate logic for how to pipe things is not very helpful to beginners.
Happy to submit these as issues within the next few days!
Great description! And so it boils down to whether you're willing to accept two different logics for how the pipe works. I maintain that the dual logics create more problems than they solve, but I get that some people like the tidymodels approach.
Some, yes. All ignored unfortunately. I agree that most of these issues could be fixed (which of course is the reason that I wrote this in the first place!)
Glad you liked it!
While we're discussing what we like and dislike in #Rstats, here's why I don't like tidymodels: mansthulin.se/posts/tidymo...
Josh Goldstein emailed me a nice tip for @rdatatable.bsky.social chaining: if we start a chained `data.table` operation inside a set of parens, we are no longer subject to the 'REPL constraint' and can keep each operation on a line. See ALT text. #rstats
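Since the ALT text is not reproduced here, a minimal sketch of what the parenthesized chaining tip can look like. It assumes the {data.table} package is installed; the built-in `mtcars` data and the column name `mean_mpg` are illustrative choices, not taken from the original post.

```r
library(data.table)

# Illustrative data: the built-in mtcars data set as a data.table
dt <- as.data.table(mtcars)

# Inside the outer parentheses the parser treats newlines as whitespace,
# so each chained [ ... ] call can sit on its own line
res <- (
  dt[cyl > 4]
  [, .(mean_mpg = mean(mpg)), by = cyl]
  [order(cyl)]
)

print(res)
```

Without the outer parentheses, the first line would be evaluated as a complete expression on its own and the following `[` lines would fail to parse.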
Now in the pdf at github.com/eddelbuettel...
Well, you shouldn't use Python or MATLAB for statistics. Simple as. 😀
I love the hidden-gem magrittr pipes, but these days I stick with the base pipe. In this case, you can do:
mtcars |> with(cor(disp, mpg))
R/Medicine CFP is open 🩺🧪
Deadline: March 6 - still time!
Submit: Talks, Lightning Talks, Demos, Workshops - Using R + Shiny for health, lab, clinical data
First-time speaker? Email for feedback: rmedicine.conference@gmail.com
rconsortium.github.io/RMedicine_we...
#rstats #datascience
I love RStudio, but I'm flabbergasted by the fact that @posit.co still haven't made |> the default for the Ctrl+Shift+M keyboard shortcut, despite their using it in e.g. the tidyverse documentation and R4DS. #Rstats
Thanks, will do!
Looks really nice! Is there an option to print confidence intervals instead of standard errors (the former being more informative)? If you'd be interested in adding bootstrap p-values/CIs as an option, I'd be happy to assist in integrating it with {boot.pval} (mthulin.github.io/boot.pval/ar...)
I've been playing around with {marginaleffects} in some projects lately, and I really like it. Lots of useful stuff in there! If you work with regression models and haven't checked it out already, I strongly recommend that you do so: marginaleffects.com/bonus/get_st...
#Rstats #Databs
How about penguins |> subset(select = c("island", "bill_len")) |> subset(island == "Biscoe" & bill_len > 55)
I cover both base and the tidyverse in Modern Statistics with R (except for plotting, where I focus on ggplot2 and only briefly mention base): www.modernstatisticswithr.com
Gave the first lecture in my introductory statistics for biologists course yesterday, so this should come in handy. 😀 Thanks for sharing!
Some closing thoughts for my students this semester on LLMs and learning #rstats datavizf25.classes.andrewheiss.com/news/2025-12...
This is so useful. I usually add custom information to the box shown when hovering, using the text geom. An example can be found here: www.modernstatisticswithr.com/eda.html#det... #Rstats #statsky #databs
One of the things that has been on my to do list for a very long time, is building a gallery of all of the charts I've made across #TidyTuesday, #30DayChartChallenge, #30DayMapChallenge, and other miscellaneous projects 📊
And it's finally here!
Link: nrennie.rbind.io/viz-gallery/
#DataViz #RStats
"The difference between the groups is 1.1-2.6 measured using something that is a bit like the median but not quite the median" 😉
They test different hypotheses though, so the Wilcoxon test isn't a like-for-like replacement for the t-test. A bootstrap t-test is my go-to method for tests about means (using {boot.pval}). It has the added benefit of providing confidence intervals, unlike the Wilcoxon test.
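To make the idea of a bootstrap t-test for means concrete, here is a rough base-R sketch of a two-sample bootstrap-t procedure. It is not the {boot.pval} implementation. The simulated data, the helper `t_stat()`, and the choice of 2000 resamples are all assumptions made for illustration.

```r
set.seed(1)

# Simulated example data (hypothetical): two groups with different means
x <- rnorm(30, mean = 1)
y <- rnorm(30, mean = 0)

# Welch-type t statistic for a difference in means, centred at `delta`
t_stat <- function(x, y, delta = 0) {
  se <- sqrt(var(x) / length(x) + var(y) / length(y))
  (mean(x) - mean(y) - delta) / se
}

obs_diff <- mean(x) - mean(y)
obs_se <- sqrt(var(x) / length(x) + var(y) / length(y))

# Resample each group with replacement and studentize around the
# observed difference, which approximates the null distribution of t
B <- 2000
t_boot <- replicate(
  B,
  t_stat(sample(x, replace = TRUE), sample(y, replace = TRUE), delta = obs_diff)
)

# Two-sided bootstrap p-value for H0: equal means
p_value <- mean(abs(t_boot) >= abs(t_stat(x, y)))

# 95% bootstrap-t confidence interval for the difference in means
ci <- obs_diff - quantile(t_boot, c(0.975, 0.025), names = FALSE) * obs_se
```

The same resamples give both the p-value and the interval, which is the practical advantage over the Wilcoxon test mentioned above.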
Students often ask: “Is this model good enough?”
My reply: “For what?” AUC, precision, F1—none of them matter unless you know what decision you're informing. Always tie metrics to action.
#DataScience #MachineLearning #AI #RStats
It's great to use with pipes! Then everything goes from left to right.