Max Kuhn

@topepo.bsky.social

Writing modeling packages at @posit.co (née RStudio). Opinions are my own. https://max-kuhn.org/

4,893 Followers 297 Following 180 Posts Joined Aug 2023
2 days ago
orbital 0.5.0 orbital 0.5.0 is on CRAN! More models and faster execution.

I'm over the moon about the release of orbital 0.5.0 🛰️

This release adds full support for boosted trees, faster creation of orbital objects, and optimized execution!

We can finally reliably predict with an xgboost model from a database!

tidyverse.org/blog/2026/03...
#rstats #tidymodels

2 days ago

something big is coming 🛰️

3 days ago

I’ve also made some headway on the Python companion, but I’m not quite ready to release that yet.

3 days ago

Slow and steady! I’m hoping to have the (single) classification tree chapter done in the next month and then probably off to boosting.

It’s hard to determine what a release should be since we’re going nonlinearly through the chapters (that pun was intended).

1 week ago

🤩

1 week ago
David Robinson - Teach the Tidyverse to Beginners YouTube video by Lander Analytics

@drob.bsky.social’s talk from 7 years ago covers this well

youtu.be/dT5A0sAWc2I

1 week ago

I presented at Shiny in Production 2025, an incredible conference hosted by @jumpingrivers.com up in Newcastle! I was glad to share the very latest from the Shiny team directly. The topics were bleeding edge at the time, so still new and relevant now. My video: www.youtube.com/watch?v=vxai...

1 week ago
A pink and blue graphic reading "apply for our opportunity scholarship to posit::conf(2026)."

We are covering 40 people's travel, lodging, and registration for posit::conf() this fall! If you are from a group that is underrepresented in data science or open source, please consider applying for the Opportunity Scholarship—we'd love to have you join.

posit.co/blog/apply-t...

1 week ago

March's tabular playground
#rstats #databs #tidytuesday
www.kaggle.com/code/jimgrum...

1 week ago
Screenshot of both sides of the printable version of the cheatsheet
Screenshot of the web version of the recipes cheatsheet

#tidymodels now has its very first cheatsheet! "Preprocessing data with {recipes}" is now available in Web and PDF versions here: rstudio.github.io/cheatsheets/... #rstats #posit #rstudio

1 week ago

A very helpful post on AI helpers, especially if you are new to AI and Claude

2 weeks ago

SAS’s sums of squares “types” was the ancestral language war with S(plus) back in the day. So quaint compared to now.

2 weeks ago

Too bad Claude can’t shovel snow.

3 weeks ago

Similarly, clinical trial people started calling observational data or non-randomized trials "real-world data" like _that_ is the abnormal case.

3 weeks ago

Yeah, it basically means a model built on the standard rectangular data structure.

That's what most of the world's data is, but since deep learning is all about images and non-tabular data (e.g., text), we have to give it a special name like it's the exception.

3 weeks ago

Here’s a clip from Max Kuhn (@topepo.bsky.social) of Posit breaking down how we can truly quantify LLM performance using a clear, generalizable framework.

See the full conference talk here: youtu.be/TQKbaIR-8J4

#AI #MachineLearning #DataBS

3 weeks ago
How to choose the best LLM using R and vitals Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models.

Want to check if code using #GenAI generates the responses you want? Here's how to automate LLM evals with the {vitals} #RStats 📦 by @simonpcouch.com @posit.co
My latest at #InfoWorld:
www.infoworld.com/article/4130...
#LLMs

3 weeks ago

random.seed(42)

</cringe>

3 weeks ago

We owe sooo much to @yihui.org. Thank you!

3 weeks ago

Let me put it this way… it was much much easier than trying to convince the same AI to just stop using the test set over and over and over again. 😃

3 weeks ago

With bookdown.org being decommissioned, I've been working on converting #rstats books from #bookdown to #quartopub. After spending a lot of time with Claude to get it right for my two books, here's a repo that might help you do the same (especially with Claude Code):

github.com/topepo/bookd...

3 weeks ago
TikZ.net – Graphics with TikZ in LaTeX

I’m like this with TikZ (tikz.net). It makes the best-looking figures and diagrams, but I don’t have the time to sort through its arcana.

1 month ago

Dumping the main points of both of these posts into a skill is a really easy way to use Claude to try the conversions (and check unit tests) to see if these changes can help speed things up without regressions.

1 month ago

A while back, @simonpcouch.com wrote this relevant post for package maintainers to help them convert code from dplyr/tidyr to vctrs.

tidyverse.org/blog/2023/04...

1 month ago
`dplyr::if_else()` and `dplyr::case_when()` are up to 30x faster dplyr 1.2.0 comes with much faster and more memory efficient `if_else()` and `case_when()` functions!

Last week we released dplyr 1.2.0, but we left off something VERY important 🙂

`dplyr::if_else()` and `dplyr::case_when()` are now up to 30x faster and use 10x less memory!

We dive into how we achieved these numbers in this new #rstats post!

tidyverse.org/blog/2026/02...

1 month ago

For more than a year I have been working on a brand new Jupyter Notebook editor for Positron. This is a ground-up build of a new Jupyter Notebook experience built to leverage all the knowledge and tools Posit/Positron brings to the data science table. 🧵#jupyter

1 month ago
dplyr 1.2.0 dplyr 1.2.0 fills in some important gaps in dplyr's API: we've added a new complement to `filter()` focused on dropping rows, and we've expanded the `case_when()` family with three new recoding and re...

dplyr 1.2.0 is out now and we are SO excited!

- `filter_out()` for dropping rows

- `recode_values()`, `replace_values()`, and `replace_when()` that join `case_when()` as a complete family of recoding/replacing tools

These are huge quality of life wins for #rstats!

tidyverse.org/blog/2026/02...

1 month ago

Edgar did all that work, including heroic efforts to translate TI calculator code to SQL for *prediction intervals*.

I didn’t believe it was possible until he did it. 🍌🍌🍌

1 month ago
caretForecast Conformal Time Series Forecasting Using Machine Learning

The hexagon here is priceless 😎

taf-society.github.io/caretForecast/

#rstats #timeseries

1 month ago
Large Language Models for Natural Language Processing in R or Python with the {mall} package Join us with Edgar Ruiz at the Data Science Lab Tuesday Jan 27 at 12pm ET pos.it/dslab

Tomorrow at the Data Science Lab 🧪 we are hearing from the amazing @theotheredgar.bsky.social about the {mall} package:

Run Natural Language Processing against your #RStats tibbles or #Python Polars DataFrames for sentiment analysis, text summaries, and more!

Join us at 12 pm ET: pos.it/dslab
