Carlos Gómez Grajales

@cgrajales.bsky.social

Statistics, data and prog metal. That's it.

45 Followers  |  170 Following  |  26 Posts  |  Joined: 30.10.2024

Latest posts by cgrajales.bsky.social on Bluesky

For analysis based on complex survey designs. For generating design-based or model-based estimates that account for weights and properly adjusted variances, it is way ahead, even compared against specialized software.

30.10.2025 03:03 — 👍 3    🔁 0    💬 0    📌 0

Geocomputation. Faster and better than GIS, even when Python is available.

30.10.2025 01:23 — 👍 4    🔁 2    💬 0    📌 0

The making of this week's #TidyTuesday chart recorded with {camcorder} in #RStats 📹

23.10.2025 15:04 — 👍 77    🔁 9    💬 3    📌 1

The Pink Book of #MarginalEffects (aka Model to Meaning) ships next week and I've got a backlog of Zoolander memes.

Hope you're hungry for some spam in your timeline.

#RStats #PyData

22.09.2025 16:52 — 👍 89    🔁 18    💬 1    📌 3

The mlx_lm.server provides a rly nice OpenAI-compatible API server that serves up all the stored MLX models.

13.09.2025 14:03 — 👍 1    🔁 1    💬 0    📌 0

ggplot2 4.0.0 is out and the new `paper`, `ink`, `accent` theme variables look super cool! Just pick 2-3 colors 🎨 to make your plots look great! I'm excited to hook this up to brand.yml 😉

11.09.2025 12:49 — 👍 73    🔁 19    💬 1    📌 0
Models as Prediction Machines: How to Convert Confusing Coefficients into Clear Quantities

Abstract
Psychological researchers usually make sense of regression models by interpreting coefficient estimates directly. This works well enough for simple linear models, but is more challenging for more complex models with, for example, categorical variables, interactions, non-linearities, and hierarchical structures. Here, we introduce an alternative approach to making sense of statistical models. The central idea is to abstract away from the mechanics of estimation, and to treat models as “counterfactual prediction machines,” which are subsequently queried to estimate quantities and conduct tests that matter substantively. This workflow is model-agnostic; it can be applied in a consistent fashion to draw causal or descriptive inference from a wide range of models. We illustrate how to implement this workflow with the marginaleffects package, which supports over 100 different classes of models in R and Python, and present two worked examples. These examples show how the workflow can be applied across designs (e.g., observational study, randomized experiment) to answer different research questions (e.g., associations, causal effects, effect heterogeneity) while facing various challenges (e.g., controlling for confounders in a flexible manner, modelling ordinal outcomes, and interpreting non-linear models).

Figure illustrating model predictions. On the X-axis the predictor, annual gross income in Euro. On the Y-axis the outcome, predicted life satisfaction. A solid line marks the curve of predictions on which individual data points are marked as model-implied outcomes at incomes of interest. Comparing two such predictions gives us a comparison. We can also fit a tangent to the line of predictions, which illustrates the slope at any given point of the curve.

A figure illustrating various ways to include age as a predictor in a model. On the x-axis age (predictor), on the y-axis the outcome (model-implied importance of friends, including confidence intervals).

Illustrated are:
1. age as a categorical predictor, resulting in predictions that bounce around a lot with wide confidence intervals;
2. age as a linear predictor, which forces a straight line through the data points with a very tight confidence band; and
3. age splines, which lie somewhere in between: they smoothly follow the data but carry more uncertainty than the straight line.

Ever stared at a table of regression coefficients & wondered what you're doing with your life?

Very excited to share this gentle introduction to another way of making sense of statistical models (w @vincentab.bsky.social)
Preprint: doi.org/10.31234/osf...
Website: j-rohrer.github.io/marginal-psy...

25.08.2025 11:49 — 👍 969    🔁 284    💬 47    📌 20
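
The three quantities named in the first figure description above (predictions, a comparison between two predictions, and the slope of the prediction curve) can be sketched by hand. This is only an illustration under loud assumptions: the data are made up and noise-free, the model is a plain regression of satisfaction on log income, and the helper names `predict`, `comparison`, and `slope` are hypothetical. It uses NumPy only and is not the marginaleffects API or the analysis from the preprint.

```python
import numpy as np

# Made-up, noise-free data for illustration only (not from the preprint).
income = np.linspace(10_000, 100_000, 50)      # annual gross income in Euro
satisfaction = 2.0 + 0.5 * np.log(income)      # hypothetical life satisfaction

# Fit a simple model: satisfaction regressed on log(income).
slope_coef, intercept = np.polyfit(np.log(income), satisfaction, deg=1)


def predict(x):
    """Model-implied life satisfaction at income x (the 'prediction machine')."""
    return intercept + slope_coef * np.log(np.asarray(x, dtype=float))


def comparison(low, high):
    """Difference between two predictions: predict(high) - predict(low)."""
    return float(predict(high) - predict(low))


def slope(x, rel_step=1e-4):
    """Slope of the prediction curve at x, via a central finite difference."""
    h = rel_step * x
    return float((predict(x + h) - predict(x - h)) / (2 * h))


# Query the fitted model at incomes of interest.
predictions = predict([20_000, 40_000, 80_000])
diff = comparison(20_000, 40_000)   # change in predicted satisfaction
tangent = slope(40_000)             # change per extra Euro, at 40,000 Euro

print(predictions, diff, tangent)
```

None of the three quantities reads a coefficient directly; each one only queries the fitted model for predictions, which is why the same queries would work unchanged if the model inside `predict` were swapped for something more complex.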

Erosion (R code)

Inspired by the ancient phreatic flows of Mammoth Cave National Park

#codeart #genartclub #genart

20.08.2025 03:58 — 👍 12    🔁 1    💬 0    📌 0

Two weeks ago, I gave a workshop on high-performance mapping apps with #rstats and Shiny.

Shiny is *awesome* for web mapping apps... but can be slow when deployed if you aren't careful.

Here are three strategies that have saved my projects:

21.07.2025 17:39 — 👍 16    🔁 3    💬 1    📌 0
Drop #672 (2025-06-27): If It Walks Like A… We have another 🦆-billed Drop today, featuring: DuckPlot — an open-source JavaScript library for generating charts from DuckDB with automatic SQL generation; The mcp-visualization-duckdb package, w…

Good #DuckDB vibes in today's Drop given all the horrible that happened today and this week in general.

— DuckPlot is a useful ObsPlot+DuckDB homunculus
— The DuckDB Viz MCP == no SQL req'd interactive dataviz
— DuckDB-QuickJS extension adds serious JS powers to DuckDB queries

27.06.2025 16:51 — 👍 13    🔁 3    💬 1    📌 0

Data science junkies, get ready! 🚀 "The Test Set" #podcast trailer is here for your viewing pleasure.

Tune in July 1st and every Tuesday after for new episodes with hosts @mchow.com, @hadley.nz, and @wesmckinney.com as they welcome thought leaders in #DataScience.

Subscribe now: pos.it/thetestset

18.06.2025 16:58 — 👍 104    🔁 37    💬 5    📌 1

This is amazing on so many levels

29.05.2025 03:12 — 👍 0    🔁 0    💬 0    📌 0

If you load this page it contacts 82 IP addresses executing 256 separate HTTP transactions to download 18MB of data writing 64 cookies to your device to tell you “no”

24.05.2025 10:37 — 👍 15049    🔁 4500    💬 148    📌 243

Thanks to everybody who chimed in!

I arrived at the conclusion that (1) there's a lot of interesting stuff about interactions and (2) the figure I was looking for does not exist.

So, I made it myself! Here's a simple illustration of how to control for confounding in interactions:

11.05.2025 05:34 — 👍 1133    🔁 273    💬 68    📌 18
The package logo, a small cute elephant holding a quill and writing promptdown

Just made promptdown public. It's a plain-text interface for working with LLMs using literate programming.

See and edit the full prompt each turn.

No cramped input boxes, no hidden context, no append-only chat.

Still early alpha, feedback welcome!

github.com/t-kalinowski...

08.05.2025 15:36 — 👍 29    🔁 8    💬 2    📌 0

You call it an ifelse statement, I call it a lightweight, agentic decision module operating within a deterministic inference framework

16.04.2025 03:27 — 👍 76    🔁 20    💬 6    📌 0

👀 async

08.04.2025 16:37 — 👍 0    🔁 0    💬 0    📌 0
Creating a dataset from an image using reticulate in R Markdown – %>% dreams A cool paper used R and Python together — and so can you!

Thx to @ivelasq3.bsky.social for pointing me to this blog post of hers: ivelasq.rbind.io/blog/reticul...

I used it as an example to "port" the code to a rixpress pipeline which you can find here: github.com/b-rodrigues/...

end result is here: b-rodrigues.github.io/rixpress_dem...

#RStats #Python

07.04.2025 20:36 — 👍 12    🔁 3    💬 0    📌 0
R code in the image:

```rstats
xdf[,c("handle", "name", "description")] |> 
  unique() |> 
  within({
    handle <- sub("^at://", "@", handle)
    md <- sprintf(
        fmt = "- (`%s`) %s%s%s", 
        handle, 
        name, 
        ifelse(is.na(description), "", ": "), 
        ifelse(is.na(description), "", description)
    )
  }) |> 
  sort_by(~md) |> 
  getElement("md") |> 
  writeLines()
```

The base #RStats pipe (`|>`) is so stupid cool.

05.04.2025 19:15 — 👍 29    🔁 6    💬 1    📌 0

Loved this summary. This is incredibly relevant for data practitioners even beyond survey sampling.

03.04.2025 23:16 — 👍 2    🔁 0    💬 0    📌 0

Yesterday,
All Trump’s tariffs seemed so far away,
Now it looks as though they’re here to stay,
Oh, trade was better yesterday.

Suddenly,
We’re importing less from overseas,
There’s a tariff hanging over me,
The economy acts so suddenly.

1/3

03.04.2025 17:29 — 👍 4    🔁 2    💬 1    📌 0

A new release of the {mapgl} #rstats package - which brings Mapbox and MapLibre maps to R users - is now on CRAN!

There are lots of long-awaited features in this new release.

Learn more on my blog: walker-data.com/posts/mapgl-...

20.03.2025 14:15 — 👍 24    🔁 5    💬 2    📌 1

😍 Loved this in R but I needed it more in Python

19.03.2025 02:16 — 👍 1    🔁 0    💬 0    📌 0

Been consulting many big companies, for many years, and something like this has happened at some point in every single one of them.

14.03.2025 05:16 — 👍 0    🔁 0    💬 0    📌 0

We’ve had three years of github copilot and two of chatgpt, both promising magical productivity gains for software dev, and the only change noticed by regular users is that everything keeps getting shittier, but now with more useless chatbots

10.03.2025 10:09 — 👍 621    🔁 113    💬 23    📌 7

Fantastic resource for those of us (re) learning data viz and graphical story-telling

06.03.2025 05:15 — 👍 1    🔁 0    💬 0    📌 0

Oh cool, need to take this for a spin

04.03.2025 04:12 — 👍 1    🔁 0    💬 0    📌 0

honestly -- learning #shiny feels like a superpower. For what you can do so quickly, it just blows my mind every time when compared to traditional web application frameworks

I feel so lucky to be learning and using #rstats.

01.03.2025 20:55 — 👍 13    🔁 2    💬 2    📌 0

AI is the future. Embrace it

26.02.2025 06:34 — 👍 0    🔁 0    💬 0    📌 0
Air, an extremely fast R formatter We are thrilled to announce Air, a new R formatter.

@lionelhenry.bsky.social and I are so excited to finally announce Air - an extremely fast R code formatter! 🎉

With Air, you'll never need to worry about styling your #rstats code ever again. All you need to do is save, and Air takes care of the rest.

www.tidyverse.org/blog/2025/02...

21.02.2025 15:10 — 👍 363    🔁 121    💬 20    📌 20

@cgrajales is following 20 prominent accounts