Damon C. Roberts

@damoncroberts.io.bsky.social

PhD | data scientist | bayesian stats | damoncroberts.io | Reposts are not endorsements. I do not represent anyone’s views.

801 Followers  |  732 Following  |  50 Posts  |  Joined: 16.06.2023

Latest posts by damoncroberts.io on Bluesky

Good morning. Here is my reply to Matt Yglesias' "reply" to my article on moderates. This is a comprehensive accounting of my evidence and others', with some clarifications of findings and my position. I hope you will read and share.

www.gelliottmorris.com/p/data-over-...

19.08.2025 12:05 — 👍 171    🔁 38    💬 11    📌 10

> voice as a primary input method for using your PC

No.

13.08.2025 22:59 — 👍 565    🔁 15    💬 3    📌 1

I see a lot of folks posting about that NYTimes article. So I'm just going to say something I know you all already know, but is important.

You cannot infer the effect on unemployment of majoring in one topic vs another by comparing the unemployment rates of folks who chose those majors.

10.08.2025 21:05 — 👍 25    🔁 5    💬 3    📌 0
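
The point in the post above can be illustrated with a small simulation. This is a minimal sketch under made-up assumptions, not an analysis of the article's data: the majors "X" and "Y", the unobserved `advantage` variable, and every probability below are hypothetical. By construction the major has no effect on unemployment, yet the raw unemployment rates differ because the same unobserved factor drives both the choice of major and the outcome.

```python
import random


def simulate_naive_comparison(n=100_000, seed=0):
    """Return the raw unemployment rate by major in a world where the
    major itself has zero causal effect (all numbers are hypothetical)."""
    rng = random.Random(seed)
    counts = {"X": [0, 0], "Y": [0, 0]}  # major -> [unemployed, total]
    for _ in range(n):
        # Unobserved confounder in [0, 1): family resources, networks, etc.
        advantage = rng.random()
        # More advantaged people are more likely to choose major X.
        major = "X" if rng.random() < advantage else "Y"
        # Unemployment risk depends only on advantage, never on the major.
        p_unemployed = 0.10 - 0.08 * advantage
        unemployed = rng.random() < p_unemployed
        counts[major][0] += unemployed
        counts[major][1] += 1
    return {m: u / t for m, (u, t) in counts.items()}


if __name__ == "__main__":
    rates = simulate_naive_comparison()
    print(rates)
```

Under these assumptions the expected rates are about 4.7% for X and 7.3% for Y, so a naive comparison would credit major X with an effect that does not exist in the simulated world.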
The rise and fall of Bayesian statistics | Statistical Modeling, Causal Inference, and Social Science

The rise and fall of Bayesian statistics
statmodeling.stat.columbia.edu/2025/08/10/t...

10.08.2025 13:48 — 👍 12    🔁 3    💬 0    📌 1

Imagine if we had something like cmdstan with Zig — take advantage of compile time execution

10.08.2025 22:02 — 👍 1    🔁 0    💬 0    📌 0

llms have great potential but incentives favor being stupid. to “know less”, you need abstraction — omit some details for clearer picture, keep a recipe you can unfold to go back. lose that recipe, and you’re doomed to forever treat the symptoms, while the symptoms get ever more complicated

10.08.2025 09:50 — 👍 53    🔁 2    💬 2    📌 0

[Image: bell-curve meme over an "IQ score" axis running from 55 to 145. Both tails read "I don't trust this fucken thing"; the middle reads "Noooo you must carefully engineer your prompt in a way that properly accommodates the properties of the token-based model space, the better to manage the quasi-aliasing that happens at token boundaries; moreover the probabilistic character of model outputs makes the tool a poor choice for the evaluation of certain deterministic world-knowledge topics; furthermo"]

08.08.2025 11:37 — 👍 659    🔁 116    💬 5    📌 7
The Effects of Voting by Mail on Correct Voting - Political Behavior

The share of Americans voting by mail surged in 2020. For those casting a mail-in ballot, their voting experience was different from those who voted in-person. When voting by mail, people can make the...

Just out at @polbehavior.bsky.social w/Carey Stapleton:

Voting by mail has the upside of boosting correct voting.

When people vote by mail rather than in-person, they are more likely to choose the presidential candidate best aligned with their preferences.

link.springer.com/article/10.1...

06.08.2025 14:53 — 👍 79    🔁 35    💬 4    📌 4
Measurement error in the strike zone | Statistical Modeling, Causal Inference, and Social Science

Measurement error in the strike zone
statmodeling.stat.columbia.edu/2025/08/02/m...

02.08.2025 13:49 — 👍 7    🔁 2    💬 0    📌 0

Currently in FirstView: In “Addressing Measurement Errors in Ranking Questions for the Social Sciences,” Yuki Atsusaka and @sysilviakim.bsky.social examine the statistical consequences of measurement error and introduce a framework for improving ranking data analysis.

17.07.2025 17:45 — 👍 10    🔁 5    💬 1    📌 0
[GIF: a crowd of people sitting in a theatre, applauding]
02.08.2025 00:03 — 👍 1    🔁 0    💬 0    📌 0
Bayesian Paired Comparisons for power ranking MLB teams (2019-2025) – A neglected blog

Not the Rockies.

blog.damoncroberts.io/posts/baseba...

30.07.2025 01:37 — 👍 1    🔁 0    💬 0    📌 0

Added the 2025 models using data as of yesterday's games:
blog.damoncroberts.io/posts/baseba...

30.07.2025 01:31 — 👍 0    🔁 0    💬 0    📌 0

That’s great, thank you!

28.07.2025 23:48 — 👍 1    🔁 0    💬 0    📌 0

It’s not the Rockies.

28.07.2025 23:18 — 👍 3    🔁 0    💬 1    📌 0

Hopefully the post will be ready tomorrow!

28.07.2025 23:18 — 👍 0    🔁 0    💬 1    📌 0
Bayesian modeling in Julia: Turing and Stan – A neglected blog

A quick blog post comparing the benchmarks and syntax of Turing.jl and cmdstanr for fitting Bradley-Terry models that compute power rankings of MLB teams:

blog.damoncroberts.io/posts/julia_...

28.07.2025 22:55 — 👍 16    🔁 1    💬 4    📌 1
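
For readers unfamiliar with the model named in the post above, here is a minimal sketch of a Bradley-Terry fit. It is not the blog post's code: the post uses Bayesian fits in Turing.jl and cmdstanr, while this sketch swaps in a plain maximum-likelihood fit using the classic MM (Zermelo) iteration in standard-library Python. The team labels "A", "B", "C" and the game results are hypothetical, and the sketch assumes every team has at least one win.

```python
import math


def win_probability(skill_a, skill_b):
    """Bradley-Terry probability that A beats B, with skills on the log scale."""
    return 1.0 / (1.0 + math.exp(-(skill_a - skill_b)))


def fit_bradley_terry(games, n_iter=200):
    """Estimate log-skills from (winner, loser) pairs by MM iteration.

    Assumes every team has at least one win; otherwise its strength
    collapses to zero and the log is undefined.
    """
    teams = sorted({team for game in games for team in game})
    wins = {team: 0 for team in teams}
    games_between = {}  # frozenset({a, b}) -> number of games played
    for winner, loser in games:
        wins[winner] += 1
        pair = frozenset((winner, loser))
        games_between[pair] = games_between.get(pair, 0) + 1

    strength = {team: 1.0 for team in teams}
    for _ in range(n_iter):
        updated = {}
        for team in teams:
            # Sum over opponents of n_games / (own strength + opponent strength).
            denom = sum(
                n / (strength[team] + strength[other])
                for pair, n in games_between.items() if team in pair
                for other in pair if other != team
            )
            updated[team] = wins[team] / denom
        # Strengths are identified only up to scale, so renormalize each pass.
        total = sum(updated.values())
        strength = {t: s * len(teams) / total for t, s in updated.items()}
    return {team: math.log(s) for team, s in strength.items()}


if __name__ == "__main__":
    # Hypothetical results: each pair plays four games, split 3-1.
    games = ([("A", "B")] * 3 + [("B", "A")]
             + [("B", "C")] * 3 + [("C", "B")]
             + [("A", "C")] * 3 + [("C", "A")])
    skills = fit_bradley_terry(games)
    for team in sorted(skills, key=skills.get, reverse=True):
        print(team, round(skills[team], 3))
```

Sorting teams by estimated skill gives a power ranking, and `win_probability` turns any two skills into a head-to-head prediction. A Bayesian version would place priors on the skills rather than maximizing the likelihood.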

sure! could be cool.

27.07.2025 21:20 — 👍 0    🔁 0    💬 1    📌 0

Benchmarks?

27.07.2025 21:13 — 👍 0    🔁 0    💬 1    📌 0

2025 MLB model

27.07.2025 21:05 — 👍 2    🔁 0    💬 1    📌 1

Still not as bad as Microsoft Teams

26.07.2025 21:24 — 👍 746    🔁 176    💬 12    📌 5

The pipe is very very nice.

25.07.2025 17:30 — 👍 0    🔁 0    💬 0    📌 0

Maybe the tidyverse is a little bloated

22.04.2025 15:18 — 👍 34    🔁 2    💬 6    📌 0

The tidyversification of R honestly is the biggest reason I don’t touch the language since leaving academia. Even if I were still there, I’d be really annoying about it and say we shouldn’t do it if we care about replication. data.table exists and is great, but everything is built on the tidyverse now

25.07.2025 17:19 — 👍 1    🔁 0    💬 1    📌 0

For ultimate performance: time to learn binary baayybee!

25.07.2025 16:29 — 👍 2    🔁 0    💬 1    📌 0

I’m not gonna write assembly. But dplyr is built on a ton of dependencies, built on base R, built on an interpreter written in C, just to change a string to an integer.

25.07.2025 15:14 — 👍 2    🔁 1    💬 2    📌 0

All I’m saying is: the Kalman filter was used for the Apollo missions, we should be capable of running some models locally. If not, it’s not because the data is too large or the hardware sucks. It might be the endless appetite for extra layers of abstraction away from the code doing the actual work

25.07.2025 15:11 — 👍 6    🔁 1    💬 1    📌 0
Why Performance Actually Matters (The Standup)
YouTube video by ThePrimeTime

youtu.be/RlTVMi4JzZA?...

21.07.2025 18:04 — 👍 0    🔁 0    💬 0    📌 0

“I should just stick with a coding language because it works, and not vote with my feet or demand that it be more performant, because a little bit more latency doesn’t matter enough to justify the effort.”

A great articulation of why this perspective is something we should push back on:

21.07.2025 18:04 — 👍 0    🔁 0    💬 1    📌 0
