STAT saw and raised
07.03.2026 19:31 · @ehudk.bsky.social
Research Staff Member at IBM Research. Causal Inference. Machine Learning. Data Communication. Healthcare. Creator of causallib: https://github.com/IBM/causallib Website: https://ehud.co
always use a Mondrian theme in your ggplots
@statsepi.bsky.social
Note of reflection (March 5, 2020)

This model was conceived in 2010, now more than 10 years ago, and not very long after Git itself came into being. In those 10 years, git-flow (the branching model laid out in this article) has become hugely popular in many a software team to the point where people have started treating it like a standard of sorts - but unfortunately also as a dogma or panacea.

During those 10 years, Git itself has taken the world by storm, and the most popular type of software that is being developed with Git is shifting more towards web apps - at least in my filter bubble. Web apps are typically continuously delivered, not rolled back, and you don't have to support multiple versions of the software running in the wild. This is not the class of software that I had in mind when I wrote the blog post 10 years ago.

If your team is doing continuous delivery of software, I would suggest to adopt a much simpler workflow (like GitHub flow) instead of trying to shoehorn git-flow into your team. If, however, you are building software that is explicitly versioned, or if you need to support multiple versions of your software in the wild, then git-flow may still be as good of a fit to your team as it has been to people in the last 10 years. In that case, please read on.

To conclude, always remember that panaceas don't exist. Consider your own context. Don't be hating. Decide for yourself.
To his credit, he states as much at the very beginning of the post. The type of software this approach aims at is rarely seen nowadays, and (as someone who once thought *this is the way*) it should be judged accordingly. No need to be too harsh on it.
18.02.2026 07:13
NEW! I've just released the BIGGEST and perhaps most creative project I've ever worked on!
"Searching for Birds" searchingforbirds.visualcinnamon.com
A project, an article, an exploration that dives into the data that connects humans with birds, by looking at how we search for birds.
of course, this meme is magnetic 🧲
06.02.2026 13:57
Robocalypse: The Revival of the Mechanical Turk
04.02.2026 09:05
it's a good view, don't get me wrong, but I also find it a bit limiting - because you can benefit from using random effects to pool estimates / shrink effects of finite factors, too, especially in imbalanced settings
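To illustrate the pooling benefit (a toy sketch of my own, not from the thread): classic normal-normal shrinkage with the variance components assumed known - a real mixed model would estimate them, but the effect of imbalance is the same:

```python
import numpy as np

rng = np.random.default_rng(3)
sizes = [200, 50, 5]                  # deliberately imbalanced factor levels
true_means = [0.0, 0.5, 1.0]
sigma2, tau2 = 1.0, 0.25              # within- / between-group variance (assumed known)

groups = [rng.normal(m, np.sqrt(sigma2), n) for m, n in zip(true_means, sizes)]
grand = np.mean(np.concatenate(groups))

for g, n in zip(groups, sizes):
    w = tau2 / (tau2 + sigma2 / n)    # shrinkage weight: small n -> heavier pooling
    pooled = w * g.mean() + (1 - w) * grand
    print(f"n={n:3d}  raw={g.mean():+.2f}  pooled={pooled:+.2f}  weight={w:.2f}")
```

The n=5 level gets pulled hardest toward the grand mean, which is exactly the stabilization you want when some levels are barely observed.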
04.02.2026 08:03
that whammy bar on the guitar is excellent detail
04.02.2026 07:04
funny, I wasn't aware he had any other claim to fame aside from his spanning trees
30.01.2026 11:46
If you haven't watched the video abstract about Veronika the tool-using cow, you really should, especially the message from her owner at the end! It cleanses the timeline a bit.
🧪
www.cell.com/current-biol...
Quarto's yaml autocomplete was my sole inspiration for coding a json schema for some internal package we developed that has a yaml/hydra interface.
was a fun Pydantic exercise but I wouldn't recommend it to a friend...
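For the curious, the core of that exercise is small - Pydantic (v2 here) can emit a JSON Schema that YAML language servers then pick up for autocomplete. The config fields below are hypothetical placeholders, not our package's actual interface:

```python
import json
from pydantic import BaseModel, Field

class TrainConfig(BaseModel):
    # Hypothetical fields, purely for illustration
    learning_rate: float = Field(0.01, gt=0, description="Optimizer step size")
    epochs: int = Field(10, ge=1)
    run_name: str = "baseline"

schema = TrainConfig.model_json_schema()   # Pydantic v2 (v1 used .schema())
print(json.dumps(schema, indent=2))        # save this and point your YAML LSP at it
```

The hard part in practice isn't emitting the schema, it's keeping nested/hydra-style configs in sync with it - hence the "wouldn't recommend".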
*they don't necessarily have to write all of the code; thoughtful use of LLMs to weigh different approaches, rather than blindly accepting code, shows care, a more detail-oriented mentality, and critical thinking - broader traits of interest in an era where everyone can now code.
28.01.2026 04:32
this. "why" questions are the key.
someone writing their own code* will have in-depth considerations of the choices made, because they struggled with deciding between different implementations. but delegating coding to an llm makes them only a reader of the code, not so different from you reviewing it.
Always happy when you chime in. Interesting idea. Do you have any examples?
I read "error bars" in a frequentist sense, so you'll only have mean and two interval edges (rather than continuous throughout the interval?), and then dithering of three colors would seem too irregular?
If you can map your error bars to a 0-1 confidence there are ways to use color transparency and/or hex size (softening the edges) to make more uncertain hexes less prominent
23.01.2026 07:37
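A rough matplotlib sketch of the transparency idea, using hex bin counts as a stand-in 0-1 confidence score (in practice you'd map whatever uncertainty measure you actually have):

```python
import matplotlib
matplotlib.use("Agg")                       # headless backend for this sketch
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

rng = np.random.default_rng(0)
x, y = rng.normal(size=(2, 5000))

fig, ax = plt.subplots()
hb = ax.hexbin(x, y, gridsize=20)

counts = np.asarray(hb.get_array(), dtype=float)
confidence = counts / counts.max()          # stand-in for a real 0-1 confidence

rgba = plt.cm.viridis(Normalize()(counts))  # per-hex colors, shape (n_hex, 4)
rgba[:, 3] = 0.2 + 0.8 * confidence         # fade the least confident hexes
hb.set_array(None)                          # stop the colormap from re-coloring
hb.set_facecolor(rgba)
```

Shrinking hex size by confidence is also possible but fiddlier, since it means rebuilding the hexagon offsets/paths yourself rather than adjusting colors on the existing collection.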
happy b-day wikipedia
wikipedia25.org/en
Both are "AI" in general (at least colloquially), but are different in how we create them, validate them, and deploy then to interact with the public
20.01.2026 10:16
I think this is a result of the general public now conflating AI Gen 1 (e.g., deep learning methods for computer vision in radiology or self-driving cars) with this newer GenAI era (natural language interface to everything).
20.01.2026 10:16
But my hunch is that he uses a fixed true θ, while you sample from a prior for each trial (and using the same prior for data generation *and* analysis is probably what makes it so well calibrated).
But I guess I now better understand your comment about him mixing Bayesian and frequentist ideas.
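For what it's worth, here's a toy Beta-Bernoulli version of that hunch (my own sketch, not John's actual setup): re-drawing the true θ from the analysis prior on every trial makes interval coverage nominal by construction, whereas a fixed θ carries no such guarantee:

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(42)
N, trials = 50, 2000                       # illustrative sizes

def coverage(draw_theta):
    hits = 0
    for _ in range(trials):
        theta = draw_theta()
        z = rng.binomial(N, theta)
        # Beta(1,1) prior -> conjugate Beta posterior; central 95% credible interval
        lo, hi = beta.ppf([0.025, 0.975], 1 + z, 1 + N - z)
        hits += lo <= theta <= hi
    return hits / trials

cov_prior = coverage(lambda: rng.beta(1, 1))  # theta ~ same prior as the analysis
cov_fixed = coverage(lambda: 0.6)             # one fixed true theta
print(cov_prior, cov_fixed)
```

Averaged over the prior, the first case sits at 0.95 by construction; the fixed-θ case can drift above or below nominal depending on θ and N.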
Thanks Frank. Reading an earlier chapter of yours, I think I found the agreement with John: the proportion of wrong conclusions among trials stopped early. The difference is in the magnitude of the phenomenon - John's is much higher (~20% for a 95% HDI).
18.01.2026 09:14
the related section can be found here, in case you think it's something in the details: nyu-cdsc.github.io/learningr/as...
17.01.2026 14:36
I don't think he avoids the posterior mean - the stopping criterion I mentioned above is when the HDI is beyond some ROPE threshold (3rd column in figure), and the HDI is calculated from the posterior distribution.
I also think calibration is bad when falsely rejecting the null (2nd row 3rd column)
so do you believe this bias is a result of Kruschke using a uniform prior in this example instead of something more regularizing?
16.01.2026 14:56
The fourth panels of Figures 13.4 and 13.5 show the 95% HDIs at every flip, assuming a uniform prior. The y-axis is θ, and at each N the 95% HDI of the posterior, p(θ|z, N), is plotted as a vertical line segment. The dashed lines indicate the limits of a ROPE from 0.45 to 0.55, which is an arbitrary but reasonable choice for illustrating the behavior of the decision rule. You can see in Figure 13.4 that the HDI eventually falls within the ROPE, thereby correctly accepting the null value for practical purposes.
That's a fair point, and Kruschke does seem to focus on the observed proportion of heads ("z/N"). But to the best of my understanding the HDI *is* calculated from the posterior distribution, so whether the HDI is outside the ROPE is a posterior-based measure, not an observed one.
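To make that concrete, a small sketch (my own, with made-up z and N, and Kruschke's illustrative 0.45-0.55 ROPE): the HDI is computed from the posterior, and only then compared against the ROPE:

```python
import numpy as np
from scipy.stats import beta

z, N = 7, 24                          # made-up observed heads / flips
post = beta(1 + z, 1 + N - z)         # posterior under a uniform Beta(1,1) prior

def hdi95(dist, grid=10_000):
    # Narrowest interval holding 95% posterior mass, found by a quantile scan
    p = np.linspace(0, 0.05, grid)
    lo, hi = dist.ppf(p), dist.ppf(p + 0.95)
    i = np.argmin(hi - lo)
    return lo[i], hi[i]

lo, hi = hdi95(post)
rope = (0.45, 0.55)
outside_rope = hi < rope[0] or lo > rope[1]   # decision driven by the posterior HDI
```

Note that z/N enters only through the posterior's parameters; the decision itself never touches the raw proportion.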
15.01.2026 21:15
is it really the case that simply changing the ROPE's bounds/thresholds (from MCID to nontrivial efficacy) debiases the decision?
15.01.2026 21:08
Definitions:
- Delta: true treatment effect being estimated
- delta: minimum clinically important Delta
- gamma: minimum detectable Delta (threshold for non-trivial treatment effect, e.g. 0.3*delta)
- SI: similarity interval for Delta, e.g., [-0.5*delta, 0.5*delta]
- N: average sample size, used for initial resource planning. This is an estimate of the ultimate sample size based on assuming Delta=delta. N is computed to achieve a Bayesian power of 0.9 for efficacy, i.e., achieving a 0.9 probability at a fixed sample size that the probability of any efficacy exceeds 0.95 while the probability of non-trivial efficacy exceeds 0.85, i.e., Pr(Delta>0) > 0.95 and Pr(Delta>gamma) > 0.85.
thanks Frank, it's the first time I'm hearing about non-trivial efficacy as a distinct concept rather than a different wording for MCID. from your article I see you're using it interchangeably with minimum detectable effect.
anyhow, in the article it seems that NTE is just a scaled MCID.
very good chance I might've misunderstood or misinterpreted your writings or Kruschke's, but I would highly appreciate it if you could clarify.
(it is a genuine question I've been having for a while now bsky.app/profile/ehud...)
Stopping when HDI is inside / outside the ROPE leads to some false positives when the null is true (wrong type 1 assertion), while correctly rejecting the null consistently when it is false (no wrong type 2 assertions). Meanwhile, using only the width of the HDI as the selection criterion leads to no wrong assertions (although you can stay undecided for very large Ns when the null is true, instead of accepting it).
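A minimal sketch of the width-only criterion (my own illustrative numbers, with a central credible interval standing in for a proper HDI): keep sampling until the 95% interval is narrower than a target, ignoring where it sits relative to the ROPE:

```python
import numpy as np
from scipy.stats import beta

rng = np.random.default_rng(7)
theta_true, target_width = 0.5, 0.10   # illustrative values

z = n = 0
lo, hi = 0.0, 1.0
while hi - lo >= target_width:
    z += rng.binomial(1, theta_true)
    n += 1
    # Beta(1,1) prior -> Beta posterior after n flips with z heads
    lo, hi = beta.ppf([0.025, 0.975], 1 + z, 1 + n - z)
print(n, lo, hi)
```

Because the stopping trigger depends (almost) only on precision, not on the location of the estimate, it doesn't preferentially stop on extreme samples - which is the contrast with the ROPE-exceedance rule.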
The key point is this: If the sampling procedure, such as the stopping rule, biases the data in the sample then the estimation can be biased whether it's Bayesian estimation or not. A stopping rule based on getting extreme values will automatically bias the sample toward extreme estimates, because once some extreme data appear by chance, sampling stops. A stopping rule based on precision will not bias the sample unless the measure of precision depends on the value of the parameter (which actually is the case here, just not very noticeably for parameter values that aren't very extreme).
Frank, genuine question: any decision (e.g. "stop for inefficacy") can have false positives. it is unintuitive to me that selecting on the sign+magnitude of the effect leads to no selection bias. This assertion is also at odds with Kruschke's book (13.3.2), which shows ROPE-based criteria biasing the effect.
14.01.2026 19:26
I don't fully understand the ontological comment, tbh. But I do think counterfactuals can be defined without setting up (hypothetical) experiments, which I think you're alluding to? imo, if you can solve it with parallel universes, you can solve it with counterfactuals; RCTs are for mortals.
05.01.2026 20:52
I bet he did at least a few, but didn't need much to generalize because he used strong priors so experimental N could be kept low
05.01.2026 20:12