(One issue is that we don't have exact dates of de- and re-registration; we have the number of days registered in a calendar year, and the first and last days of registration in that year. So sometimes we can infer the dates, and sometimes not. We do, however, have exact dates of events coded by GPs.)
07.03.2026 09:20
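A minimal sketch of the inference described in the post above, assuming one record per person per calendar year. The function name, argument names, and return strings are hypothetical, not the actual data schema. If the number of registered days equals the inclusive span from the first to the last registered day, there is no room for a gap, so the first and last days are the exact registration boundaries within that year. Otherwise a gap exists but its position is unknown.

```python
from datetime import date


def infer_registration(days_registered: int, first_day: date, last_day: date) -> str:
    """Classify one person-year of registration data (hypothetical schema)."""
    # Inclusive number of days between the first and last registered day.
    span = (last_day - first_day).days + 1
    if days_registered > span:
        raise ValueError("more registered days than the first-to-last span allows")
    if days_registered == span:
        # No missing days: registration ran continuously from first_day to
        # last_day, so both boundary dates are known exactly.
        return "continuous"
    # Some days between first_day and last_day are unregistered, but the
    # summary alone does not say where they fall.
    return "gap of %d day(s), position unknown" % (span - days_registered)


if __name__ == "__main__":
    print(infer_registration(365, date(2019, 1, 1), date(2019, 12, 31)))
    print(infer_registration(300, date(2019, 1, 1), date(2019, 12, 31)))
```

In the "continuous" case, a first day later than 1 January or a last day earlier than 31 December pins down a (re-)registration or de-registration date exactly.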
If they de-register from Welsh NHS, they "disappear" for us until they re-register with a Welsh GP (e.g., they move to England for Uni, then come back). We do not have records outside of Wales. Yes, a person can experience A and B multiple times (so for common codes, bias is less).
07.03.2026 09:14
But we have 15 years of data, so as a first go I ran a simulation that adds gaps (modelled on the gaps in the incomplete data) to complete data to estimate the bias
06.03.2026 14:53
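A rough sketch of the kind of simulation described in the post above, not the author's actual code. Every distribution and parameter here (event probabilities, uniform event times, one uniform-length gap per person) is an illustrative placeholder; in the real analysis the gaps would be modelled on those seen in the incomplete records. It starts from complete data, masks events that fall inside a coverage gap, and compares the proportion of people with both A and B recorded.

```python
import random


def simulate_gap_bias(n_people=10_000, years=15.0, max_gap_years=3.0, seed=1):
    """Return (p_complete, p_gapped, bias) for 'both A and B recorded'."""
    rng = random.Random(seed)
    n_both_complete = 0
    n_both_gapped = 0
    for _ in range(n_people):
        # Complete data: each event occurs with probability 0.3 (placeholder)
        # at a uniform random time in the follow-up window.
        t_a = rng.uniform(0, years) if rng.random() < 0.3 else None
        t_b = rng.uniform(0, years) if rng.random() < 0.3 else None
        # One coverage gap per person (placeholder gap model).
        gap_start = rng.uniform(0, years)
        gap_end = gap_start + rng.uniform(0, max_gap_years)

        def recorded(t):
            # An event is recorded only if it happened outside the gap.
            return t is not None and not (gap_start <= t < gap_end)

        if t_a is not None and t_b is not None:
            n_both_complete += 1
        if recorded(t_a) and recorded(t_b):
            n_both_gapped += 1
    p_complete = n_both_complete / n_people
    p_gapped = n_both_gapped / n_people
    # Bias of the gapped estimate relative to the complete-data value.
    return p_complete, p_gapped, p_gapped - p_complete


if __name__ == "__main__":
    print(simulate_gap_bias())
```

Because masking can only remove events, the gapped proportion can never exceed the complete-data one in this setup; a more realistic version would compare an association measure between A and B rather than a simple co-occurrence proportion.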
Yeah, that's the issue; we have Welsh NHS data but they might (say) move temporarily to England for University
06.03.2026 14:51
That's very generous. OK to email?
06.03.2026 12:48
Yes, clinical codes (so "GP reports that X talked about Y")
06.03.2026 11:50
> but I'm less familiar with this area, so I'm not sure of the "best" approaches. I've done matching of time periods, restriction to complete data (which loses data and, for more fatal diseases, causes a different kind of bias!), and tried to estimate bias via simulation. If it helps, this is population-level data.
06.03.2026 09:24
To my health researcher colleagues: when you're looking at routinely collected data (over many years) and you're interested in the relationship b/w two kinds of events (say, diagnosis for diseases A and B), how do you best deal with time gaps (eg people moving)? Obviously these introduce time bias >
06.03.2026 09:24
— !
05.03.2026 20:36
antirealists intensify
05.03.2026 13:05
I'm not linking to yet another White Man On The Internet's blog post, but suffice to say:
No, LLM extruders *can't* do ethnographic research or social semiotic analysis
What they *have done* is flood the zone with slop that makes social science research & analysis infinitely harder than before
03.03.2026 14:38
Ring ring! Open the door, let me through! The Fish Doorbell
Every year in spring, a lovely side of humanity shows in the form of a fish doorbell in Utrecht, Netherlands, but what is it and what does it do?
Anna Schurer
02.03.2026 06:17
Discussion paper meetings
Next RSS discussion meeting has picked a great topic:
'Regression by composition' by Daniel Farewell, Rhian Daniel et al.
Tuesday, 24 March 2026; Imperial College London, and online.
Time: 4pm to 6pm (UK time)
Introductory DeMO 2:15pm to 3:15pm
Registration here: rss.org.uk/training-eve...
02.03.2026 12:47
Glob-glob! Absolutely magical sea anemone larva (about 7 mm) from The Lombok Strait, Indonesia 🇮🇩
01.03.2026 13:11
Update: I spent seven tries just to get it to change an uppercase delta to a lowercase one. It kept "fixing" the problem (still uppercase!). After seven tries, it finally just substituted a "hand-crafted" SVG path (!). (I know this is easy to fix myself; wanted to see if I could do it with prompts)
28.02.2026 21:19
every statement from schumer is like the villager in zelda who's 300 feet away from an evil towering castle going "ganon? never heard of him. boy I wish someone would round up 10 chickens to put back in my coop"
28.02.2026 18:09
Yeah, given the extreme cost of the infrastructure just to maybe cut out some boring boilerplate coding? Doesn't seem worth it.
28.02.2026 18:35
Yeah, I really can't stress enough that AI code assistants royally fuck up statistical analyses. And they do it with absolute confidence.
28.02.2026 17:38
I'm specifically testing claims of people I know re: LLMs using a test case that I've already coded myself. I don't think it has streamlined the work, really. It does boilerplate fine, but then I have to check any serious code. Not a net gain (but a change in the nature of the work)
28.02.2026 16:49
Trying to one-shot this app has been a disaster - there are a bunch of interconnected animations/interface dependencies between parts of the app that it just screws up. Building it one logical step at a time seems to work better. I've tried it three times in various ways.
28.02.2026 16:43
I *think* so - but all credit to Travis on that, it was his vision.
28.02.2026 12:10
it was my pleasure to receive and edit a nice paper. Travis Proulx and I are pleased at how that whole special issue turned out.
28.02.2026 12:05
Also, sometimes a change will completely hose the app. The order in which I ask it to add features is very important (in ways that seem logical given that it doesn't...know...anything, but still: if you thought this was anything like asking a person to complete a task, you wouldn't understand why).
28.02.2026 09:48
> d3.js code with animations, but "vibe coding" scientific apps is likely to be very dangerous. I would need to completely gut this app of the backend stats functions and substitute my own.
28.02.2026 09:22
And, in fact, in my own code I use this fast *approximation* for animation speed before replacing it with the correct answer after the animation stops. But here it has simply used it as the "correct" answer. Of course, this doesn't mean that using Claude might not have saved me time writing >
28.02.2026 09:22
In case you don't immediately see the issue, this is in the context of power for a t test, and it simply calls the normal CDF in the function that is supposed to be the noncentral t CDF. If I didn't know I might be fooled because this will, indeed, give close-to-correct power values for large N.
28.02.2026 09:22
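To make the point in the post above concrete, here is a small standard-library sketch (in Python, not the JavaScript from the screenshot) comparing a numerically integrated noncentral t CDF with the normal-CDF shortcut. It is my own illustration, not the generated code or the author's implementation: all function names are hypothetical, the one-sample two-sided test is an assumed setting, and the Simpson integration is only meant to be accurate enough for illustration. It uses the representation T = (Z + delta) / S with S = sqrt(V / df) and V chi-square distributed, so P(T <= t) is the average of Phi(t*S - delta) over S.

```python
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def _scale_density(s, df):
    """Density of S = sqrt(V / df), where V ~ chi-square(df)."""
    if s == 0.0:
        return math.sqrt(2.0 / math.pi) if df == 1 else 0.0
    log_g = (math.log(2.0) + 0.5 * df * math.log(df / 2.0)
             - math.lgamma(0.5 * df)
             + (df - 1) * math.log(s) - 0.5 * df * s * s)
    return math.exp(log_g)


def nct_cdf(t, df, delta, steps=1000):
    """P(T <= t) for noncentral t, by Simpson's rule over S (steps must be even)."""
    upper = 1.0 + 10.0 / math.sqrt(df)  # density of S is negligible beyond this
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * norm_cdf(t * s - delta) * _scale_density(s, df)
    return total * h / 3.0


def normal_shortcut_cdf(t, df, delta):
    """The shortcut: ignore df and treat T as Normal(delta, 1)."""
    return norm_cdf(t - delta)


def t_quantile(p, df):
    """Central t quantile by bisection on nct_cdf with delta = 0."""
    lo, hi = -1000.0, 1000.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if nct_cdf(mid, df, 0.0) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def power_one_sample_t(n, d, alpha=0.05, cdf=nct_cdf):
    """Two-sided one-sample t-test power for effect size d and sample size n."""
    df = n - 1
    delta = d * math.sqrt(n)  # noncentrality parameter
    t_crit = t_quantile(1.0 - alpha / 2.0, df)
    return 1.0 - cdf(t_crit, df, delta) + cdf(-t_crit, df, delta)


if __name__ == "__main__":
    for n, d in [(10, 0.5), (400, 0.1)]:
        exact = power_one_sample_t(n, d)
        approx = power_one_sample_t(n, d, cdf=normal_shortcut_cdf)
        print(n, d, round(exact, 4), round(approx, 4))
```

With a small sample the two power values differ noticeably, while for large N they nearly coincide, which is why the shortcut can look plausible if only large-N cases are checked.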
Javascript code for a noncentral t cdf produced by Claude. It is simply a call to the normal CDF, which is not correct (though will be a decent approximation with large N).
I've been testing Claude to see how well it can "vibe out" a stat. power app that I've already coded completely myself - so I know what I want. It mostly gets things right with animations (those are easily verifiable) but looking into the backend stats code is nightmare inducing (see pic).
28.02.2026 09:22
This is a fast-changing space! I moved from R+shiny to R+custom javascript and now I'm thinking for most purposes, R+Quarto+webr is the way to go, plus whatever distributed Rstudio installation you have (I've used posit cloud, but that's quite expensive now)
27.02.2026 13:30
don't skip leg day
27.02.2026 08:16