Stephen Burgess

@stevesphd.bsky.social

Medical statistician, work with genetic data to disentangle causation from correlation. Author of book on Mendelian randomization.

661 Followers  |  167 Following  |  86 Posts  |  Joined: 16.11.2024  |  2.0517

Latest posts by stevesphd.bsky.social on Bluesky

Thanks to @amymariemason and @BarWoolf for working on this together, and to @ChatGPTapp for helping to get the ball rolling with the writing, even if we overruled you in many places!

22.07.2025 11:18 - 👍 1    🔁 0    💬 1    📌 0

...but the initial text needed a lot of work - it struggled to synthesize the ideas, and the structure was not great. Maybe a better prompt? Some of the ideas we seeded in the prompt ended up less important in the eventual submission.

22.07.2025 11:18 - 👍 2    🔁 0    💬 1    📌 0

But it did cut down the overall writing time - I would estimate by around 50%. This is a topic that has been in my head for several years, and I don't think I would have got round to writing it otherwise. It was much better at writing the abstract and cover letter...

22.07.2025 11:18 - 👍 0    🔁 0    💬 1    📌 0

To be honest, I was a bit disappointed with the draft - in particular, the simulation study was incorrect and quite limited in scope (we hoped it would do well with this). We ended up re-writing large chunks of text, although some vestiges remain in the final submission.

22.07.2025 11:18 - 👍 0    🔁 0    💬 1    📌 0

A subtext to this work is that it is the first manuscript I've written where the first draft was generated by ChatGPT - we used the Deep Research function. The AI prompt is in the appendix, and we will share the full machine-written draft (pre-edits) with the community.

22.07.2025 11:18 - 👍 0    🔁 0    💬 1    📌 0

...as for context stratification, the subgroups differ based on other factors by definition - as they come from different centres. In conclusion, the idea may work in some cases, but even when it does, it is somewhat limited in scope and interpretation.

22.07.2025 11:18 - 👍 0    🔁 0    💬 1    📌 0

Additionally, differences in centre-stratified estimates may occur for a variety of reasons, including non-linearity, but also other differences between centres. The same is true for other stratification methods, but potentially worse for context stratification...

22.07.2025 11:18 - 👍 0    🔁 0    💬 1    📌 0
Figure 2

However, the separation between mean exposure levels in centres is far less than between subgroups defined by the residual-based or doubly-ranked method, allowing us to consider non-linearity over a much narrower range.

22.07.2025 11:18 - 👍 0    🔁 0    💬 1    📌 0
Figure 1 right panel

We can perform MR analyses in each centre, obtaining context-stratified MR estimates that can be analysed using a heterogeneity test or trend test (i.e. meta-regression).

22.07.2025 11:18 - 👍 0    🔁 0    💬 1    📌 0
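
The heterogeneity test and trend test mentioned in the post above can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance-weighted setup: Cochran's Q for heterogeneity, and a weighted regression of the centre-specific estimates on mean exposure per centre for the trend. The function name, argument names, and the numbers in the usage example are all hypothetical and are not taken from the pre-print.

```python
from math import erfc, sqrt


def heterogeneity_and_trend(estimates, std_errors, centre_means):
    """Cochran's Q and a fixed-effect meta-regression slope.

    estimates    : centre-specific MR estimates (hypothetical inputs)
    std_errors   : their standard errors
    centre_means : mean exposure level in each centre (the moderator)
    """
    weights = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    total_w = sum(weights)

    # Inverse-variance-weighted pooled estimate across centres.
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_w

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    # Under homogeneity it is approximately chi-squared with k - 1 df.
    q_stat = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1

    # Weighted least-squares slope of estimates on centre mean exposure.
    mean_x = sum(w * m for w, m in zip(weights, centre_means)) / total_w
    sxx = sum(w * (m - mean_x) ** 2 for w, m in zip(weights, centre_means))
    sxy = sum(
        w * (m - mean_x) * (b - pooled)
        for w, m, b in zip(weights, centre_means, estimates)
    )
    slope = sxy / sxx
    slope_se = sqrt(1.0 / sxx)  # fixed-effect standard error of the slope
    z = slope / slope_se
    p_trend = erfc(abs(z) / sqrt(2.0))  # two-sided normal p-value

    return {
        "pooled": pooled,
        "q_stat": q_stat,
        "df": df,
        "slope": slope,
        "slope_se": slope_se,
        "p_trend": p_trend,
    }


if __name__ == "__main__":
    # Made-up estimates for four centres, for illustration only.
    print(
        heterogeneity_and_trend(
            estimates=[0.10, 0.14, 0.18, 0.22],
            std_errors=[0.05, 0.05, 0.05, 0.05],
            centre_means=[40.0, 45.0, 50.0, 55.0],
        )
    )
```

A random-effects meta-regression would widen the slope's standard error when centres differ for reasons other than exposure level, which the thread flags as a real possibility.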
Table 2 from manuscript

An alternative is to stratify on existing structure in the data, such as recruitment centres. For instance, in UK Biobank, average vitamin D levels differ across centres - higher in the south-west, lower in Scotland.

22.07.2025 11:18 - 👍 0    🔁 0    💬 1    📌 0

A naive approach is to stratify on the exposure directly. But this induces bias, as the exposure is a collider of the IV and exposure-outcome confounders. Alternative approaches can work (residual-based and doubly-ranked methods), but rely on untestable assumptions.

22.07.2025 11:18 - 👍 0    🔁 0    💬 1    📌 0
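
The collider argument in the post above can be seen in a small simulation. This is a sketch under assumed parameters (allele frequency 0.3, a linear exposure model, a median split), none of which come from the pre-print. The instrument G and the confounder U are generated independently, yet they become correlated once the sample is restricted to a stratum of the exposure X.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_collider(n=100_000, seed=1):
    """Return corr(G, U) overall and within the upper half of the exposure."""
    rng = random.Random(seed)
    g, u, x = [], [], []
    for _ in range(n):
        gi = (rng.random() < 0.3) + (rng.random() < 0.3)  # allele count 0/1/2
        ui = rng.gauss(0.0, 1.0)                          # unmeasured confounder
        xi = 0.5 * gi + ui + rng.gauss(0.0, 1.0)          # exposure: collider of G and U
        g.append(float(gi))
        u.append(ui)
        x.append(xi)

    cutoff = sorted(x)[n // 2]  # median split on the exposure itself
    high = [i for i in range(n) if x[i] > cutoff]

    overall = pearson(g, u)
    within = pearson([g[i] for i in high], [u[i] for i in high])
    return overall, within


if __name__ == "__main__":
    overall, within = simulate_collider()
    print(f"corr(G, U) overall:          {overall:+.3f}")
    print(f"corr(G, U) in upper stratum: {within:+.3f}")
```

In the full sample the correlation is close to zero. Within the stratum it is clearly negative, so G is no longer independent of the confounder there, which is a violation of the IV assumptions.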

Several existing approaches for non-linear Mendelian randomization stratify the population into subgroups and perform separate analyses in these subgroups. But constructing subgroups such that the IV assumptions hold in the subgroups is tricky.

22.07.2025 11:18 - 👍 0    🔁 0    💬 1    📌 0
Context-stratified Mendelian randomization: exploiting regional exposure variation to explore causal effect heterogeneity and non-linearity
Mendelian randomization (MR) uses genetic variants as instrumental variables to make causal claims. Standard MR approaches typically report a single population-averaged estimate, limiting their abilit...

New pre-print online: "Context-stratified Mendelian randomization: exploiting regional exposure variation to explore causal effect heterogeneity and non-linearity", arxiv.org/abs/2507.11088. We propose an alternative approach to assess non-linearity in Mendelian randomization.

22.07.2025 11:18 - 👍 5    🔁 1    💬 1    📌 0

Thanks to Nasir and Eduardo for leading this work, and to @bar_woolf for contributing - was great to think together about the value of these multiverse analyses and how to interpret!

18.07.2025 11:46 - 👍 1    🔁 0    💬 0    📌 0

Research like this cannot indicate the reliability of findings, but it can indicate the consistency and sensitivity of findings, and potentially highlight covariates whose adjustment has a big effect on estimates - meaning we need to be confident in our adjustment choice for that covariate.

18.07.2025 11:46 - 👍 1    🔁 0    💬 1    📌 0

Worse still is when a Janus effect is combined with p-hacking. A Janus effect means that it is often possible to find a plausible-sounding set of covariates that leads to whatever estimate you want - whether positive, null, or negative.

18.07.2025 11:46 - 👍 0    🔁 0    💬 1    📌 0

If associations are consistent in direction, this gives confidence in periodontitis as a risk factor - but not absolute proof. Similarly, if inconsistent, this does not rule out periodontitis as a risk factor - but it means any causal claim is sensitive to our assumptions.

18.07.2025 11:46 - 👍 0    🔁 0    💬 1    📌 0

Moderate periodontitis is consistently negatively associated with cognitive function, but its association with cardiovascular disease depends on the selection of covariates (mostly positive, but some negative - Janus effect). How to interpret this?

18.07.2025 11:46 - 👍 0    🔁 0    💬 1    📌 0
Post image

Plots below represent the association of moderate periodontitis with cardiovascular disease (left) and cognitive function (right) for 393,216 different covariate adjustment options. The presence of positive and negative estimates is sometimes known as a Janus effect.

18.07.2025 11:46 - 👍 0    🔁 0    💬 1    📌 0
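
To make the idea of many covariate adjustment options concrete, here is a toy multiverse analysis. It is a sketch only: the data are simulated, the covariate names `c1` to `c3` are hypothetical, and a plain least-squares fit stands in for whatever models the paper actually used. Each subset of covariates gives one specification, and the exposure coefficient is recorded for each. In this toy setup `c1` is a true confounder, so the sign of the estimate flips depending on whether it is adjusted for, which is a small Janus effect.

```python
import random
from itertools import combinations


def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    n = len(y)
    cols = [[1.0] * n] + columns
    k = len(cols)
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(ci[r] * cj[r] for r in range(n)) for cj in cols]
        + [sum(ci[r] * y[r] for r in range(n))]
        for ci in cols
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(a[r][p]))
        a[p], a[pivot] = a[pivot], a[p]
        for r in range(k):
            if r != p:
                f = a[r][p] / a[p][p]
                a[r] = [vr - f * vp for vr, vp in zip(a[r], a[p])]
    return [a[i][k] / a[i][i] for i in range(k)]


def make_toy_data(n=2000, seed=7):
    """Simulated data: c1 confounds the exposure-outcome association."""
    rng = random.Random(seed)
    c1 = [rng.gauss(0, 1) for _ in range(n)]
    c2 = [rng.gauss(0, 1) for _ in range(n)]  # irrelevant covariate
    c3 = [rng.gauss(0, 1) for _ in range(n)]  # irrelevant covariate
    exposure = [c + rng.gauss(0, 1) for c in c1]
    outcome = [-0.3 * x + c + rng.gauss(0, 1) for x, c in zip(exposure, c1)]
    return exposure, outcome, {"c1": c1, "c2": c2, "c3": c3}


def multiverse(exposure, outcome, covariates):
    """Exposure coefficient for every subset of the candidate covariates."""
    names = sorted(covariates)
    results = {}
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            beta = ols([exposure] + [covariates[c] for c in subset], outcome)
            results[subset] = beta[1]  # index 0 is the intercept
    return results


if __name__ == "__main__":
    exposure, outcome, covariates = make_toy_data()
    for subset, beta in multiverse(exposure, outcome, covariates).items():
        print(f"adjusted for {subset or '(nothing)'}: {beta:+.3f}")
```

With k include-or-exclude covariates there are 2^k specifications, so the count grows quickly. The three covariates here give only eight.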

Dental epidemiology is full of claims of the harmful effect of oral health on various diseases based on observational research. But how reliable are these claims when they depend on covariate adjustment to mitigate confounding?

18.07.2025 11:46 - 👍 0    🔁 0    💬 1    📌 0

New paper led by Nasir Bashir, "Periodontitis and Systemic Disease: The Impact of Covariate Selection", published at @JDentRes: journals.sagepub.com/doi/10.1177/.... We investigate how disease associations vary with covariate adjustment.

18.07.2025 11:46 - 👍 6    🔁 0    💬 1    📌 0

Previously, much of the discipline involved "shoe-leather epidemiology"; now epidemiology can be practised without leaving the keyboard. How do we marry data analysis with applied experience? It requires a partnership that typically goes beyond the ability of any one individual.

15.07.2025 13:52 - 👍 0    🔁 0    💬 0    📌 0
Epidemiology: a partnership of technical expertise and clinical insight
Like all medical fields, epidemiology is a discipline that can be practiced effectively only in partnership with others. While data analysis may seem like a solitary pursuit, epidemiological datasets ...

New publication at @BMJMedicine, "Epidemiology: a partnership of technical expertise and clinical insight", co-authored with Debbie Lawlor: bmjmedicine.bmj.com/content/4/1/.... This essay explores what it is to be an epidemiologist in today's world. Comments welcome!

15.07.2025 13:52 - 👍 5    🔁 1    💬 1    📌 0

Thanks again to Julie for leading this work as a visiting PhD student from @HKU_SPH, and to Toinét Cronjé and Mary Schooling for providing advice and support!

10.06.2025 08:55 - 👍 0    🔁 0    💬 0    📌 0

More specifically, it may not be possible to perform reliable MR investigations to learn about the effects of increasing ketone metabolism. Some epidemiological questions cannot be answered reliably using the MR framework.

10.06.2025 08:55 - 👍 2    🔁 0    💬 1    📌 0

The wider contribution of this paper is to provide a stepwise framework for selecting instruments for a complex trait, based on biological considerations, that is transparent and can be pre-specified. In practice, analysts should typically present results with different instrument choices.

10.06.2025 08:55 - 👍 1    🔁 1    💬 1    📌 0

Maybe we were too strict? Maybe the positive controls are not true consequences of ketone metabolism? In that case, we may have two gene regions with variants that are potential instruments. We can never be 100% sure about instrumental variable assumptions in any case.

10.06.2025 08:55 - 👍 0    🔁 0    💬 1    📌 0

But one of these gene regions (SLC2A4) had pleiotropic associations with blood pressure, and the other two (HMGCS2, OXCT1) did not have clear and consistent associations with positive control variables.

10.06.2025 08:55 - 👍 0    🔁 0    💬 1    📌 0

We searched for such variants, and... struggled to find any. We found several variants associated with acetone at p < 5×10⁻⁶, four of which were in relevant gene regions. Three of the four variants had concordant associations with all three primary ketone bodies, ...

10.06.2025 08:55 - 👍 0    🔁 0    💬 1    📌 0

We set four criteria that we believe a plausible instrument for ketone metabolism should satisfy: (1) location in a relevant gene region, (2) association with all three primary ketone bodies, (3) no pleiotropic associations, (4) associations with positive control variables.

10.06.2025 08:55 - 👍 0    🔁 0    💬 1    📌 0
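
The four criteria in the post above lend themselves to a pre-specified, stepwise filter. The sketch below is one possible encoding, not the paper's implementation: the class, field names, and candidate variants are all hypothetical placeholders. Reporting which criteria each candidate fails keeps the selection transparent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateVariant:
    """Hypothetical record of the evidence gathered for one variant."""
    label: str
    in_relevant_gene_region: bool            # criterion 1
    associated_with_all_ketone_bodies: bool  # criterion 2
    has_pleiotropic_associations: bool       # criterion 3 requires this to be False
    associated_with_positive_controls: bool  # criterion 4


def failed_criteria(variant):
    """Return the numbers (1 to 4) of the criteria the variant does not meet."""
    checks = [
        (1, variant.in_relevant_gene_region),
        (2, variant.associated_with_all_ketone_bodies),
        (3, not variant.has_pleiotropic_associations),
        (4, variant.associated_with_positive_controls),
    ]
    return [number for number, passed in checks if not passed]


def select_instruments(candidates):
    """Split candidates into accepted labels and a per-variant failure report."""
    accepted, rejected = [], {}
    for variant in candidates:
        failures = failed_criteria(variant)
        if failures:
            rejected[variant.label] = failures
        else:
            accepted.append(variant.label)
    return accepted, rejected


if __name__ == "__main__":
    # Placeholder candidates for illustration; not real variants or results.
    candidates = [
        CandidateVariant("variant_A", True, True, False, True),
        CandidateVariant("variant_B", True, True, True, True),
        CandidateVariant("variant_C", True, False, False, False),
    ]
    accepted, rejected = select_instruments(candidates)
    print("accepted:", accepted)
    print("rejected:", rejected)
```

Relaxing a criterion, as the thread considers for the positive controls, amounts to dropping one entry from `checks` and rerunning, which makes it straightforward to present results under different instrument choices.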

@stevesphd is following 20 prominent accounts