Stephen Burgess

@stevesphd.bsky.social

Medical statistician, work with genetic data to disentangle causation from correlation. Author of book on Mendelian randomization.

728 Followers  |  172 Following  |  186 Posts  |  Joined: 16.11.2024

Latest posts by stevesphd.bsky.social on Bluesky


Non-linear MR did not show evidence for non-linearity for most outcomes. Where it did, it never suggested a non-monotone relationship - no J-shaped or U-shaped findings for any outcomes.

16.02.2026 11:23 | 👍 1  🔁 0  💬 0  📌 0

We ran MR with alcohol as the exposure against a large number of outcomes. Most results supported harmful effects, particularly for neurologic and behavioural, circulatory, and liver outcomes. Potential protective effects were seen for migraines and urinary calculus.

16.02.2026 11:23 | 👍 0  🔁 0  💬 1  📌 0

Great to be involved in this publication: "Phenome-wide study on alcohol consumption provides genetic evidence for a causal association with multiple diseases and biomarkers" led by Nigussie Kassaw - doi.org/10.1016/j.nu....

16.02.2026 11:23 | 👍 3  🔁 2  💬 1  📌 0

Keir Starmer isn't a person? (The quote isn't specifically relevant to the topic, but he is quoted in the piece!)

16.02.2026 11:11 | 👍 1  🔁 0  💬 1  📌 0

Great to see the first paper from James' PhD out as a pre-print - thanks to Ash for providing lots of support and help! Look forward to receiving comments!

16.02.2026 11:08 | 👍 0  🔁 0  💬 0  📌 0

Using MSE, with lots of confounding (\rho>0.3), IV outperforms OLS at F lower than 10. With minimal confounding (\rho~0.1), the F threshold is higher. Using an F statistic to determine your analysis strategy is a bad idea in any case, but that's a story for another day.

16.02.2026 11:08 | 👍 0  🔁 0  💬 1  📌 0
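A rough way to see the MSE comparison described in this post is a small Monte Carlo experiment. The sketch below is not the preprint's simulation design: the sample size, the five equal-strength instruments, the normal errors, and the function name simulate_mse are all assumptions made for illustration. Several instruments are used because two-stage least squares with a single instrument has no finite variance, which would make a simulated MSE unstable.

```python
import numpy as np


def simulate_mse(f_target, rho, n=500, n_instruments=5, beta=0.0,
                 n_sims=2000, seed=0):
    """Monte Carlo MSE of OLS and two-stage least squares (2SLS).

    Illustrative assumptions: standard normal instruments and errors,
    equal-strength instruments, and rho = correlation between the
    exposure error and the outcome error (the confounding).
    """
    rng = np.random.default_rng(seed)
    # Per-instrument first-stage coefficient chosen so the expected
    # first-stage F statistic is roughly f_target (E[F] ~ 1 + n * pi^2).
    pi = np.sqrt(max(f_target - 1.0, 0.0) / n)
    first_stage = np.full(n_instruments, pi)
    ols_sq_err = np.empty(n_sims)
    iv_sq_err = np.empty(n_sims)
    for s in range(n_sims):
        z = rng.standard_normal((n, n_instruments))
        v = rng.standard_normal(n)  # exposure error
        # Outcome error correlated with the exposure error (confounding).
        e = rho * v + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
        x = z @ first_stage + v
        y = beta * x + e
        # OLS slope (all variables have mean zero, so no intercept).
        ols = (x @ y) / (x @ x)
        # 2SLS: project the exposure onto the instruments, then regress.
        x_hat = z @ np.linalg.solve(z.T @ z, z.T @ x)
        iv = (x_hat @ y) / (x_hat @ x)
        ols_sq_err[s] = (ols - beta) ** 2
        iv_sq_err[s] = (iv - beta) ** 2
    return ols_sq_err.mean(), iv_sq_err.mean()


if __name__ == "__main__":
    for rho in (0.1, 0.5):
        mse_ols, mse_iv = simulate_mse(f_target=5.0, rho=rho)
        print(f"rho={rho}: MSE(OLS)={mse_ols:.4f}, MSE(2SLS)={mse_iv:.4f}")
```

Varying f_target and rho shows how the strength at which the two estimators break even moves with the amount of confounding, which is the pattern the post describes.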

Relative bias is not the best criterion to focus on - it gives a simple rule, but not a helpful rule. Mean squared error or mean absolute error give more helpful rules, but these rules will depend (correctly!) on the degree of confounding between the exposure and outcome.

16.02.2026 11:08 | 👍 0  🔁 0  💬 1  📌 0

The F>10 threshold is independent of the amount of confounding between the exposure and outcome. But the choice between IV and OLS *should* be dependent on this - if there is lots of confounding, we would typically prefer IV. If no confounding, we prefer OLS.

16.02.2026 11:08 | 👍 0  🔁 0  💬 1  📌 0

This is particularly true in the asymptotic limit under standard weak instrument asymptotics, where the OLS estimate tends to a point, but the IV estimate tends to a distribution. Relative bias only cares about the behaviour of the average of this distribution.

16.02.2026 11:08 | 👍 0  🔁 0  💬 1  📌 0

The threshold comes from an expression for the relative bias of the IV estimate. But generally, we do not care about bias in isolation - we care about bias and variance together (e.g. in mean squared error). By focusing on bias, we implicitly say that IV is superior to OLS.

16.02.2026 11:08 | 👍 0  🔁 0  💬 1  📌 0
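For reference, the two quantities contrasted in this post can be written out as follows. The decomposition of mean squared error is standard. The 1/F approximation for the relative bias is the commonly quoted form from the weak instrument literature and is an assumption here: the post does not state the expression.

```latex
\[
\mathrm{MSE}(\hat\beta)
  = \mathbb{E}\big[(\hat\beta - \beta)^2\big]
  = \big(\mathbb{E}[\hat\beta] - \beta\big)^2 + \mathrm{Var}(\hat\beta)
\]
\[
\text{relative bias}
  = \frac{\mathbb{E}[\hat\beta_{\mathrm{IV}}] - \beta}
         {\mathbb{E}[\hat\beta_{\mathrm{OLS}}] - \beta}
  \approx \frac{1}{F}
\]
```

Under that approximation, F > 10 corresponds to a relative bias below about 10%, with no reference to the variance term in the first line.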

New preprint: "Revising the use of F-tests in weak instrument practice: point estimation and beyond" led by James Lane: papers.ssrn.com/sol3/papers..... The F>10 threshold for avoiding weak instrument bias has almost mythical status. But where does it come from? And does it make sense? In brief:

16.02.2026 11:08 | 👍 2  🔁 0  💬 1  📌 0

Feedback is welcome as ever! Thanks to @angzhou.bsky.social for leading this work, and to Haodong, Ash, @amymariemason.bsky.social, Emma, and Elina for input!

26.01.2026 12:18 | 👍 0  🔁 0  💬 0  📌 0

This makes a substantial difference to estimates for LDL-cholesterol, and a detectable but much smaller difference to estimates for BMI and vitamin D. The obvious limitation is this only holds for GxE interactions we can measure and account for.

26.01.2026 12:18 | 👍 0  🔁 0  💬 1  📌 0

If we subtract the GxE interaction from the exposure, then we can stratify on this corrected exposure value. This correction is only necessary in the stratification step; the estimation can proceed using the uncorrected exposure values.

26.01.2026 12:18 | 👍 0  🔁 0  💬 1  📌 0
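The correction described in this post can be sketched in a few lines. This is a minimal illustration and not the manuscript's implementation: the product form of the interaction term, the idea that its coefficient is already known or estimated, and all function names are assumptions. The stratification follows the usual description of the doubly-ranked scheme (rank by instrument into pre-strata, then rank by exposure within each pre-stratum), with ties broken by input order for simplicity.

```python
import numpy as np


def gxe_corrected_exposure(exposure, instrument, modifier, interaction_coef):
    """Subtract an assumed GxE term (coef * instrument * modifier)."""
    return exposure - interaction_coef * instrument * modifier


def doubly_ranked_strata(instrument, exposure, n_strata):
    """Assign stratum labels 0..n_strata-1 by the doubly-ranked scheme."""
    n = len(instrument)
    # First ranking: order individuals by instrument value.
    order = np.argsort(instrument, kind="stable")
    strata = np.empty(n, dtype=int)
    # Consecutive blocks of n_strata individuals form the pre-strata.
    for start in range(0, n, n_strata):
        block = order[start:start + n_strata]
        # Second ranking: order by exposure within the pre-stratum.
        within = np.argsort(exposure[block], kind="stable")
        strata[block[within]] = np.arange(len(block))
    return strata


def stratum_ratio_estimates(instrument, exposure, outcome, strata):
    """Ratio (Wald-type) estimate within each stratum."""
    estimates = []
    for k in np.unique(strata):
        mask = strata == k
        g = instrument[mask] - instrument[mask].mean()
        estimates.append((g @ outcome[mask]) / (g @ exposure[mask]))
    return np.array(estimates)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 10_000
    g = rng.binomial(2, 0.3, size=n).astype(float)   # hypothetical genotype
    env = rng.binomial(1, 0.5, size=n).astype(float)  # hypothetical modifier
    x = 0.3 * g + 0.2 * g * env + rng.standard_normal(n)
    y = 0.5 * x + rng.standard_normal(n)
    # Stratify on the corrected exposure...
    x_corr = gxe_corrected_exposure(x, g, env, interaction_coef=0.2)
    strata = doubly_ranked_strata(g, x_corr, n_strata=10)
    # ...but estimate with the uncorrected exposure values.
    print(stratum_ratio_estimates(g, x, y, strata))
```

Only the stratum assignment uses the corrected values, matching the post's point that the estimation step can proceed with the uncorrected exposure.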

For instance, genetic associations with 25(OH)D levels (a biomarker of vitamin D status) are larger in the summer and smaller in the winter, and genetic associations with several traits differ between men and women, and with socioeconomic markers.

26.01.2026 12:18 | 👍 0  🔁 0  💬 1  📌 0

However, this assumption can also be violated. Enter Ang's manuscript! Ang shows that if we can model the heterogeneity in the genetic effect on the exposure, then we can correct for this heterogeneity in the doubly-ranked method.

26.01.2026 12:18 | 👍 0  🔁 0  💬 1  📌 0

This is a strictly weaker assumption than the constant genetic effect assumption, in that it allows the magnitude of the genetic effect on the exposure to vary, but it still requires some degree of homogeneity in the genetic effect on the exposure.

26.01.2026 12:18 | 👍 0  🔁 0  💬 1  📌 0

We developed a second method (doubly-ranked method) which makes a strictly weaker assumption that the ordering of individuals' exposure values would be the same if their genetic instrument were fixed to take any value (rank preserving assumption).

26.01.2026 12:18 | 👍 0  🔁 0  💬 1  📌 0

While we assessed sensitivity to this assumption in the original methods paper, violations of this assumption in practice are stronger than we assessed, and realistic violations of the assumption can lead to substantial bias in practice.

26.01.2026 12:18 | 👍 0  🔁 0  💬 1  📌 0

Our original method for non-linear Mendelian randomization (residual-stratified method) made a strong and unrealistic assumption that the effect of genetic variants on the exposure is constant for all individuals in the dataset.

26.01.2026 12:18 | 👍 0  🔁 0  💬 1  📌 0

All statistical methods make assumptions (and those assumptions are inevitably always violated), but the extent to which they are violated and the impact of that violation on estimates is often unclear.

26.01.2026 12:18 | 👍 0  🔁 0  💬 1  📌 0
Correcting for effect modification in the doubly-ranked non-linear Mendelian randomization method
The doubly-ranked non-linear Mendelian randomization method can yield biased estimates when instrument strength varies across individuals due to gene-environment (GxE) interactions. We propose a simpl...

New pre-print: "Correcting for effect modification in the doubly-ranked non-linear Mendelian randomization method" led by Ang Zhou available at www.medrxiv.org/content/10.6.... Brief thread:

26.01.2026 12:18 | 👍 2  🔁 1  💬 1  📌 0

LD patterns can make it difficult to select optimal instrumental variables for Mendelian randomization studies. The latest @hggadvances.bsky.social article from @stevesphd.bsky.social & co evaluates the ability of four selection methods to increase instrument strength: bit.ly/4a0yXih #ASHG

22.01.2026 21:17 | 👍 5  🔁 3  💬 0  📌 0

Thanks to Benji and others (@amymariemason.bsky.social, Chin Yang, Hyunseung, Hannah, and Marcus) for working on this! Great to see this published!

21.01.2026 12:23 | 👍 1  🔁 0  💬 0  📌 0

We use these methods to estimate the effect of offspring smoking conditional on parental smoking behaviour - we saw some evidence for a direct effect of parental smoking status on offspring smoking status, although with wide confidence intervals in several methods.

21.01.2026 12:23 | 👍 1  🔁 0  💬 1  📌 0

We present various methods that can be used in this setting depending on the format of data available (individual-level or summarized), who you have data on (both parents or one parent), and the assumptions (is assortative mating likely?).

21.01.2026 12:23 | 👍 0  🔁 0  💬 1  📌 0

The idea of this work is not only to exploit randomness in whether you inherit a genetic variant, but also in whether you do not inherit a genetic variant from a parent. This enables not only the estimation of the effect of an exposure, but the direct effect of an exposure.

21.01.2026 12:23 | 👍 1  🔁 0  💬 1  📌 0

All models are wrong and all instruments are invalid, but randomness inherent in how genetic variants are inherited means that genetic variants are often plausible instruments, particularly in within-family settings.

21.01.2026 12:23 | 👍 1  🔁 0  💬 1  📌 0

New paper: "Extending the Use of Mendelian Randomisation With Non-Inherited Variants to Assess Socially Transmitted Parental Exposures Under Assortative Mating" published at Genetic Epidemiology and led by Benji Woolf: onlinelibrary.wiley.com/doi/10.1002/.... Brief thread:

21.01.2026 12:23 | 👍 5  🔁 3  💬 1  📌 0

In summary, there are different methods for variant selection in cis-MR - they may result in more precise estimates, but: 1) always benchmark against the lead-variant estimate, and 2) biological considerations generally trump statistical considerations - more relevant > more precise!

19.01.2026 12:24 | 👍 0  🔁 0  💬 0  📌 0
