Thanks to all of our SMaHT colleagues and especially to @sedlazeck.bsky.social who led the hackathon which spawned the prototype of this pipeline!
09.12.2025 18:06
@egatkinson.bsky.social
Population and statistical genomicist working to make genomics fully representative. Views are my own. (she/her)
MosaicSim offers a realistic, scalable approach for assessing detection limits, with immediate applications to large sequencing efforts including those within the SMaHT Network, which was the springboard for this work.
09.12.2025 18:06
A key (surprising) result was that ultra-high coverage (300×–450×) yields diminishing returns for mosaic variant detection. In many settings, 150× coverage performs comparably or better, highlighting opportunities for cost-effective study design.
09.12.2025 18:06
Using MosaicSim, we benchmarked DRAGEN and found strong VAF- and depth-dependent performance limits. Sensitivity decreases sharply at low VAF, especially in complex genomic regions.
09.12.2025 18:06
Detecting mosaic variants is challenging due to low VAFs and real sequencing noise. MosaicSim layers user-defined variants directly onto empirical WGS data, preserving true read-level properties while providing a controlled ground-truth set for benchmarking.
09.12.2025 18:06
We are pleased to share our new preprint introducing MosaicSim, a framework for generating realistic mosaic variants! Mosaic variants - mutations present in only a subset of cells - are crucial for development, disease, and cancer, but are notoriously hard to call.
www.biorxiv.org/content/10.6...
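As a rough illustration of why low-VAF variants are hard to call, here is a minimal sketch of an idealized read-sampling model: if a variant is present at a given VAF and the site is covered at a given depth, the number of variant-supporting reads is approximately binomial. This is not MosaicSim's method or the DRAGEN benchmark from the preprint. The function name, the `min_alt_reads` threshold, and the example values are assumptions for illustration, and the model ignores sequencing error, mapping artifacts, and caller filters.

```python
from math import comb


def prob_min_alt_reads(depth: int, vaf: float, min_alt_reads: int) -> float:
    """Probability of seeing at least `min_alt_reads` variant reads,
    assuming the variant read count follows Binomial(depth, vaf)."""
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    if depth < 0 or min_alt_reads < 0:
        raise ValueError("depth and min_alt_reads must be non-negative")
    # Sum the probability of seeing fewer reads than the threshold,
    # then take the complement.
    p_below = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_below


# Hypothetical settings: a 1% VAF variant, requiring at least 3 supporting reads.
for depth in (30, 150, 300, 450):
    print(depth, round(prob_min_alt_reads(depth, vaf=0.01, min_alt_reads=3), 3))
```

In this idealized model, more depth always helps. Real data also carry noise that the model leaves out, which is why benchmarking on empirical reads can give a different picture.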
A fun lab outing to the zoo ahead of conference season!
12.10.2025 02:27
So since we only include >0.1% MAF variants in this article we can't address ultrarare, but check out Supp Fig 3; when comparing ancestry-specific AFs many variants deviate from the 1:1 line. We plotted this on the log₁₀(AF) scale to help magnify the low-frequency range.
10.10.2025 15:32
To limit the noise from ultra-rare alleles we only looked at variants ≥0.1% MAF. Totally appreciate that's still quite low frequency, but even with that filter, we still saw the noted ancestry-specific frequency differences.
10.10.2025 15:01
Great point; we thought about that too! Pragati stratified by whether variants were monomorphic or not to capture at least that aspect, but you're right that the impact depends on where a variant sits on the SFS. Rare ones can show big fold-changes but small absolute shifts.
10.10.2025 14:58
Texas Children's/Baylor College of Medicine Researchers Create Groundbreaking Tool to Improve Accuracy of #GeneticTesting @egatkinson.bsky.social @bcmgenetics.bsky.social @bcmhouston.bsky.social #TCHResearchNews #TexasChildrens @natcomms.nature.com tinyurl.com/jj6kyrrv
06.10.2025 14:16
Thrilled to share our new @natcomms.nature.com paper on local ancestry informed allele frequencies in gnomAD, which are live now on the browser! Check out my stellar PhD student @pragskore.bsky.social's Bluetorial on how this brings finer detail to variant interpretation 🧬🖥️
06.10.2025 18:44
A project many years in the making, we're pleased to present our work on multi-ancestry meta-analysis across a boatload of traits in the UK Biobank: www.nature.com/articles/s41...
18.09.2025 17:25
Delighted to amplify my talented PhD student's work! Check it out for a great way to streamline and harmonize Tractor analyses.
13.09.2025 00:47
Thanks for the interest! The tutorial code is available to download as supplemental information of the paper, and has been deposited as a community workspace in the All of Us Researcher Workbench.
23.07.2025 15:05
In summary, we present a replicable training model that empowers early-career researchers - including and especially those new to computational genomics - to responsibly integrate large-scale biobank data into their research programs and teaching.
22.07.2025 16:36
Across years 1–3, outcomes that scholars reported as stemming directly from this training included:
• 17 conference presentations
• Multiple funded research grants
• Numerous genomics modules added in undergrad courses
• Sustained collaborations across institutions
During the summit, scholars used real short-read WGS data to:
• Prepare phenotypes & covariates
• Run GWAS via Hail
• Visualize results with PCA, Manhattan & QQ plots
• Manage compute costs
All in ~4 hours with no prior coding required.
Our training was part of the All of Us Biomedical Researcher Scholars Program through @bcmgenetics.bsky.social focused on mentoring early-stage faculty in genomic data science. The curriculum launches with an intensive Faculty Summit, where scholars get hands-on experience working with genomic data.
22.07.2025 16:36
Access to big genomic data is growing, but parallel access to skills needed to use it hasn't kept up.
We created an accessible, cloud-based genomic analysis training bootcamp using real All of Us data, Jupyter notebooks, and the Hail framework to lower the barrier for early-career researchers.
🚨 New perspective piece in @ajhgnews.bsky.social! 🚨
We developed a hands-on training resource for large-scale genomic data analysis in the All of Us Researcher Workbench, now published here:
Tractor-Mix builds on Tractor's strengths to detect ancestry-enriched signals while adding power and robust false-positive control for relatedness via a GRM. By modeling both admixture and relatedness, it overcomes key GWAS barriers and enables more accurate, representative genomic discovery.
09.06.2025 18:31
Tractor-Mix uses ancestry-specific genotypes as predictors, outputting ancestry-specific effect sizes and P values. We benchmark our new tool in simulations and apply it to multiple admixed cohorts (including UKBiobank and Mexico City Prospective Study), uncovering signals missed by standard GWAS.
09.06.2025 18:31
In this work, we introduce Tractor-Mix, a new GWAS method that extends Tractor to handle related admixed samples. It combines a mixed model framework (like GMMAT) with local ancestry-aware genotypes (like Tractor) in a 2 d.o.f. test.
09.06.2025 18:31
As biobanks and global cohorts grow, so does the inclusion of admixed individuals with close or cryptic relatedness. This introduces the statistical challenge of two interwoven sources of stratification: admixture and relatedness, which are rarely handled together.
09.06.2025 18:31
We previously developed Tractor, a local ancestry-aware GWAS method that's been widely used to uncover ancestry-enriched signals and refine genetic architecture in admixed populations. But Tractor (being a GLM) only works on unrelated samples, limiting its use in many real-world datasets.
09.06.2025 18:31
We're excited to introduce Tractor-Mix, our new method for GWAS in admixed cohorts with relatedness, led by the fantastic @doubletaotan.bsky.social! Read the full preprint here: www.medrxiv.org/content/10.1...
Thanks to all our amazing collaborators who helped make this work possible!
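To make "ancestry-specific genotypes" concrete, here is a minimal sketch of how one phased genotype could be split by local ancestry into per-ancestry allele counts, in the spirit of the predictors described in this thread. It is not the Tractor-Mix implementation. The function name, the ancestry labels, and the input layout (one allele and one local ancestry label per haplotype) are assumptions for illustration.

```python
from collections import Counter


def ancestry_specific_dosages(alleles, ancestries, labels):
    """Split one individual's phased genotype at one site by local ancestry.

    alleles:    alternate-allele indicator (0 or 1) for each haplotype
    ancestries: local ancestry label for each haplotype, in the same order
    labels:     all ancestry labels to report (hypothetical, e.g. "AFR", "EUR")
    """
    if len(alleles) != len(ancestries):
        raise ValueError("need one ancestry label per haplotype")
    dosage = Counter()
    haplotypes = Counter()
    for allele, ancestry in zip(alleles, ancestries):
        if ancestry not in labels:
            raise ValueError(f"unexpected ancestry label: {ancestry}")
        haplotypes[ancestry] += 1   # how many haplotypes carry this ancestry
        dosage[ancestry] += allele  # alternate alleles on that ancestry background
    return {
        label: {"dosage": dosage[label], "haplotypes": haplotypes[label]}
        for label in labels
    }


# One alternate allele, carried on the AFR-labeled haplotype.
result = ancestry_specific_dosages((1, 0), ("AFR", "EUR"), labels=("AFR", "EUR"))
print(result)
```

With two ancestries this gives two dosage columns per variant, one per ancestry background, so each can receive its own effect size estimate.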
Check out my stellar PhD student Pragati's talk on our work generating local ancestry informed frequency estimates in gnomAD as part of the prestigious Emerging Genomic Scientist Symposium next week! Congrats on being selected for this amazing event!
09.04.2025 15:57
I'm delighted to be part of this symposium, put on by University of Pennsylvania Perelman School of Medicine, and led by @bpasaniuc.bsky.social and @sarahtishkoff.bsky.social. See you in a few weeks! upenn.co1.qualtrics.com/jfe/form/SV_...
03.04.2025 14:48
Huge thanks to all our amazing LAGC collaborators! Special shoutout to Estela Bruxel and Diego Rovaris for leading this crucial work, and of course @janitzamontalvo.bsky.social and @giustilab.bsky.social for co-founding the LAGC and co-leading alongside myself. 💪
02.04.2025 15:24