@stephkoe.bsky.social
Former postdoc at Wageningen University with Thijs Ettema. Studying ancient evolutionary transitions in prokaryotes using phylogenomics and structural modeling. Looking for the next step in academic or translational research. he/him
#Archaea, #DPANN, #phylogenetic_reconciliation
New preprint online!
www.biorxiv.org/content/10.1...
Phylogenetic reconciliation supports a methanogenic ancestor of the Archaea and a derived origin for host-associated lineages https://www.biorxiv.org/content/10.1101/2025.11.11.687807v1
12.11.2025 23:32 — 👍 6 🔁 3 💬 0 📌 0
We are looking for a PhD student to work on an exciting plastid endosymbiosis in microbial eukaryotes. This position involves sampling, exciting microscopy such as CARDFISH, ExM and FIBSEM, single-cell transcriptomics and more. #protistsonsky 1/2
12.11.2025 09:50 — 👍 76 🔁 66 💬 2 📌 2
🧙‍♀️ Something is brewing in the WitChi cauldron…
After some excellent peer review feedback, a new update of WitChi is taking shape, refining how we detect and prune compositional bias in phylogenomic alignments.
Stay tuned for the next release!
🧙‍♀️ Can’t model it? Prune it!
github.com/stephkoest/w...
Congratttttttts!!!! So well deserved :)
23.10.2025 13:14 — 👍 1 🔁 0 💬 1 📌 0
Glad to share our paper out today @NatureEcoEvo: “Serial innovations by Asgard archaea shaped the DNA replication machinery of the early eukaryotic ancestor”. www.nature.com/articles/s41... #microsky #archaeasky
21.10.2025 15:05 — 👍 63 🔁 28 💬 5 📌 4#MicroSky #mevoSky
22.08.2025 09:30 — 👍 4 🔁 0 💬 0 📌 0
If you want to mess around with some motifs, check out:
github.com/stephkoest/E...
Never thought I’d do real E. coli research; so far I had used it only for cloning & protist snacks in my MSc 😅
But here I am simulating chromosomes & shuffling motifs.
Watching @loreoliv.bsky.social and the lab turn those predictions into real data was pure magic.✨
Grateful to be part of this team!
Folddisco finds similar (dis)continuous 3D motifs in large protein structure databases. Its efficient index enables fast uncharacterized active site annotation, protein conformational state analysis and PPI interface comparison. 1/9🧶🧬
📄 www.biorxiv.org/content/10.1...
🌐 search.foldseek.com/folddisco
It was a pleasure to work with you! 😊
28.07.2025 09:17 — 👍 2 🔁 0 💬 0 📌 0
Hey Felix, good question! Yeah it is different. In short: trimAl/ClipKit aim to remove uninformative sites. WitChi is a second step to remove misleading sites: those that can group unrelated taxa just because their sequence composition looks similar. Hope that helps!
21.07.2025 16:11 — 👍 3 🔁 0 💬 0 📌 0
📢 New preprint alert!
We used comparative genomics on 72,000+ bacterial genomes to uncover the genetic basis of microbial adaptation to multicellular hosts—plants and animals alike.
www.biorxiv.org/content/10.1...
Thanks, jolien! ;)
21.07.2025 06:26 — 👍 0 🔁 0 💬 0 📌 0
And of course the great work!
20.07.2025 17:02 — 👍 1 🔁 0 💬 0 📌 0
Thanks for that beautiful summary, kassi :)
20.07.2025 17:01 — 👍 1 🔁 0 💬 1 📌 0
Excited to share our work on WitChi! 🛠️🖥️
We tested it on the GTDB r220 archaeal supermatrix (5,869 taxa & 10,101 cols) removing 55% of sites in <2h.
The phylogeny showed several interesting groupings with overall improved branch support:
#phylogenetics #ArchaeaSky #MSA #opensource #MEvoSky #MicroSky
I'm happy to announce the latest release of the GlobDB, available at globdb.org.
The GlobDB is a database of "species dereplicated" microbial genomes, and as of release 226 contains twice as many species-representative genomes (306,260) as the latest GTDB release.
It was a fun project :) Thanks for the support!
20.07.2025 13:23 — 👍 1 🔁 0 💬 0 📌 0
9. TL;DR + link dump
WitChi is:
✔ Fast
✔ Interpretable
✔ Tree- and model-free
✔ Benchmark-validated
Designed to fix compositional bias at phylogenomic scale.
With: @kassipan.bsky.social, @danieltamarit.bsky.social, @ettema.bsky.social
💻 github.com/stephkoest/w...
📄 www.biorxiv.org/content/10.1...
8.
GTDB r220 case study (led by @kassipan.bsky.social)
Applied WitChi to the archaeal GTDB r220 supermatrix:
• 5,869 taxa
• 55% of columns pruned
• Biased taxa: 95.1% → 2.3%
• Runtime: <2h on 4 cores
→ Known clades recovered — without using very complex C60 or CAT models
7.
Use witchi test to quantify bias per taxon:
• χ² scores
• Empirical p-values (via permutations)
• Z-scores to see how far taxa deviate from expectation
→ Great for screening MSAs or comparing compositional distortion across datasets.
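To make the three statistics above concrete, here is a rough sketch in plain Python. It is not WitChi's code and does not reproduce the `witchi test` interface. The function names (`chi2_scores`, `shuffle_within_columns`, `bias_test`) are made up for illustration, the alignment is assumed to be gap-free with equal-length sequences, and shuffling residues across taxa within each column is only one possible reading of "permutations".

```python
import random
import statistics
from collections import Counter


def chi2_scores(rows):
    """Chi-square of each row's residue counts against the pooled composition."""
    counts = [Counter(r) for r in rows]
    pooled = sum(counts, Counter())
    total = sum(pooled.values())
    scores = []
    for obs in counts:
        n = sum(obs.values())
        scores.append(sum(
            (obs[a] - n * pooled[a] / total) ** 2 / (n * pooled[a] / total)
            for a in pooled
        ))
    return scores


def shuffle_within_columns(rows, rng):
    """Shuffle residues across taxa inside every column (assumed null scheme)."""
    cols = [list(col) for col in zip(*rows)]
    for col in cols:
        rng.shuffle(col)
    return ["".join(row) for row in zip(*cols)]


def bias_test(alignment, n_perm=200, seed=0):
    """Return {taxon: {"chi2", "p", "z"}} from a permutation null."""
    rng = random.Random(seed)
    taxa = list(alignment)
    rows = [alignment[t] for t in taxa]
    observed = chi2_scores(rows)
    null = [chi2_scores(shuffle_within_columns(rows, rng)) for _ in range(n_perm)]
    report = {}
    for i, taxon in enumerate(taxa):
        draws = [scores[i] for scores in null]
        mean, sd = statistics.fmean(draws), statistics.pstdev(draws)
        # Empirical p-value with the usual +1 correction.
        p = (1 + sum(d >= observed[i] for d in draws)) / (n_perm + 1)
        z = (observed[i] - mean) / sd if sd > 0 else 0.0
        report[taxon] = {"chi2": observed[i], "p": p, "z": z}
    return report
```

With a toy alignment in which one taxon consists only of `A`, that taxon should get a small empirical p-value and a much larger z-score than the others.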
6.
WitChi solves both problems:
🔹 Builds a null distribution using column permutations — no model, no tree
🔹 Recursively removes columns that distort the taxon-wise χ² profile
🎁 Bonus: 3 scoring strategies, including one capturing distribution-wide effects (Wasserstein)
⚡ Scales linearly with taxa
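The "recursively removes columns" idea can be sketched as a greedy loop that recomputes the χ² profile after every removal. This is a simplified stand-in, not WitChi's algorithm: it uses a single summed χ² score rather than any of the three scoring strategies mentioned above, it is not optimised for speed, and `total_chi2`, `prune`, `target` and `max_fraction` are hypothetical names. Sequences are assumed gap-free and of equal length.

```python
from collections import Counter


def total_chi2(rows):
    """Sum over taxa of chi-square against the pooled composition."""
    counts = [Counter(r) for r in rows]
    pooled = sum(counts, Counter())
    total = sum(pooled.values())
    return sum(
        (obs[a] - len(r) * pooled[a] / total) ** 2 / (len(r) * pooled[a] / total)
        for r, obs in zip(rows, counts)
        for a in pooled
    )


def prune(rows, target, max_fraction=0.5):
    """Return the indices of the columns kept after greedy pruning.

    Each round drops the column whose removal lowers the summed chi-square
    the most, then recomputes everything on the reduced alignment.
    """
    keep = list(range(len(rows[0])))

    def view(cols):
        return ["".join(r[j] for j in cols) for r in rows]

    budget = int(len(keep) * max_fraction)  # never remove more than this
    for _ in range(budget):
        current = total_chi2(view(keep))
        if current <= target:
            break
        best, i = min(
            (total_chi2(view(keep[:k] + keep[k + 1:])), k)
            for k in range(len(keep))
        )
        if best >= current:
            break  # no single removal helps any more
        del keep[i]
    return keep
```

For example, `prune(["ACGTAA", "ACGTAA", "ACGTCC"], target=1e-9)` keeps the first four columns and drops the two in which the third taxon deviates.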
5.
Classical χ² pruning trims biased columns once — fast, but naive.
→ As alignment composition shifts, Δχ² must be updated — few tools do this.
BMGE’s stationary-based algorithm prunes iteratively and works well, but it scales quadratically with taxa, so it is not feasible for medium-sized or large datasets.
4.
The problem:
χ² assumes taxa are independent and identically distributed samples.
In MSAs, they share history → correlated data.
So parametric χ² nulls are invalid.
Simulations help, but they need known models and trees — which bias distorts.
→ Slow, circular, rarely used.
3.
We often use χ² stats to detect bias — how much a taxon’s sequence composition deviates from expectation.
χ² pruning removes columns with strong bias signal.
But both steps rely on assumptions that don’t hold in real MSAs.
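For readers who have not met the statistic, a minimal sketch of a per-taxon compositional χ² score follows. It is a generic textbook version rather than WitChi's implementation. `taxon_chi2` and `GAP_CHARS` are hypothetical names, and treating `-` and `?` as missing data is an assumption.

```python
from collections import Counter

GAP_CHARS = set("-?")  # assumed missing-data symbols, skipped when counting


def taxon_chi2(alignment):
    """Return {taxon: chi-square score}.

    Observed counts are a taxon's residue counts; expected counts are the
    pooled alignment frequencies scaled to that taxon's ungapped length.
    """
    counts = {
        taxon: Counter(c for c in seq if c not in GAP_CHARS)
        for taxon, seq in alignment.items()
    }
    pooled = Counter()
    for obs in counts.values():
        pooled.update(obs)
    total = sum(pooled.values())

    scores = {}
    for taxon, obs in counts.items():
        n = sum(obs.values())
        if n == 0:  # all-gap sequence: nothing to score
            scores[taxon] = 0.0
            continue
        scores[taxon] = sum(
            (obs[res] - n * pooled[res] / total) ** 2 / (n * pooled[res] / total)
            for res in pooled
        )
    return scores
```

A taxon whose composition matches the pooled alignment scores 0, and the score grows as its composition drifts away from the rest.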
2.
What’s compositional bias?
When unrelated taxa convergently evolve similar sequence compositions (e.g. GC-rich, AT-rich), tree algorithms may group them by chemistry, not ancestry — a well-known artefact in deep phylogenies.
Fig modified from: doi.org/10.1007/978-...
1.
🧵 New preprint out!
WitChi: a fast, open-source Python tool to detect, quantify & prune compositional bias in MSAs.
Lightweight, tree-free, scalable to 5k+ taxa... so we applied it to the GTDB archaea MSA.
#ArchaeaSky #MEvoSky #MicroSky
🔗 doi.org/10.1101/2025...
💻 github.com/stephkoest/w...
Exciting new research! Matthias Horn & Lukas Helmlinger from #CeMESS uncover how chlamydiae thrive in social amoebae by skipping their extracellular stage.
Read their groundbreaking study in Current Biology (@currentbiology.bsky.social):
🔗 cemess.univie.ac.at/news/detail-...
Excited to share our first paper on the symbiosis between chlamydiae and social amoebae showing in detail the adaptations of endosymbionts to the social life style of their host! 🎉
09.07.2025 13:03 — 👍 32 🔁 15 💬 0 📌 0