Dongwook Kim

@dongwookkim.bsky.social

Developing fast and easy methods for #phylogenetics and #bioinformatics | PhD in Bioinformatics | Postdoc @ Comparative Genomics Lab, UNIL/SIB 🇨🇭 | Formerly @ Steinegger Lab, SNU 🇰🇷 | he/him

155 Followers  |  89 Following  |  4 Posts  |  Joined: 21.11.2024  |  1.8691

Latest posts by dongwookkim.bsky.social on Bluesky

Planetary microbiome structure and generalist-driven gene flow across disparate habitats Microbes are ubiquitous on Earth, forming microbiomes that sustain macroscopic life and biogeochemical cycles. Microbial dispersion, driven by natural processes and human activities, interconnects mic...

Our new preprint is out!
www.biorxiv.org/content/10.1...
In this study, we present the largest systematic analysis of microbiome structure and function, integrating 85K uniformly processed metagenomes from diverse habitats worldwide.
@podlesny.bsky.social @jonas-bio.bsky.social @borklab.bsky.social

21.07.2025 11:56 | 👍 28    🔁 17    💬 1    📌 3

OrthoFinder just dropped a major update

It's faster, more accurate, and ready for thousands of genomes

Let's break it down (1/10)

github.com/OrthoFinder/...
www.biorxiv.org/content/10.1...

16.07.2025 17:51 | 👍 126    🔁 71    💬 1    📌 1

Folddisco finds similar (dis)continuous 3D motifs in large protein structure databases. Its efficient index enables fast uncharacterized active site annotation, protein conformational state analysis and PPI interface comparison. 1/9 🧶🧬
📄 www.biorxiv.org/content/10.1...
🌐 search.foldseek.com/folddisco

07.07.2025 08:21 | 👍 148    🔁 70    💬 8    📌 3
A general substitution matrix for structural phylogenetics. Abstract. Sequence-based maximum likelihood (ML) phylogenetics is a widely used method for inferring evolutionary relationships, which has illuminated the

New paper from Sriram Garg in my group. We introduce a general substitution matrix for structural phylogenetics. I think this is a big deal, so read on below if you think deep history is important. academic.oup.com/mbe/advance-...

11.06.2025 14:01 | 👍 93    🔁 52    💬 3    📌 2

This work was done by talented @sukhwanpark.bsky.social and me, supervised by amazing @martinsteinegger.bsky.social !

Try Unicore now 👉 conda install -c bioconda unicore
Code and tutorial: 🌐 github.com/steineggerlab/unicore
Manuscript: 🌐 doi.org/10.1093/gbe/evaf109

03.06.2025 06:54 | 👍 5    🔁 0    💬 0    📌 0

Unicore is fast, accurate, and universal. Unicore reconstructed consistent phylogenies of bacterial and fungal species while scaling linearly with input size. Unicore also works with any given set of taxa, providing a scalable and universal method for structure-based phylogeny. 🧵3/n

03.06.2025 06:54 | 👍 6    🔁 0    💬 1    📌 0

With Unicore, we identified 13 structural core genes from 166 species across the Tree of Life, where 8 of them could only be defined using structures. Projected on the Tree of Life reconstructed with Unicore, you can see the universally conserved structure of one of the structural core genes. 🧵2/n

03.06.2025 06:54 | 👍 5    🔁 0    💬 1    📌 0

Unicore is now published in GBE 🚀
Unicore rapidly identifies structural single-copy core genes from input species proteomes for phylogenetic analysis. Powered by Foldseek and ProstT5, Unicore enables linear-scale structure-based phylogeny of any given set of taxa. 🧵1/n
📃 doi.org/10.1093/gbe/evaf109

03.06.2025 06:54 | 👍 68    🔁 31    💬 3    📌 2

AFESM: a metagenomic guide through the protein structure universe! We clustered 821M structures (AFDB & ESMatlas) into 5.12M groups, revealing biome-specific groups, only 1 new fold even after AlphaFold2 re-prediction & many novel domain combos. 🧵
🌐 afesm.foldseek.com
📄 www.biorxiv.org/content/10.1...

27.04.2025 00:13 | 👍 141    🔁 71    💬 4    📌 4

Visit our posters at #RECOMB2025 for:

Structural: MSAs, Virus DB, Core Genes, Motif Discovery, Multimer Clustering & Search, pLM Foldseek, Environmental analysis

Metagenomics: Classification & Metabuli App

GPU-based & RNA search, Proteome clustering, Novel Ribozyme discovery

& get Marv stickers!

25.04.2025 07:45 | 👍 63    🔁 19    💬 2    📌 4
IQ-TREE 3: Phylogenomic Inference Software using Complex Evolutionary Models

Not really my announcement to make--I am but a lesser co-author--but IQ-TREE 3 has just been released!

(Most credit to Minh Bui and @roblanfear.bsky.social and their labs)

ecoevorxiv.org/repository/v...

10.04.2025 14:13 | 👍 179    🔁 96    💬 2    📌 6

🚀 #AlphaFold Database update

AlphaFold DB now integrates The Encyclopedia of Domains (TED) – a resource designed to systematically identify & classify structural domains within AlphaFold-predicted protein structures.

www.ebi.ac.uk/about/news/u...

@pdbeurope.bsky.social

03.03.2025 16:33 | 👍 118    🔁 44    💬 1    📌 2

The PAN-GO paper is a remarkable milestone. It not only provides the most comprehensive picture of human gene function to date, but also carefully maps this knowledge across the tree of life! Congratulations @marcfeuermann.bsky.social, Pascale Gaudet & collaborators!

www.sib.swiss/news/sib-hel...

26.02.2025 22:37 | 👍 16    🔁 12    💬 0    📌 0

In our latest review, we explore 12 deep-learning tools for metagenomic analysis, covering their strengths, limitations, and key applications. We hope it serves as both a resource and inspiration for new ways to analyze metagenomic data. Great work by Eli Levy Karin!
📄 doi.org/10.1093/nsr/...

22.02.2025 05:47 | 👍 106    🔁 44    💬 2    📌 1
FastOMA retains OMA's high precision and even improves upon it in terms of recall, positioning it on the Pareto frontier of orthology inference methods.
FastOMA is not only fast but also accurate. a) QfO benchmark: agreement with the SwissTree reference phylogeny, which covers manually curated gene trees. The error bars indicate 95% confidence intervals, comparing FastOMA with EnsemblCompara, Domainoid, OrthoMCL, OrthoInspector, SonicParanoid, PANTHER, OrthoFinder, Hieranoid and the OMA family, including OMA pairs, OMA groups and OMA GETHOGs (graph-based efficient technique for HOGs).

c) A computation time comparison of FastOMA and state-of-the-art alternatives.
https://www.nature.com/articles/s41592-024-02552-8

FastOMA is out now in Nature Methods 🎉: nature.com/articles/s41592-024-02552-8 A new orthology inference algorithm that scales linearly and is highly accurate. FastOMA can process all >2000 eukaryotic UniProt ref proteomes in <24 hours 🚀. Try it out: github.com/DessimozLab/fastoma @dessimoz.bsky.social

03.01.2025 14:14 | 👍 35    🔁 18    💬 1    📌 0

Unicore identifies single-copy protein structures across genomes using Foldseek, bypassing slow structure predictions by utilizing 3Di predictions from ProstT5, enabling rapid phylogenetic inference at the tree-of-life scale. 1/n
📄 www.biorxiv.org/content/10.1...
💾 github.com/steineggerla...

23.12.2024 16:39 | 👍 121    🔁 57    💬 2    📌 3

Unicore enables scalable and accurate phylogenetic reconstruction with structural core genes https://www.biorxiv.org/content/10.1101/2024.12.22.629535v1

23.12.2024 03:51 | 👍 5    🔁 3    💬 0    📌 0

Scientists, academics, researchers: We're excited to share that @altmetric.com is now tracking mentions of your research on Bluesky! 🧪

03.12.2024 14:10 | 👍 29990    🔁 5096    💬 466    📌 281

South Korean citizens helped lawmakers scale the National Assembly walls so they could bypass military barricades and vote against martial law.

03.12.2024 17:15 | 👍 13733    🔁 3178    💬 82    📌 431
bioRxiv expands on Mastodon and Bluesky bioRxiv - the preprint server for biology, operated by Cold Spring Harbor Laboratory, a research and educational institution

Reminder for newcomers that bioRxiv has Bluesky accounts in every subject category - great way to keep up (please re-skeet) connect.biorxiv.org/news/2023/09...

10.11.2024 13:40 | 👍 363    🔁 289    💬 6    📌 6

Interested in bioinformatics method development for proteins, structures or metagenomic analysis? Please check out my lab's starter pack!
🔗 go.bsky.app/VJhXcSs

28.11.2024 12:36 | 👍 56    🔁 11    💬 3    📌 0
Promotional logo for MMseqs2 16, with the MMseqs2 Rocket mascot as a smartphone-style app icon

MMseqs2 Release 16 Highlights: GPU-accelerated search 📄, ORF or new 6-frame translated search modes, contig taxonomy always keeps the longest ORF, bug fixes (reduced memory and higher sensitivity) and relicensed as MIT
📄 biorxiv.org/content/10.1...
💾 mmseqs.com and 🐍 Bioconda 🖥️🧬🧶

27.11.2024 09:08 | 👍 116    🔁 44    💬 0    📌 1

What did the Last Eukaryotic Common Ancestor (#LECA) look like? Consensus View in #PLOSBiology; massive authorship including @AncestralState, @lauraeme.bsky.social, John Archibald, @andrewjroger.bsky.social, @dackslabecb.bsky.social, Jeremy Wideman. plos.io/4g0alq4

25.11.2024 19:29 | 👍 255    🔁 101    💬 6    📌 14

Our Big Fantastic Virus Database (BFVD) is now published in NAR! It contains protein structure predictions of major viral clades, enhanced by petabase-scale homology search, and it's explorable on the web.
🌐 bfvd.foldseek.com
💾 bfvd.steineggerlab.workers.dev
📄 academic.oup.com/nar/advance-...

23.11.2024 21:12 | 👍 339    🔁 127    💬 6    📌 5

@dongwookkim is following 20 prominent accounts