Jaebeom Kim

@jbeom.bsky.social

Developing bioinformatics software in the Steinegger Lab at Seoul National University.

342 Followers  |  186 Following  |  19 Posts  |  Joined: 30.08.2023

Latest posts by jbeom.bsky.social on Bluesky


A new gateway to global antimicrobial resistance data New online portal connects bacterial genomes with experimental resistance data to support antimicrobial resistance research.

Antimicrobial resistance (AMR) is a growing health threat, making infections harder to treat and complicating routine medical care.

EMBL-EBI’s new AMR portal brings together laboratory resistance data and bacterial genomes in one open platform.

#WAAW2025 #ActOnAMR

www.ebi.ac.uk/about/news/t...
🧬💻

18.11.2025 09:59 - 👍 35    🔁 18    💬 1    📌 2

Huge congratulations and thanks to
@sunjaelee.bsky.social, @milot.bsky.social,
@cameron.gilchrist, and @martinsteinegger.bsky.social 👏

16.10.2025 07:29 - 👍 2    🔁 0    💬 0    📌 0

Curate your database. 🧵4/5
- GTDB, NCBI, ICTV, or custom taxonomy supported.
- Add genomes to a pre-built DB to save time (benchmark in figure)
- Expand taxonomy, e.g., integrate ICTV viruses into a GTDB prokaryote DB.
⏱️Building a DB of 8,520 GTDB species took 106 min on a MacBook M2 Pro (32GB RAM)

16.10.2025 07:29 - 👍 0    🔁 0    💬 1    📌 0

Explore results with interactive visualization. 🧵3/5
- Generate customized Sankey plots.
- Search for taxa of interest.
- Filter by classified reads or proportion.
- Click taxon nodes for subtree views.
- Extract reads classified to a taxon.
- Access NCBI Taxonomy and genome browser.

16.10.2025 07:29 - 👍 0    🔁 0    💬 1    📌 0

The app runs desktop-optimized Metabuli. 🧵2/5
Classifying 2X22M human gut reads vs. 36K genomes over 8,465 species took:
🖥️46 min on a Windows desktop (i9-9900, 32GB RAM)
💻39 min on a MacBook M2 Pro (32GB RAM)

16.10.2025 07:29 - 👍 0    🔁 0    💬 1    📌 0

Easy and interactive taxonomic profiling with Metabuli App.
It integrates database curation, read QC, taxonomic profiling, and visualization right on your desktop.
No command line, server, or internet required.
Now published in Bioinformatics! 🧵1/5
doi.org/10.1093/bioi...
github.com/steineggerla...

16.10.2025 07:29 - 👍 24    🔁 11    💬 1    📌 1
Illustration of Burrows-Wheeler Transform and many auxiliary structures from the input string how$now$brown$cow$#

New tool "bwt-svg" for making illustrations of the BWT and the many auxiliary arrays and other structures related to it. Pyodide-based no-installation-necessary interface here: benlangmead.github.io/bwt-svg/. (H/t to @robert.bio for pointing me to pyodide!) Full repo: github.com/benlangmead/....

14.10.2025 20:48 - 👍 40    🔁 21    💬 4    📌 1

Finally we got an end-to-end structural annotation tool for phages!

08.08.2025 08:26 - 👍 11    🔁 1    💬 0    📌 1
Protein Structure Informed Bacteriophage Genome Annotation with Phold Bacteriophage (phage) genome annotation is essential for understanding their functional potential and suitability for use as therapeutic agents. Here we introduce Phold, an annotation framework utilis...

Stoked to finally have a preprint out for Phold, our tool that uses protein structural information to enhance phage genome annotation #phagesky 1/n

www.biorxiv.org/content/10.1...

08.08.2025 07:10 - 👍 137    🔁 66    💬 5    📌 4
The Em Dash Responds to the AI Allegations “In recent months, a curious fixation has emerged in corners of academia: the em dash. More specifically, the apparent moral panic around how it is...

"Writers have been using me long before the advent of AI. I am the punctuation equivalent of a cardigan—beloved by MFA grads, used by editors when it’s actually cold, and worn year-round by screenwriters. I am not new here."

17.07.2025 18:20 - 👍 417    🔁 171    💬 10    📌 26
Call for applications (professor or associate professor), Department of Computational Biology and Medical Sciences (Deadline Sep 30) |Job Opportunities|Information|Graduate School of Frontier Scienc... Call for applications (professor or associate professor), Department of Computational Biology and Medical Sciences (Deadline Sep 30) |Job Opportunities|Information|GSFS offers both master's and doct...

My colleague asked me to circulate this job posting for Professor / Associate Professor in Computational Biology / Genomics at the University of Tokyo (P.S. I'm not affiliated):

www.k.u-tokyo.ac.jp/en/informati...

15.07.2025 03:24 - 👍 27    🔁 25    💬 0    📌 1

Folddisco finds similar (dis)continuous 3D motifs in large protein structure databases. Its efficient index enables fast uncharacterized active site annotation, protein conformational state analysis and PPI interface comparison. 1/9🧶🧬
📄 www.biorxiv.org/content/10.1...
🌐 search.foldseek.com/folddisco

07.07.2025 08:21 - 👍 152    🔁 70    💬 8    📌 3

Preprint on "Improving spliced alignment by modeling splice sites with deep learning". It describes minisplice for modeling splice signals. Minimap2 and miniprot now optionally use the predicted scores to improve spliced alignment.
arxiv.org/abs/2506.12986

17.06.2025 01:48 - 👍 112    🔁 54    💬 0    📌 2

Unicore is now published on GBE 🚀
Unicore rapidly identifies structural single-copy core genes from input species proteomes for phylogenetic analysis. Powered by Foldseek and ProstT5, Unicore enables linear-scale structure-based phylogeny of any given set of taxa. 🧵1/n
📃 doi.org/10.1093/gbe/evaf109

03.06.2025 06:54 - 👍 68    🔁 31    💬 3    📌 2

Introducing our invited speaker for the session on 'Viral Dark Matter': Rachel Seongeun Kim from Seoul National University!

Registration for on-site and remote participation is still open! More info: RdRp.io
#RdRpSummit2025

02.05.2025 14:55 - 👍 26    🔁 12    💬 1    📌 0

I'm presenting a poster about Metabuli, a metagenomic taxonomic classifier leveraging both DNA and protein sequences, at #RECOMB2025! Please come and share your thoughts!

25.04.2025 07:52 - 👍 4    🔁 0    💬 0    📌 0

Visit our posters at #RECOMB2025 for:

Structural: MSAs, Virus DB, Core Genes, Motif Discovery, Multimer Clustering & Search, pLM Foldseek, Environmental analysis

Metagenomics: Classification & Metabuli App

GPU-based & RNA search, Proteome clustering, Novel Ribozyme discovery

& get Marv stickers!

25.04.2025 07:45 - 👍 64    🔁 19    💬 2    📌 4

@eunbelivable.bsky.social presented our viral protein structure database BFVD, including the new V2 update with improved predictions using 12 recycles for higher quality structures. Check out the paper and data here:
📄 academic.oup.com/nar/article/...
🌐 bfvd.foldseek.com
#RECOMB2025

25.04.2025 05:02 - 👍 22    🔁 8    💬 1    📌 0
RECOMB2025 Things to do - Google My Maps RECOMB2025 Things to do

I also updated the main #RECOMB2025 things-to-do map to include more tourist attractions and some of the standout vegan places I have visited myself (except the two places next to Yonsei, which I didn't have a chance to visit yet):

www.google.com/maps/d/edit?...

21.04.2025 13:48 - 👍 7    🔁 2    💬 1    📌 0
RECOMB 2025 - Things to do RECOMB 2025 - Seoul, South Korea

Finding vegetarian and vegan food in Korea can be tricky. I added a mini-guide to the #RECOMB2025 things-to-do site with some resources. Let me know if you want more recommendations!

recomb.org/recomb2025/t...

21.04.2025 04:42 - 👍 17    🔁 2    💬 2    📌 0
SimdMinimizers: Computing random minimizers, fast Motivation Because of the rapidly-growing amount of sequencing data, computing sketches of large textual datasets has become an essential preprocessing task. These sketches are typically much smaller ...

Congratulations to @imartayan.bsky.social and @curiouscoding.nl, whose paper on fast minimizer computation with SIMD has been accepted to SEA 2025 🙌🏻 www.biorxiv.org/content/10.1...

01.04.2025 08:23 - 👍 17    🔁 10    💬 0    📌 1

Big Fantastic Virus Database (BFVD) version 2 improves 31% of predictions through 12 ColabFold recycles. PAEs and MSAs now also available for download and in the webserver.
🌐https://bfvd.foldseek.com
💾https://bfvd.steineggerlab.workers.dev/
1/3

31.03.2025 05:07 - 👍 66    🔁 24    💬 2    📌 2

I want to die, but I want to eat tteokbokki!

29.03.2025 12:26 - 👍 0    🔁 0    💬 1    📌 0
Metabuli Databases

🚀 New Metabuli DB is out!
It includes quality-filtered GTDB R220, RefSeq viruses, and the human T2T.
Prokaryotes follow the GTDB taxonomy; viruses and the human genome use NCBI taxonomy.
Download it and expand it with genomes of your choice using the "updateDB" command.
metabuli.steineggerlab.workers.dev

27.03.2025 07:43 - 👍 3    🔁 0    💬 0    📌 0
Schematic of the mod-bucket algorithm: all k-mer hashes are partitioned into s buckets via their remainder mod s. Then, in each bucket the smallest hash is selected.

Just published simd-sketch, a crate for fast bucket sketches.
It's 7x to 30x faster than BinDash, by using the simd-minimizers crate for fast hashing, and a nearly branch-free implementation.

Here's a blogpost with a survey of minhash history & methods, and evals:

curiouscoding.nl/posts/simd-s...

14.03.2025 00:35 - 👍 12    🔁 9    💬 1    📌 0

Metabuli App is a team effort! Sunny led the interface design and implementation, while Milot and Martin provided expert guidance throughout the project! @sunjaelee.bsky.social @milot.bsky.social @martinsteinegger.bsky.social Huge thanks and congrats everyone🎉

13.03.2025 08:50 - 👍 2    🔁 0    💬 0    📌 0

⚡Metabuli is now even faster on Linux servers and Windows/macOS computers.
Annotating 22M paired-end human gut reads vs. 36K genomes (8,465 GTDB species) took:
🖥️57 min on a Windows desktop (i9-9900, 32GB RAM)
💻39 min on a MacBook M2 Pro (32GB RAM) 🧵4/5

13.03.2025 08:50 - 👍 2    🔁 0    💬 1    📌 0

🛠️Curate your database.
- Build DBs with GTDB, NCBI, ICTV, or custom taxonomy.
- Add genomes to pre-built DBs to save time (benchmark in figure)
- Expand taxonomy, e.g., integrate ICTV viruses into a GTDB prokaryote DB.
⏱️Building a DB of 8,520 GTDB species took 104 min on a MacBook M2 Pro (32GB RAM). 🧵3/5

13.03.2025 08:50 - 👍 1    🔁 0    💬 1    📌 0

🔍Explore results with interactive visualization.
- Generate customized Sankey plots.
- Search for taxa of interest.
- Filter by classified reads or proportion.
- Click taxon nodes for subtree views.
- Extract reads classified to a taxon.
- Access NCBI Taxonomy and genome browser. 🧵2/5

13.03.2025 08:50 - 👍 1    🔁 0    💬 1    📌 0

Metabuli App preprint is out!
💻Taxonomic classification & interactive visualization, right on your laptop
🛠️Create new databases or update existing ones with new sequences. 🧵1/5
github.com/steineggerla...
www.biorxiv.org/content/10.1101/2025.03.10.642298v1

13.03.2025 08:50 - 👍 20    🔁 16    💬 1    📌 1
