Hi bioinformatics, genomics and CS friends! Please help me spread the word. I'm hiring a postdoc! Come work on cutting-edge method development in algorithmic genomics with me and my group at @umdscience.bsky.social! 🖥️🧬
10.10.2025 13:02
Thanks Rob! Much appreciated.
09.10.2025 15:06
MetaGraph - Biological Sequence Search
Petabase-Scale Search for DNA, RNA & Amino acids
We invite you to try out MetaGraph at metagraph.ethz.ch, learn more about our framework in the paper (nature.com/articles/s41...), or start building indexes from your own data (github.com/ratschlab/me...).
08.10.2025 20:56
We would like to thank the bioinformatics community for years of support and openness. A special thanks to the Logan effort, whose contig set we use as input for one of our largest indexes.
08.10.2025 20:56
While MetaGraph provides a lossless representation of the input k-mer set, it is not a lossless compression of the raw reads. To reach petabase scale, we remove noisy k-mers prior to indexing, a step that we show has only minimal impact on search sensitivity.
08.10.2025 20:56
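The post above does not say how noisy k-mers are identified. As a rough illustration only, the sketch below assumes a simple abundance threshold: k-mers seen fewer than `min_count` times in a sample are treated as likely sequencing errors and dropped. The function names and the threshold are hypothetical and are not MetaGraph's API or its actual cleaning procedure.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count how often each k-mer (substring of length k) occurs across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def solid_kmers(reads, k, min_count):
    """Keep only k-mers seen at least min_count times (assumed noise criterion)."""
    counts = count_kmers(reads, k)
    return {kmer for kmer, n in counts.items() if n >= min_count}


# Toy input: the third read differs in its last base, as a sequencing error might.
reads = ["ACGTA", "ACGTA", "ACGTT"]
kept = solid_kmers(reads, k=4, min_count=2)
print(sorted(kept))
```

The retained set is what would then be indexed, which is why the index is lossless with respect to the cleaned k-mer set but not the raw reads.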
We show that MetaGraph indexes are both scalable and cost-efficient for querying. Searching 1 Mbp of sequence against the entire SRA costs less than $1 on standard cloud infrastructure, making petabase-scale biological data truly searchable and accessible.
08.10.2025 20:56
Our indexes support fast exact matching as well as alignment with edits. Labels can represent sample metadata, coordinates or quantification values. We can store 10,000 human transcriptome samples in < 160 GB and return position-wise expression for any queried sequence.
08.10.2025 20:56
We have already processed more than 10 Petabases of raw sequence data from the SRA and make the compressed indexes publicly available for search (metagraph.ethz.ch), download and cloud-based access.
08.10.2025 20:56
At its core, MetaGraph represents all input sequences as labeled, succinct de Bruijn graphs: a highly compressed yet fully searchable structure. Each k-mer carries metadata labels that remain interactively queryable through a flexible API.
08.10.2025 20:56
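To make the idea of labeled k-mers concrete, here is a minimal toy sketch. It uses a plain Python dictionary from k-mer to a set of sample labels, so it is neither succinct nor a real de Bruijn graph implementation, and none of the names below (`build_index`, `query`, the sample labels) come from MetaGraph itself.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-mer (substring of length k) of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(samples, k):
    """Map each k-mer to the set of sample labels whose sequence contains it."""
    index = defaultdict(set)
    for label, seq in samples.items():
        for kmer in kmers(seq, k):
            index[kmer].add(label)
    return index


def query(index, seq, k):
    """For each label, return the fraction of the query's k-mers found under it."""
    query_kmers = list(kmers(seq, k))
    hits = defaultdict(int)
    for kmer in query_kmers:
        for label in index.get(kmer, ()):
            hits[label] += 1
    return {label: n / len(query_kmers) for label, n in hits.items()}


# Hypothetical toy samples; real inputs would be sequencing experiments.
samples = {"sample_A": "ACGTACGT", "sample_B": "TTACGTTT"}
index = build_index(samples, k=4)
result = query(index, "ACGTAC", k=4)
print(result)
```

In this toy version, consecutive k-mers overlap by k-1 characters, which is the adjacency a de Bruijn graph encodes; a succinct representation stores the same k-mer set and labels in far less space than a dictionary.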
Modern biology produces vast amounts of raw sequencing data: genomes, transcriptomes, and protein sequences. MetaGraph provides a unified computational framework to index, query, and reason across this landscape of biological information.
08.10.2025 20:56
The following thread describes the main ideas and results of this joint work with @gxxxr.bsky.social @karasikov.bsky.social @adamant-pwn.bsky.social @HarunMustafa416
08.10.2025 20:56
Efficient and accurate search in petabase-scale sequence repositories - Nature
MetaGraph enables scalable indexing of large sets of DNA, RNA or protein sequences using annotated de Bruijn graphs.
After years of research and continuous refinement, we're thrilled to share that our paper on the MetaGraph framework, which enables petabase-scale search across sequencing data, has been published today in Nature (www.nature.com/articles/s41...)
08.10.2025 20:56
Co-founder at https://omgenomics.com. Creator of https://42basepairs.com, https://sandbox.bio, https://biowasm.com, https://levelupwasm.com. Bioinformatics, genomics.
About me: https://robert.bio
Computational biologist @HelmholtzMunich, prof @TU_Muenchen & associate PI @sangerinstitute. Dad of 4 and mountain lover. Department news, see @CompHealthMuc
Rustlang and Rustlang accessories
* YouTube: https://www.youtube.com/@chrisbiscardi
* Learn Rust: https://www.rustadventure.dev/
* Rust Discord: https://discord.gg/GJ5UfxzUcP
* Party Corgi Content Discord: https://discord.gg/partycorgi
Bioinformatics. Rust. Mojo.
Rust dev, Electronics engineer, Author, Rust library team lead, ADHD, Polyamorous, Lesbian, She/Her
Professor of Algorithmic and Microbial Genomics at the University of Bath (UK). Pangenomes, drug resistance (esp TB), data structures for DNA search, plasmid evolution, global microbial surveillance. Open Data, reproducibility
Executive Director EMBL. I have an insatiable love of biology. Consultant to ONT and Cantata (Dovetail)
Professor @hi.is in CS, head of rna seq data analysis at decode genetics (views are mine). Bioinformatician, epistemic trespasser, &c.
ps. I hate GTF files
PhD on high-throughput bioinformatics @ ETH Zurich;
IMO, ICPC, Xoogler, Rust, road-cycling, hiking, wild camping, photography
Associate Professor of CS @ University of Maryland. Proud Rust advocate! I ♥ science & compiled, statically-typed programming languages! Views are my own. Tech stack: https://github.com/rob-p/tech-stack.
Finished a human genome, working on a few more 👨‍💻
Lab: https://genomeinformatics.github.io
Posts are my own
Professor of Computer Science @ JHU. https://www.langmead-lab.org/ https://www.youtube.com/BenLangmead
computational biologist: genomes, evolution, DNA
Scientist at NHGRI/NIH. Mapping complete genomes, finished a couple, working on more. Posts are my own opinion.
Bloomberg Distinguished Professor at Johns Hopkins University. http://schatz-lab.org
Genome assembly at Wellcome Sanger Institute and University of Cambridge
Professor at KTH, NY Genome Center, SciLifeLab, working on functional genomics and human genetics.
On the academic job market | How are species compared to one another across different genomic regions? Postdoc at Langmead Lab, Johns Hopkins | Comparative #genomics at scale | Formerly at UNIL/SIB/WUR | sinamajidian.github.io
Associate Professor of Biomolecular Engineering at the University of California, Santa Cruz; Associate Director, UC Santa Cruz Genomics Institute