@rfaure.bsky.social
Sequence bioinformatician, algorithms, methods. Postdoc at Institut Pasteur in Rayan Chikhi's lab
🧵6/6
Since MSR sketches are sequences, they are super easy to use. I think they could be useful for many other problems, e.g. SNP calling, pangenome graphs, indexing, etc.
🧵5/6
The sketching makes assembly extremely fast: a gut metagenome sample with 138 Gbp of sequencing data was assembled in less than 2 h using 10 GB of RAM on 8 threads ⚡. And thanks to MSRs, *highly similar strains are not collapsed*
🧵4/6
Two key properties that make MSR sketches really cool:
- They are alignable sequences: you can just feed them into existing assemblers
- MSR sketches can *keep all the SNPs*, i.e. two highly similar sequences are (almost) always reduced to different sketches -> useful to separate similar strains
🧵3/6
MSRs were defined by @lblassel.bsky.social @rayanchikhi.bsky.social and @pashadag.bsky.social in pmc.ncbi.nlm.nih.gov/articles/PMC....
Take a sequence and a value of k, stream all k-mers through a function that outputs either a base or the empty character, and you've got your sketch
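To make that definition concrete, here is a minimal Python sketch. The names `msr_sketch` and `hpc_rule` are made up for illustration, and the reduction function shown is homopolymer compression written as an order-2 rule, chosen only because it is simple. It is not the function Alice uses.

```python
from typing import Callable


def msr_sketch(seq: str, k: int, reduce_kmer: Callable[[str], str]) -> str:
    """Stream every k-mer of seq through reduce_kmer and concatenate the outputs.

    reduce_kmer must return either a single base or "" (the empty character).
    """
    out = []
    for i in range(len(seq) - k + 1):
        out.append(reduce_kmer(seq[i:i + k]))
    return "".join(out)


def hpc_rule(kmer: str) -> str:
    """Illustrative order-2 rule: emit the second base unless it repeats the first."""
    return kmer[1] if kmer[0] != kmer[1] else ""


if __name__ == "__main__":
    # 2-mers: AA, AC, CG, GG, GG, GT -> "", C, G, "", "", T
    print(msr_sketch("AACGGGT", 2, hpc_rule))
```

With this toy rule the first base of the read is never emitted, since no 2-mer ends on it. Any other mapping from k-mers to a base or the empty string can be plugged in the same way.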
🧵2/6
Conceptually, the assembler works along the same lines as metaMDBG:
1. sketching reads
2. assembly procedure on the sketches
3. reversing to base-space to obtain the final assembly
The main difference is the sketching scheme: we introduce *Mapping-friendly Sequence Reductions (MSR) sketching*
Our preprint on our new metagenomic HiFi assembler Alice is out 🥳 Based on a *new sketching method* (🧵1/6)
Preprint: www.biorxiv.org/content/10.1...
GitHub: github.com/rolandfaure/...
👩‍🔬 For 15+ years biology has accumulated petabytes (millions of gigabytes) of 🧬 DNA sequencing data 🧬 from the far reaches of our planet. 🦠
Logan now democratizes efficient access to the world's most comprehensive genetics dataset. Free and open.
doi.org/10.1101/2024...
Preprint out for myloasm, our new nanopore / HiFi metagenome assembler!
Nanopore's getting accurate, but
1. Can this lead to better metagenome assemblies?
2. How, algorithmically, to leverage them?
with co-author Max Marin @mgmarin.bsky.social, supervised by Heng Li @lh3lh3.bsky.social
1 / N
Congrats! Nice results
16.05.2025 14:08
I am happy to share our new preprint introducing MADRe - a pipeline for Metagenomic Assembly-Driven Database Reduction, enabling accurate and computationally efficient strain-level metagenomic classification.
https://www.biorxiv.org/content/10.1101/2025.05.12.653324v1
1/9
Starting #RECOMBseq with @rayanchikhi.bsky.social's keynote. Here stressing our responsibility as scientists to enable access to a common good: genomic data
24.04.2025 00:42
Side note: you could, speaking purely theoretically, also fit every microbe onto an SD card, which is within the weight limit for a carrier pigeon. For some distances, it would be faster than the internet for transmitting sequence libraries
7/
So glad this is finally out. The method has been instrumental in allowing us to compress the AllTheBacteria data - ~2 million bacterial genomes shrink from 3 terabytes (gzipped) to 100 GB using phylogenetic compression. Great work by @brinda.eu
09.04.2025 22:27
Do you (like me) create a bunch of conda environments, then later forget what they're for, when they were last updated, or which tools are in them?
If so, you might like this little project: github.com/rrwick/conda...
So glad to have participated in #DSB2025, what a great workshop! For some mysterious reason it was the first time I attended after 3 years of sequence research. Thanks to all participants & organizers
07.03.2025 19:41
Ragnar's made some incredible optimizations on the computation of minimizers, can't wait to see how these improvements will benefit bioinfo tools!
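For readers who have not met minimizers, here is a deliberately naive Python baseline. It is the textbook window scan, not the optimized method referred to above; the function names and the choice of BLAKE2b as the k-mer ordering are assumptions made for illustration.

```python
import hashlib
from typing import List


def kmer_hash(kmer: str) -> int:
    """Deterministic 64-bit hash used to order k-mers (an arbitrary choice)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minimizers(seq: str, k: int, w: int) -> List[int]:
    """Return start positions of minimizers: in every window of w consecutive
    k-mers, pick the k-mer with the smallest hash (leftmost on ties)."""
    hashes = [kmer_hash(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    picked: List[int] = []
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        pos = start + window.index(min(window))  # index() returns the leftmost minimum
        if not picked or picked[-1] != pos:       # consecutive windows often agree
            picked.append(pos)
    return picked


if __name__ == "__main__":
    print(minimizers("ACGTACGTTGCAAGCT", k=3, w=4))
```

This scan costs O(w) work per window, which is exactly the kind of overhead that faster minimizer algorithms aim to remove.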
13.12.2024 15:50
Really cool work!
13.12.2024 14:42
Toy example of the AAAAAAA bucket, associated with four super-k-mers turned into their interleaved representation.
Amazing ideas here www.biorxiv.org/content/bior... from @yoann.bsky.social and collaborators.
Reorganizing minimizers to allow dichotomic (binary) search over k-mers. That's brilliant.
#bioinformatics 🧬🖥️
So glad to have successfully defended my Ph.D. last week! I worked on producing haplotype-resolved metagenomic assemblies using noisy long reads (HairSplitter) and high-fidelity long reads (Alice assembler, not yet published).
Thanks to my advisors Dominique Lavenier and Jean-François Flot ❤️
Congrats @firtinac.bsky.social! I thoroughly enjoyed reading the BLEND paper
02.12.2024 14:11