Ryan Wick

@rrwick.bsky.social

Bioinformatician at the Centre for Pathogen Genomics at the University of Melbourne

1,130 Followers  |  122 Following  |  78 Posts  |  Joined: 15.11.2024  |  2.2303

Latest posts by rrwick.bsky.social on Bluesky

And since both metaMDBG and Myloasm had new versions after the paper was accepted, here's a blog post with an updated benchmark:
rrwick.github.io/2025/09/23/a...
(3/3)

29.09.2025 04:11 | 👍 2    🔁 0    💬 0    📌 0

Here's the ultra-short version:
If you want the best possible long-read bacterial genome assemblies, Autocycler is the tool for you! It is computationally intensive (due to the need to generate many alternative input assemblies) but consistently more accurate than other methods.
(2/3)

29.09.2025 04:11 | 👍 2    🔁 0    💬 1    📌 0
Autocycler: long-read consensus assembly for bacterial genomes | Abstract. Motivation: Long-read sequencing enables complete bacterial genome assemblies, but individual assemblers are imperfect and often produce sequence-l

Happy to share that the paper describing Autocycler is now 100% up:
doi.org/10.1093/bioi...
(1/3)

29.09.2025 04:11 | 👍 66    🔁 36    💬 1    📌 0

One caveat though: my experience is with isolates that (usually) have a single 'true' sequence to aim for. In a metagenome with natural variation (i.e. a mixture of multiple 'true' sequences), I'm not sure how Dorado/Medaka would behave...

24.09.2025 00:59 | 👍 1    🔁 0    💬 0    📌 0

I have limited experience with long-read metagenome assembly, so I don't have any special insight here. But I like the examples shown in @floriantrigodet.bsky.social's preprint - they show how sometimes one strange read (e.g. a chimera) can throw off the assembly.

23.09.2025 21:37 | 👍 0    🔁 0    💬 1    📌 0

No, the assemblies are not Dorado/Medaka polished. I predict doing so would significantly reduce error rates, especially for the higher-error assemblies - Dorado/Medaka is quite good at fixing small-to-medium scale errors. But for lower-error assemblies (e.g. Autocycler), it may not change much.

23.09.2025 21:36 | 👍 1    🔁 0    💬 1    📌 0

I agree, most real-world MAGs probably have more errors than isolates, especially if low depth. Also, metagenomes may have within-species variation, and then the ideal MAG is (arguably) some sort of consensus. Especially if there is structural variation, this can be a BIG challenge for assemblers.

23.09.2025 06:09 | 👍 2    🔁 0    💬 1    📌 0
Boxplots showing the benchmarking results from my blog post: old-vs-new versions of metaMDBG and Myloasm

23.09.2025 01:53 | 👍 4    🔁 0    💬 0    📌 0
Benchmark update: metaMDBG and Myloasm | a blog for miscellaneous bioinformatics stuff

New blog post!

metaMDBG (@gaetanbenoit.bsky.social) and Myloasm (@jimshaw.bsky.social) have had recent releases, so I updated the benchmarks from the Autocycler paper:
rrwick.github.io/2025/09/23/a...

Both tools improved considerably! Time to update your conda environments 😄

23.09.2025 01:53 | 👍 35    🔁 26    💬 4    📌 0
agtools: a software framework to manipulate assembly graphs | Assembly graphs are a fundamental data structure used by genome and metagenome assemblers to represent sequences and their overlap information, facilitating the assembler to construct longer genomic f...

Excited to share our latest preprint on agtools, an open-source Python framework for analysing and manipulating assembly graphs. (1/n)

www.biorxiv.org/content/10.1...

#Bioinformatics #genomics #assembly #assemblygraphs #software

17.09.2025 06:58 | 👍 29    🔁 16    💬 2    📌 2

Preprint out for myloasm, our new nanopore / HiFi metagenome assembler!

Nanopore's getting accurate, but

1. Can this lead to better metagenome assemblies?
2. How, algorithmically, to leverage them?

with co-author Max Marin @mgmarin.bsky.social, supervised by Heng Li @lh3lh3.bsky.social

1 / N

07.09.2025 23:34 | 👍 114    🔁 79    💬 5    📌 5

Amazing, thanks! Will keep an eye out for this preprint.

07.09.2025 21:03 | 👍 1    🔁 0    💬 0    📌 0

I haven't yet, but that is absolutely something I should try. I should also more generally educate myself on best practices in telomere assembly, e.g. with that paper Adam linked. This is new to me!

07.09.2025 21:02 | 👍 1    🔁 0    💬 0    📌 0

Thanks for this - looks interesting!
In the paper you said: "...Illumina sequencing can generate spurious indels within HTs, especially for HT lengths longer than 14 bp." Do you have a sense of how bad this gets for really long homopolymers, e.g. 20+ bp?

05.09.2025 01:22 | 👍 0    🔁 0    💬 1    📌 0

I'm sure there are more robust ways to go about telomere assembly - I'm not very experienced with T2T eukaryote genome assemblies 😬

05.09.2025 01:10 | 👍 1    🔁 0    💬 0    📌 0

And my manual telomere fixing was makeshift. I pieced together a few assemblies (mostly from Flye) that extended all the way to the telomeres. And then I manually repaired the telomeres to be exact 6-mer repeats - i.e. I assumed any deviation from the 6-mer was ONT error not real biology.

05.09.2025 01:10 | 👍 1    🔁 0    💬 3    📌 0
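
The "repair to exact 6-mer repeats" step described in the post above can be illustrated with a minimal sketch. This is not the procedure actually used; it assumes the telomere region has already been isolated as a string and that errors are substitutions only (a real repair would also need to handle indels, e.g. via alignment). The function names and the `TTAGGG` motif in the usage note are placeholders for illustration, not taken from the post.

```python
def rotations(motif):
    """All cyclic rotations of the motif, i.e. every possible repeat phase."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def perfect_repeat(motif, length):
    """Tile the motif end to end and trim to exactly `length` bases."""
    copies = length // len(motif) + 1
    return (motif * copies)[:length]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def repair_telomere(telomere, motif):
    """Replace a noisy telomere with a perfect tandem repeat of equal length.

    Every phase of the motif is tried, and the perfect repeat closest to
    the observed sequence (fewest mismatches) is returned. Any deviation
    from the motif is treated as sequencing error, as assumed in the post.
    """
    candidates = [perfect_repeat(r, len(telomere)) for r in rotations(motif)]
    return min(candidates, key=lambda c: hamming(c, telomere))
```

For example, `repair_telomere("TTAGGGTTAGCGTTAGGGTTA", "TTAGGG")` returns `TTAGGGTTAGGGTTAGGGTTA`, correcting the single `C` that breaks the repeat.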

Since all Illumina sequencing involves some PCR (bridge amplification), I also wonder at what length Illumina reads start to fail with homopolymers. Can they reliably sequence 20-mers? 40-mers? 60-mers? It's a hard question to answer if every sequencing tech struggles with these...

05.09.2025 01:10 | 👍 2    🔁 0    💬 3    📌 0

I too am wary. I suppose the hope is that by limiting changes to long homopolymers, polishing will fix more errors than it introduces. I.e. I'm guessing that ONT's long-homopolymer error rate is greater than cross-sample homopolymer differences. But this is very much unproven!

05.09.2025 01:10 | 👍 2    🔁 0    💬 1    📌 0
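
The idea of "limiting changes to long homopolymers" from the post above can be sketched as a simple acceptance rule. This is an illustration only and is not Pypolca's implementation: the function name, the allele representation (the full run in the draft versus the full run in the proposed fix) and the `min_length` threshold are all assumptions.

```python
def is_long_homopolymer_length_change(ref, alt, min_length):
    """Decide whether a proposed polishing change should be accepted.

    `ref` is the homopolymer run in the draft assembly and `alt` is the
    run proposed by the polisher. The change is accepted only if both
    are runs of the same single base, they differ in length, and the
    draft run is at least `min_length` bases long (hypothetical cutoff).
    """
    if not ref or not alt or ref == alt:
        return False
    same_single_base = len(set(ref) | set(alt)) == 1
    return same_single_base and len(ref) >= min_length
```

With `min_length=6`, a change from `AAAAAAAA` to `AAAAAAAAA` would be accepted, while a change in a 4-base run or a change that introduces a different base would be rejected.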
Cross-sample homopolymer polishing with Pypolca | a blog for miscellaneous bioinformatics stuff

New blog post!

I added a new feature to @gbouras13.bsky.social's Pypolca: homopolymer-only polishing. Potentially useful for cross-sample polishing - early test on Cryptosporidium looks promising.

Check it out here:
rrwick.github.io/2025/09/04/h...

04.09.2025 06:37 | 👍 18    🔁 7    💬 2    📌 1

Pleased to say that our preprint benchmarking Nanopore data for MLST, cgMLST, cgSNP & AMR typing from bacterial isolates is out! TL;DR you can get almost perfect results from 50x depth using live SUP basecalling with a GPU in under 20 hours #microsky #IDsky 🦠🧬🖥️ /1
www.medrxiv.org/content/10.1...

30.07.2025 02:10 | 👍 46    🔁 29    💬 3    📌 2

However, I hear rumours that ONT might be working on a new move-table-aware bacterial polishing model. See my blog post from Feb for details: rrwick.github.io/2025/02/07/d.... If true, I'll be eager to test it out when released.

10.06.2025 02:37 | 👍 4    🔁 0    💬 0    📌 0
Release v1.0.1 · nanoporetech/dorado | [1.0.1] (4 June 2025) This release introduces support in the --bacteria mode of Dorado polish for data basecalled with v5.2 models and improves the speed of 5mCG_5hmCG calling with v5.0 and v5.2 mo...

Minor new release of @nanoporetech.com's Dorado:
github.com/nanoporetech...

It supports using --bacteria when polishing an assembly with v5.2.0 data, which is nice! But if I understand correctly, it's the same bacterial polishing model from Sep 2024, not a new model.

10.06.2025 02:37 | 👍 8    🔁 1    💬 1    📌 0

Good to know - thanks for clarifying. Makes sense for a tool that's designed to work with big metagenomic datasets. I'm using it a bit out of its domain on a bacterial isolate.

30.05.2025 05:43 | 👍 0    🔁 0    💬 1    📌 0

The read set was only 240 MB (gzipped), so it is memory hungry. The myloasm docs do acknowledge that it uses more memory than other assemblers.

Also, I ran my tests on an ARM Mac, but the docs suggest that myloasm (specifically the polishing step) will be even faster on x86-64 CPUs with AVX2.

30.05.2025 02:43 | 👍 2    🔁 0    💬 1    📌 0

Just ran a few more tests through GNU time:
1 thread: 435 seconds, 10.1 GB RAM
2 threads: 238 seconds, 10.0 GB RAM
4 threads: 133 seconds, 10.1 GB RAM
8 threads: 73 seconds, 10.1 GB RAM
16 threads: 49 seconds, 13.3 GB RAM

30.05.2025 02:42 | 👍 2    🔁 0    💬 3    📌 0
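
The timings in the post above can be turned into speedup and parallel-efficiency figures with a short calculation. This is a sketch for reading the numbers, not part of the original thread; the `timings` dictionary copies the wall-clock seconds from the post, and `scaling_table` is a name made up for illustration.

```python
# Wall-clock times in seconds from the post, keyed by thread count.
timings = {1: 435, 2: 238, 4: 133, 8: 73, 16: 49}


def scaling_table(timings):
    """Return (threads, speedup, efficiency) rows relative to the 1-thread run.

    speedup    = single-thread time / time with N threads
    efficiency = speedup / N (1.0 would be perfect linear scaling)
    """
    baseline = timings[1]
    rows = []
    for threads in sorted(timings):
        speedup = baseline / timings[threads]
        efficiency = speedup / threads
        rows.append((threads, speedup, efficiency))
    return rows


for threads, speedup, efficiency in scaling_table(timings):
    print(f"{threads:>2} threads: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

On these numbers, 16 threads gives roughly an 8.9x speedup over one thread, i.e. about 55% parallel efficiency, so scaling is good up to 8 threads and tapers off beyond that.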

I tested myloasm on a 50x Klebsiella isolate, and it was very fast - only took about 1 minute to complete (on my Macbook).

29.05.2025 05:01 | 👍 15    🔁 4    💬 2    📌 0

Even though it's designed for metagenomes, I suspect it might work nicely on bacterial isolates as well, especially if you filter out low-depth contigs. This is what I've found for metaMDBG, a long-read metagenome assembler from @gaetanbenoit.bsky.social.

29.05.2025 05:01 | 👍 6    🔁 1    💬 1    📌 0
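
Filtering out low-depth contigs, as suggested in the post above, might look like the following minimal sketch. It assumes per-contig length and read depth are already available (how they are obtained depends on the assembler and is not shown), and it uses the depth of the longest contig as the reference. The `Contig` type, the `min_fraction` cutoff of 0.1 and the example values in the usage note are all hypothetical, not taken from the post.

```python
from typing import NamedTuple


class Contig(NamedTuple):
    name: str
    length: int    # contig length in bp
    depth: float   # mean read depth over the contig


def filter_low_depth(contigs, min_fraction=0.1):
    """Keep contigs whose depth is at least `min_fraction` of the reference depth.

    The reference depth is taken from the longest contig, on the assumption
    that in an isolate assembly this is the chromosome. Low-depth contigs
    (e.g. contamination) fall below the cutoff and are dropped.
    """
    if not contigs:
        return []
    reference_depth = max(contigs, key=lambda c: c.length).depth
    cutoff = reference_depth * min_fraction
    return [c for c in contigs if c.depth >= cutoff]
```

With made-up input such as a 5.3 Mbp contig at 50x, a 120 kbp contig at 140x and an 8 kbp contig at 2.5x, the first two would be kept and the third dropped (cutoff 5x).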

A new long-read metagenome assembler has been released: myloasm. Very exciting! Looking forward to trying it out.

29.05.2025 05:01 | 👍 25    🔁 5    💬 1    📌 0
myloasm - metagenomic assembly with (noisy) long reads

Announcing myloasm, a new long-read (ONT R10/PacBio) metagenome assembler that I've been working on during my postdoc in the Heng Li lab (@lh3lh3.bsky.social).

myloasm-docs.github.io

28.05.2025 17:53 | 👍 132    🔁 78    💬 5    📌 3

I have 🤞

28.05.2025 21:28 | 👍 1    🔁 0    💬 1    📌 0
