Jim Shaw

@jimshaw.bsky.social

Postdoc at Dana-Farber and Harvard Med with Heng Li (@lh3lh3.bsky.social). Prev: UBC / UofT. I like thinking about computational biological sequence analysis and its applications to metagenomics. https://jim-shaw-bluenote.github.io

1,115 Followers  |  460 Following  |  101 Posts  |  Joined: 20.09.2023

Latest posts by jimshaw.bsky.social on Bluesky

Long-read metagenomics for strain tracking after faecal microbiota transplant Nature Microbiology - A long-read metagenomics method empowers faecal microbiota transplantation studies by precisely tracking bacteria from donors to recipients, distinguishing co-existing strains...

Excited to share our LongTrack study out in
@natmicrobiol.nature.com today!

Fecal microbiota transplant (FMT), donor 💩 => patients' gut, is an effective treatment for recurrent C. difficile infection & is being evaluated for Inflammatory Bowel Diseases (IBD) & other conditions 1/

📄 rdcu.be/eL8mR

22.10.2025 15:39 - 👍 21    🔁 11    💬 1    📌 0
GTDB release 10: a complete and systematic taxonomy for 715 230 bacterial and 17 245 archaeal genomes Abstract. The Genome Taxonomy Database (GTDB; https://gtdb.ecogenomic.org) provides a phylogenetically consistent and rank normalized genome-based taxonomy

Our @narjournal.bsky.social manuscript is out! It explores the growth of the GTDB (gtdb.ecogenomic.org) since its inception, as well as updates to the website, methodology, policies, and major taxonomic and nomenclatural changes over the past three years.

academic.oup.com/nar/advance-...

22.10.2025 14:20 - 👍 68    🔁 47    💬 0    📌 2
Alice: fast and haplotype-aware assembly of high-fidelity reads based on MSR sketching We introduce Mapping-friendly Sequence Reduction (MSR) sketches, a sketching method for high-fidelity (HiFi) long reads, and Alice, an assembler that operates directly on these sketches. MSR produces ...

Our preprint on our new metagenomic HiFi assembler Alice is out 🥳 Based on a *new sketching method* (🧵1/6)
👉 Preprint www.biorxiv.org/content/10.1...
👉 Github github.com/rolandfaure/...

03.10.2025 14:51 - 👍 24    🔁 21    💬 2    📌 0
Megaplasmids associate with Escherichia coli and other Enterobacteriaceae Humans and animals are ubiquitously colonized by Enterobacteriaceae , a bacterial family that contains both commensals and clinically significant pathogens. Here, we report Enterobacteriaceae megaplas...

New pre-print from the Banfield lab, highlighting an interesting case of 1.5Mb megaplasmids found in human gut.

Plasmid genomes were resolved using #PacBio HiFi sequencing with hifiasm-meta for #metagenome assembly. Host association was detected using epigenetic signals.

doi.org/10.1101/2025...

01.10.2025 16:43 - 👍 48    🔁 22    💬 1    📌 2

Do you know ~60% of human SVs fall in ~1% of GRCh38? See our new preprint: arxiv.org/abs/2509.23057 and the companion blog post on how we started this project and longdust: lh3.github.io/2025/09/29/o.... Work with Alvin Qin

30.09.2025 02:19 - 👍 76    🔁 27    💬 0    📌 0

High-accuracy SNV calling for bacterial isolates using deep learning with AccuSNV https://www.biorxiv.org/content/10.1101/2025.09.26.678787v1

29.09.2025 18:47 - 👍 1    🔁 1    💬 0    📌 0

Delighted to see our paper studying the evolution of plasmids over the last 100 years, now out! Years of work by Adrian Cazares, also Nick Thomson @sangerinstitute.bsky.social - this version much improved over the preprint. Final version should be open access, apols.
Thread 1/n

25.09.2025 21:28 - 👍 298    🔁 153    💬 14    📌 8

Super classy and much respect for updating the benchmarks Ryan. What a nice surprise. Very appreciated as a developer :).

Grats on the huge improvements @gaetanbenoit.bsky.social for metamdbg

23.09.2025 13:15 - 👍 3    🔁 0    💬 1    📌 0
Benchmark update: metaMDBG and Myloasm a blog for miscellaneous bioinformatics stuff

New blog post!

metaMDBG (@gaetanbenoit.bsky.social) and Myloasm (@jimshaw.bsky.social) have had recent releases, so I updated the benchmarks from the Autocycler paper:
rrwick.github.io/2025/09/23/a...

Both tools improved considerably! Time to update your conda environments 😄

23.09.2025 01:53 - 👍 35    🔁 26    💬 4    📌 0

Many of the most complex and useful functions in biology emerge at the scale of whole genomes.

Today, we share our preprint “Generative design of novel bacteriophages with genome language models”, where we validate the first, functional AI-generated genomes 🧵

17.09.2025 15:03 - 👍 49    🔁 20    💬 3    📌 4

agtools: a software framework to manipulate assembly graphs https://www.biorxiv.org/content/10.1101/2025.09.14.676178v1

16.09.2025 20:48 - 👍 9    🔁 10    💬 0    📌 0

X-Mapper 🦠🧬🧪 - a sequence aligner developed for microbes, now on Bioconda! 🚀
• 11–24× fewer suboptimal alignments (same for human genome)
• 3–579× lower inconsistency
• improves on ~30% of reads aligned to non-target species
github.com/mathjeff/map...
bioconda.github.io/recipes/x-ma...
#microsky

15.09.2025 02:32 - 👍 45    🔁 23    💬 3    📌 0

New blog post – A quick look at Roche's SBX
lh3.github.io/2025/09/11/a...

12.09.2025 03:26 - 👍 57    🔁 30    💬 2    📌 3

Great!! Let me know what you find :)

11.09.2025 12:43 - 👍 0    🔁 0    💬 1    📌 0

I sincerely appreciate the opportunity to visit @ebi.embl.org (thanks to the @embl.org Sabbatical fellowship). The guidance and support I received from Zam (@zaminiqbal.bsky.social), John (@bacpop.org) and other colleagues have been immensely valuable! You changed my career! ❤️

10.09.2025 09:55 - 👍 29    🔁 7    💬 2    📌 0
Efficient sequence alignment against millions of prokaryotic genomes with LexicMap - Nature Biotechnology LexicMap uses a fixed set of probes to efficiently query gene sequences for fast and low-memory alignment.

Sometimes you meet absolutely incredible bioinfo-magicians.
It was a huge privilege when @shenwei356.bsky.social
joined our group for a year on an @embl.org sabbatical.
While here, he developed a new way of aligning to
millions of bacteria, called LexicMap 1/n
www.nature.com/articles/s41...

10.09.2025 09:12 - 👍 188    🔁 99    💬 5    📌 4

Now preprinted at arxiv.org/abs/2509.07357

10.09.2025 02:10 - 👍 22    🔁 7    💬 0    📌 0

How do you long-read sequence metagenomes? I would argue it starts with the right sample storage & DNA extraction, to enable efficient @nanoporetech.com /@pacbio.bsky.social sequencing, which we investigated in our new paper: www.biorxiv.org/content/10.1...

Massive thanks to Klara for driving this

09.09.2025 15:35 - 👍 41    🔁 23    💬 0    📌 0
How low can you go? Short-read polishing of Oxford Nanopore bacterial genome assemblies - PubMed It is now possible to assemble near-perfect bacterial genomes using Oxford Nanopore Technologies (ONT) long reads, but short-read polishing is usually required for perfection. However, the effect of s...

Not included with myloasm, but check out pubmed.ncbi.nlm.nih.gov/38833287/ maybe?

08.09.2025 16:37 - 👍 1    🔁 0    💬 0    📌 0

Thanks Sergey!!

08.09.2025 16:36 - 👍 0    🔁 0    💬 0    📌 0

Indeed! Thanks @usadellab.bsky.social

08.09.2025 16:36 - 👍 0    🔁 0    💬 0    📌 0

Thanks @shenwei356.bsky.social !

08.09.2025 02:52 - 👍 1    🔁 0    💬 0    📌 0
GitHub - bluenote-1577/myloasm: A new high-resolution long-read metagenome assembler for even noisy reads

Thanks to co-authors @lh3lh3.bsky.social @mgmarin.bsky.social and the Heng Li lab here in Dana-Farber / Harvard Med.

Much thanks to all folks who generate/deposit data.

Building an assembler from scratch has always been a goal of mine, a labour of love :).

github.com/bluenote-157...

END

07.09.2025 23:34 - 👍 2    🔁 0    💬 0    📌 0

In conclusion:

1. Check out our new long-read metagenome assembler github.com/bluenote-157.... It's written from scratch, in rust!

2. Myloasm excels on ONT R10.4 data, but works for HiFi too

3. I'm really excited by its ability to enable high-resolution sleuthing for microbiome genomics

11 / N

07.09.2025 23:34 - 👍 7    🔁 0    💬 2    📌 0

On a public oral ONT metagenome (from @ykiguchi.bsky.social), we assembled a lot more complete, similar (within species-level) genomes than previous methods.

So much to explore... for example, we compared 6 circular TM7 bacteria of > 93% ANI assembled from a single oral metagenome.

10 / N

07.09.2025 23:34 - 👍 2    🔁 0    💬 1    📌 0

For this gut sample, @mgmarin.bsky.social found two distinct ermF (erythromycin resistance) genes, with 98% similarity, spreading within Bacteroidota.

1. The distinct ermFs are spreading on two distinct MGEs.
2. There is even strain specificity, only 1/6 P. copri had it!

9 / N

07.09.2025 23:34 - 👍 2    🔁 0    💬 1    📌 0

With circular contigs, we can confidently analyze presence / absence of "stuff" within contigs _without worrying about binning issues_ (as much).

For example, mobile genetic elements, AMR genes that are hard to bin and assemble with short reads...?

8/N

07.09.2025 23:34 - 👍 2    🔁 0    💬 1    📌 0

Does myloasm offer better insights, not just good benchmarks?

It turns out myloasm can recover more near-complete contigs than other ONT methods.

For a gut sample, it could assemble _6 different Prevotella copri genomes_ into single contigs, whereas other methods struggled.

7 / N

07.09.2025 23:34 - 👍 1    🔁 0    💬 1    📌 0

There's a lot more benchmarking, plasmids, misassemblies, contamination, etc. We are also improving, updating myloasm, especially for polishing.

I'll skip this in this thread, but see preprint for details. I'll focus on an interesting story about strain diversity instead...

6 / N

07.09.2025 23:34 - 👍 2    🔁 0    💬 1    📌 0

My favourite result:

For jointly-sequenced gut samples (thanks to public data from @jjminich.bsky.social), ONT can assemble _more_ circular contigs than HiFi.

This is thanks to ~3-5x increases in circular contigs relative to previous methods.

5 / N

07.09.2025 23:34 - 👍 2    🔁 0    💬 1    📌 0