
Noam Teyssier

@noamteyssier.bsky.social

Bioinformatics Scientist at the Arc Institute. Working at the intersection of functional genomics, systems biology, and network dynamics. I also build rusty bioinformatics tools https://github.com/noamteyssier

138 Followers  |  95 Following  |  56 Posts  |  Joined: 14.11.2024  |  2.1131

Latest posts by noamteyssier.bsky.social on Bluesky

The workspace publishing has been such a hassle. So glad to see this out

18.09.2025 14:40 - 👍 3    🔁 0    💬 0    📌 0

Sounds great! Would be very interested in that and happy to help build one

17.09.2025 14:18 - 👍 1    🔁 0    💬 0    📌 0

bsky.app/profile/noam...

Here was a benchmark I ran a while back comparing twobit and binseq on a single thread

15.09.2025 17:15 - 👍 3    🔁 0    💬 1    📌 0

2bit was built for genomes where there are very long contiguous N-blocks. the overhead for managing these blocks though on fastq-style records (generally very short and non-contiguous Ns) is massive and most of the time unnecessary.

15.09.2025 17:13 - 👍 2    🔁 0    💬 1    📌 0
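
To make the N-block point above concrete, here is a minimal sketch of 2-bit nucleotide packing in Rust. It is an illustration only, not the on-disk layout of 2bit or binseq; the function names `encode_base`, `pack`, and `unpack` are hypothetical. With two bits per base there are exactly four codes, so an N has no representation in the packed word and has to be tracked separately (2bit does this with N-block metadata) or rejected.

```rust
/// Map a nucleotide to its 2-bit code. Anything else (including N) has no code.
fn encode_base(base: u8) -> Option<u64> {
    match base {
        b'A' | b'a' => Some(0b00),
        b'C' | b'c' => Some(0b01),
        b'G' | b'g' => Some(0b10),
        b'T' | b't' => Some(0b11),
        _ => None,
    }
}

/// Pack up to 32 bases into one u64, first base in the lowest bits.
/// Returns None if the sequence is too long or contains a non-ACGT base.
fn pack(seq: &[u8]) -> Option<u64> {
    if seq.len() > 32 {
        return None;
    }
    let mut packed = 0u64;
    for (i, &base) in seq.iter().enumerate() {
        packed |= encode_base(base)? << (2 * i);
    }
    Some(packed)
}

/// Recover `len` bases (at most 32) from a packed word, as uppercase ASCII.
fn unpack(packed: u64, len: usize) -> Vec<u8> {
    assert!(len <= 32, "a u64 holds at most 32 bases");
    (0..len)
        .map(|i| b"ACGT"[((packed >> (2 * i)) & 0b11) as usize])
        .collect()
}
```

In this sketch `pack(b"ACGT")` yields `Some(0b1110_0100)`, while `pack(b"ACNT")` yields `None`: the N is simply not representable, which is the case a genome-oriented format has to spend extra bookkeeping on.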

Paraseq 0.4 is out now! With double the throughput for processing paired-end input :)

github.com/noamteyssier...

04.09.2025 22:41 - 👍 15    🔁 8    💬 0    📌 1

Added a feature to bqtools yesterday for colored grep output. It supports colored FASTX output as well. Already useful this morning as I troubleshoot some sequencing outputs!

04.09.2025 17:56 - 👍 3    🔁 0    💬 1    📌 0
CRISPR screening by AAV episome-sequencing (CrAAVe-seq): a scalable cell-type-specific in vivo platform uncovers neuronal essential genes - Nature Neuroscience The authors developed an adeno-associated virus-based high-throughput in vivo CRISPR screening platform for endogenous mouse brain cell types. Using this platform, they define genes and pathways essen...

Excited that the paper presenting our mouse brain in vivo CRISPR screening platform is out today in @natneuro.nature.com!

Great team effort, led by Biswa Ramani and @ivlrose.bsky.social in the Kampmann lab.

www.nature.com/articles/s41...

22.08.2025 22:15 - 👍 85    🔁 18    💬 3    📌 2
Accelerating k-mer-based sequence filtering The exponential growth of global sequencing data repositories presents both analytical challenges and opportunities. While k-mer-based indexing has improved scalability over traditional alignment fo...

Preprint alert!
We present K2Rmini, an ultra-fast, grep-like tool that extracts sequences of interest from FASTA/FASTQ files based on their k-mer content.
www.biorxiv.org/content/10.1...
A thread

02.07.2025 12:59 - 👍 37    🔁 19    💬 1    📌 0

Writing in rust again after a long stretch of python is such a breath of fresh air.

26.06.2025 02:47 - 👍 4    🔁 0    💬 0    📌 0

Are you going to have a remote component to this? Would love to watch some of these talks if I can

26.06.2025 01:28 - 👍 0    🔁 0    💬 0    📌 0

Ah this is the way that I do it in paraseq! Doesn't work for fastq headers but works well for fasta

24.06.2025 20:04 - 👍 0    🔁 0    💬 0    📌 0

Introducing Arc Institute’s first virtual cell model: STATE

23.06.2025 17:28 - 👍 17    🔁 6    💬 1    📌 1
Downloaded more for business, or pleasure? This mini-project was inspired by this tweet: After which I spent about two hours making a small script that grabs data from the rust package repository crates.io, and analyses the ...

Pretty cool little utility and blog post - fun to see the business/pleasure index for rust crates

boydkane.com/projects/cra...

18.06.2025 20:32 - 👍 2    🔁 0    💬 0    📌 0

Preprint on "Improving spliced alignment by modeling splice sites with deep learning". It describes minisplice for modeling splice signals. Minimap2 and miniprot now optionally use the predicted scores to improve spliced alignment.
arxiv.org/abs/2506.12986

17.06.2025 01:48 - 👍 109    🔁 54    💬 0    📌 1

R.I.P your email inbox haha

16.06.2025 16:25 - 👍 1    🔁 0    💬 1    📌 0

New preprint! Deacon is a versatile tool for filtering FASTA/FASTQ files and streams at hundreds of megabases per second using minimizers, built with rapid metagenomic host depletion in mind, but equally useful for search.
github.com/bede/deacon

13.06.2025 13:24 - 👍 57    🔁 40    💬 5    📌 2

ish is a grep-like CLI tool that uses optimal alignment instead of exact matching.

It’s record-type aware, supporting line, FASTA, and FASTQ records.

Built in Mojo as a proof of concept for bioinformatics.

🧵1/5

09.06.2025 13:05 - 👍 44    🔁 24    💬 1    📌 2
Bon Next-gen compile-time-checked builder generator, named function's arguments, and more!

A good workaround for defaults I use sometimes is Bon. Adds to compile times though which can be annoying

bon-rs.com

08.06.2025 01:20 - 👍 1    🔁 0    💬 0    📌 0

lol what expires in this? It’s like pure metal

07.06.2025 14:59 - 👍 0    🔁 0    💬 1    📌 0

Slides from my talk (with @kamilsjaron.bsky.social) on a history of k-mers in bioinformatics: rayan.chikhi.name/pdf/2025-kme...

03.06.2025 09:25 - 👍 44    🔁 24    💬 1    📌 2

Love seeing audio stuff in rust. How’d you make the visualization?

28.05.2025 17:30 - 👍 0    🔁 0    💬 1    📌 0

📜 Excited to share insights from our recent paper: "Kaminari: a resource-frugal index for approximate colored k-mer queries". The study aims to efficiently identify documents containing a query string, focusing on DNA strings. www.biorxiv.org/content/10.1... 🧬 🖥️ 1/8

27.05.2025 12:06 - 👍 24    🔁 16    💬 1    📌 1

One of the great success stories of change haha

23.05.2025 16:14 - 👍 1    🔁 0    💬 0    📌 0

I think the best way to spur change is to make the new solution as easy as the old one. If it's an easy swap then I think people will try it out and convince themselves it's worth it.

Like swapping out std::collections::HashMap for hashbrown::HashMap.

But it's easier said than done

23.05.2025 16:10 - 👍 1    🔁 0    💬 1    📌 0
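
As a rough illustration of the "easy swap" mentioned above, the sketch below keeps the choice of hash map behind a single `use` line. It compiles against the standard library as written. The assumption, stated in the comment, is that with the `hashbrown` crate added as a dependency the import could be changed and the rest left alone, since `hashbrown::HashMap` mirrors the std API for the calls used here. The `count_kmers` helper is hypothetical and exists only to give the map something to do.

```rust
// Single import that decides which hash map the rest of the file uses.
// Assumption: with `hashbrown` in Cargo.toml, this line could instead read
// `use hashbrown::HashMap;` and the function below would stay unchanged.
use std::collections::HashMap;

/// Count how often each k-mer occurs in a sequence (hypothetical helper).
fn count_kmers(seq: &[u8], k: usize) -> HashMap<&[u8], usize> {
    let mut counts = HashMap::new();
    if k == 0 {
        // `windows(0)` would panic, so treat k = 0 as "no k-mers".
        return counts;
    }
    for kmer in seq.windows(k) {
        *counts.entry(kmer).or_insert(0) += 1;
    }
    counts
}
```

Nothing outside the import mentions which crate provides the map, which is what keeps the swap to a one-line change.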

Oftentimes inertia is the biggest reason for lack of change. If things work as they are people are unlikely to change.

23.05.2025 15:47 - 👍 1    🔁 0    💬 1    📌 0
Cell simulation as cell segmentation - Nature Methods Proseg is a segmentation approach for single-cell spatially resolved transcriptomics data that uses unsupervised probabilistic modeling of the spatial distribution of transcripts to accurately segment...

Our Proseg paper is now out in Nature Methods!
www.nature.com/articles/s41...

We borrowed a sampling procedure from the cell simulation literature to infer cell boundaries that best explain the spatial distribution of transcripts.

22.05.2025 17:52 - 👍 6    🔁 2    💬 1    📌 0
lofi Archive radio 🎞️ beats to scan/read microfiche to YouTube video by Internet Archive

📄 The scanners are humming, the film is flowing.

The microfiche livestream is up, digitizing government docs in real time for Democracy’s Library.

Perfect second-screen vibes: Preservation in progress.

🕒 Live M-F, 7:30am–3:30pm PT (except U.S. holidays)
➡️ www.youtube.com/live/aPg2V5R...

22.05.2025 14:37 - 👍 357    🔁 74    💬 4    📌 11

So yeah, this is why I keep going on about: do we have to sanitize user input or not? File formats where bad inputs are simply not representable are good, because it saves us from this 100x slowdown.

16.05.2025 17:51 - 👍 4    🔁 2    💬 0    📌 0

One other option I’d be curious about is an unreachable!()

16.05.2025 15:10 - 👍 0    🔁 0    💬 1    📌 0

I feel like -1 would lead to some smaller assembly footprint… but super curious what the diff is

16.05.2025 03:28 - 👍 1    🔁 0    💬 1    📌 0

@noamteyssier is following 19 prominent accounts