
Frederick "Erick" Matsen

@matsen.bsky.social

I ♥ evolution, immunology, math, & computers. Professor at Fred Hutch & Investigator at HHMI. http://matsen.fredhutch.org/

549 Followers  |  125 Following  |  41 Posts  |  Joined: 01.11.2023  |  2.1481

Latest posts by matsen.bsky.social on Bluesky

Post image

... and second is to have a map from the figures to where they are made in the associated "experiments" code repository (github.com/matsengrp/dn...):

25.09.2025 18:03 — 👍 2    🔁 0    💬 0    📌 0
Post image

I forgot to post two things I liked doing in this paper that I hope catch on. First is to have links in the methods section to the model fitting code (in a tagged version github.com/matsengrp/ne... as the code continues to evolve):

25.09.2025 18:03 — 👍 1    🔁 0    💬 1    📌 0
Post image

Oh, and here is a picture of a cyborg-Darwin (cooked up by Gemini), after he realized how useful transformers are. For some reason MBE didn't want it as a cover image!

24.09.2025 22:24 — 👍 0    🔁 0    💬 1    📌 0

Many thanks to Kevin Sung and Mackenzie Johnson for leading the all-important task of data prep, Will Dumm for code and methods contributions, David Rich for structural work, and Tyler Starr, Yun Song, Phil Bradley, Julia Fukuyama, and Hugh Haddox for conceptual help.

24.09.2025 22:24 — 👍 0    🔁 0    💬 1    📌 0

We have positioned our group in this niche: we want to answer biological questions using ML-supercharged versions of the methods that scientists have been using for decades to derive insight.

More in this theme to come!

24.09.2025 22:24 — 👍 0    🔁 0    💬 1    📌 0

Stepping back, I think that transformers and their ilk have so much to offer fields like molecular evolution. Now we can parameterize statistical models using a sequence as an input!

24.09.2025 22:24 — 👍 0    🔁 0    💬 1    📌 0
netam/notebooks/dnsm_demo.ipynb at main · matsengrp/netam Neural networks to model BCR affinity maturation. Contribute to matsengrp/netam development by creating an account on GitHub.

If you want to give it a try, we have made it available using a simple `pretrained` interface. Here is a demo notebook. github.com/matsengrp/n...

24.09.2025 22:24 — 👍 0    🔁 0    💬 1    📌 0
Post image

And because natural selection is predicted for individual sequences, we can also investigate changes in selection strength as a sequence evolves down a tree:

24.09.2025 22:24 — 👍 0    🔁 0    💬 1    📌 0
Post image

Because this model isn't constrained to work with a fixed-width multiple sequence alignment, we can do things like look at per-site selection factors on sequences with varying CDR3 length:

24.09.2025 22:24 — 👍 1    🔁 0    💬 1    📌 0

If a selection factor at a given site for a given sequence is

• > 1 that is diversifying selection
• = 1 that is neutral evolution
• < 1 that is purifying selection.

24.09.2025 22:24 — 👍 0    🔁 0    💬 1    📌 0
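
The three cases in the post above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_selection` and the tolerance `tol` are hypothetical and are not part of the netam package.

```python
import math


def classify_selection(factor: float, tol: float = 1e-9) -> str:
    """Label a per-site selection factor using the thresholds in the post.

    `tol` is an assumed tolerance for treating a floating-point value as
    equal to 1; the post itself only states the exact case "= 1".
    """
    if factor < 0:
        raise ValueError("a selection factor cannot be negative")
    if math.isclose(factor, 1.0, rel_tol=0.0, abs_tol=tol):
        return "neutral"
    return "diversifying" if factor > 1.0 else "purifying"


# Hypothetical selection factors for a short stretch of sites.
for site, factor in enumerate([0.2, 1.0, 3.5]):
    print(site, factor, classify_selection(factor))
```

In practice a model's output will almost never be exactly 1, so any real analysis would need its own convention for what counts as "close to neutral".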

The model is above. In many ways it is like a classical model of mutation and selection, but the mutation model is a convolutional model and the selection model is a transformer-encoder mapping from AA sequences to a vector of selection factors of the same length as the sequence.

24.09.2025 22:24 — 👍 0    🔁 0    💬 1    📌 0
Post image

The final version of our transformer-based model of natural selection has come out in MBE. I hope some molecular evolution researchers find this interesting & useful as a way to express richer models of natural selection. doi.org/10.1093/mol... (short 🧵)

24.09.2025 22:24 — 👍 31    🔁 8    💬 1    📌 0
AI Engineer - Evolutionary Protein Language Models Primary Work Address: 19700 Helix Drive, Ashburn, VA, 20147 Current HHMI Employees, click here to apply via your Workday account. Intro: AI@HHMI: HHMI is investing $500 million over the next 10 years ...

We are looking for an #AIEngineer to help build protein language models that capture evolutionary constraints with @matsen.bsky.social and @jbloomlab.bsky.social at #AI@HHMI @hhmijanelia.bsky.social
hhmi.wd1.myworkdayjobs.com/en-US/Extern...

18.09.2025 16:50 — 👍 9    🔁 5    💬 0    📌 0

Hats off to first author Kevin Sung www.linkedin.com/in/kevinsun... and the rest of the team 🙏 !

18.09.2025 22:46 — 👍 0    🔁 0    💬 0    📌 0
Peer review in Thrifty wide-context models of B cell receptor somatic hypermutation Convolutional embedding models efficiently capture wide sequence context in antibody somatic hypermutation, avoiding exponential k-mer parameter scaling and eliminating the need for per-site modeling.

I was very proud to get "The authors are to be commended for their efforts to communicate with the developers of previous models and use the strongest possible versions of those in their current evaluation" in peer reviews:

elifesciences.org/articles/10...

18.09.2025 22:46 — 👍 0    🔁 0    💬 1    📌 0
GitHub - matsengrp/thrifty-experiments-1 Contribute to matsengrp/thrifty-experiments-1 development by creating an account on GitHub.

Pretrained models are available at github.com/matsengrp/n..., and the computational experiments are at github.com/matsengrp/t....

18.09.2025 22:46 — 👍 0    🔁 0    💬 1    📌 0

It's possible that the more complex models fail to dominate more clearly because of a lack of suitable training data, namely neutrally evolving out-of-frame sequences. We tried to augment the training data, with no luck.

18.09.2025 22:46 — 👍 1    🔁 0    💬 1    📌 0

The resulting models are better than 5-mer models, but only modestly so. We made many efforts to include a per-site rate but concluded that its effects were weak enough that including it did not improve model performance.

18.09.2025 22:46 — 👍 0    🔁 0    💬 1    📌 0
Post image

Solution: first embed 3-mers and then the number of parameters goes up only linearly with the context width.

18.09.2025 22:46 — 👍 0    🔁 0    💬 1    📌 0
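
To make the scaling argument concrete, here is a rough parameter count in Python. It is a sketch under stated assumptions, not the architecture from the paper: it assumes one parameter per k-mer for the classical model, and a single 1-D convolution over embedded 3-mers for the alternative. The values `embedding_dim=7` and `filters=16` are made up for illustration.

```python
def kmer_param_count(k: int) -> int:
    """One parameter per nucleotide k-mer: grows as 4**k."""
    return 4 ** k


def embedded_conv_param_count(
    kernel_size: int, embedding_dim: int = 7, filters: int = 16
) -> int:
    """3-mer embedding table plus one 1-D convolution (weights and biases).

    embedding_dim and filters are hypothetical values chosen for the example.
    """
    embedding = 4 ** 3 * embedding_dim  # 64 possible 3-mers
    conv = kernel_size * embedding_dim * filters + filters
    return embedding + conv


# A kernel of size K over 3-mer positions sees K + 2 nucleotides of context.
for kernel_size in (3, 7, 11):
    context = kernel_size + 2
    print(
        context,
        kmer_param_count(context),
        embedded_conv_param_count(kernel_size),
    )
```

Each extra position of context adds a fixed `embedding_dim * filters` weights to the convolution, whereas each extra base multiplies the k-mer table by four.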
Thrifty wide-context models of B cell receptor somatic hypermutation Convolutional embedding models efficiently capture wide sequence context in antibody somatic hypermutation, avoiding exponential k-mer parameter scaling and eliminating the need for per-site modeling.

The final version of our "Thrifty" paper is up now: elifesciences.org/articles/10... .

We were motivated to fit wide-context mutation models based on previous analyses showing "mesoscale" effects and a position-specific effect. But how do we avoid exploding the number of parameters? 🧵

18.09.2025 22:46 — 👍 7    🔁 2    💬 1    📌 0
GitHub - matsengrp/preflight Contribute to matsengrp/preflight development by creating an account on GitHub.

Is an idea likely to advance the field?

Our "preflight check" exercise provides a structured approach for thinking through computational biology research projects.

github.com/matsengrp/pr...

Thanks to @sdwfrost.bsky.social for the core idea!

16.09.2025 14:06 — 👍 5    🔁 0    💬 0    📌 0
Dear future trainee: Let's have fun, work hard, and feel lucky that our job is to expand the boundary of knowledge.

Interested in doing a PhD or postdoc in our group? Here is a letter to you: matsen.group/general/202...

We are ready to recruit a trainee who can help develop the next generation of our transformer-based models of natural selection. See the "joining" tab of our website for details.

21.08.2025 09:47 — 👍 5    🔁 3    💬 1    📌 1

Open bioinformatics position on next-generation protein evolution models! Join HHMI's AI initiative at Janelia Farm, Virginia (an amazing place) and work closely with our team. Help us build the future! 🧬 + 🤖 = ❤️
hhmi.wd1.myworkdayjobs.com/en-US/Exter...

20.08.2025 11:17 — 👍 1    🔁 0    💬 0    📌 0
The term 'affinity maturation' understates the influence of somatic hypermutation Three recent papers quantify how nucleotide-level mutation processes drive antibody evolution.

Why does selection feel so weak relative to mutation in affinity maturation? A new blog post giving three perspectives, including our new transformer-based model of natural selection on antibodies: matsen.group/general/202...

19.08.2025 17:17 — 👍 22    🔁 9    💬 0    📌 0
Inference of germinal center evolutionary dynamics via simulation-based deep learning B cells and the antibodies they produce are vital to health and survival, motivating research on the details of the mutational and evolutionary processes in the germinal centers (GC) from which mature...

In a new preprint we use deep learning on lineage trees to infer the functional form of the relationship between affinity and fitness that controls antibody evolution in germinal centers: arxiv.org/abs/2508.09871 🧵

16.08.2025 22:55 — 👍 15    🔁 9    💬 1    📌 0
Post image

Here are some useful subagents we've developed for Claude Code.
github.com/matsengrp/c...
(description in README if you don't know what I'm talking about)

Example uses:

14.08.2025 13:40 — 👍 1    🔁 0    💬 0    📌 0
GitHub - matsengrp/pdf-navigator-mcp: Comprehensive MCP server for PDF reading, navigation, and text search

Motivated by wanting Claude Code to read papers, and by needing something to fill PDF forms for kids' summer camps, I vibe-coded github.com/matsengrp/p...

Perhaps you will find it useful!

03.08.2025 14:17 — 👍 2    🔁 0    💬 0    📌 0

Go Maggie! @magdalenarussell.bsky.social gets UW's Distinguished Dissertation Award for her PhD work with @matsen.bsky.social "Inferring mechanisms of V(D)J recombination using statistical inference on high-throughput immune repertoire data". 🏆

03.07.2025 22:19 — 👍 11    🔁 1    💬 0    📌 0

Excited to share my new preprint developed with @matsen.bsky.social, in collaboration with Marius Brusselmans, Luiz Carvalho, @msuchard.bsky.social, and @guybaele.bsky.social, on the biological causes and impacts of tree space ruggedness in phylodynamic inference. 1/
www.biorxiv.org/content/10.1...

17.06.2025 01:25 — 👍 7    🔁 7    💬 1    📌 0
Replaying germinal center evolution on a quantified affinity landscape Darwinian evolution of immunoglobulin genes within germinal centers (GC) underlies the progressive increase in antibody affinity following antigen exposure. Whereas the mechanics of how competition be...

Wanted to highlight our latest preprint--a huge effort by multiple people and labs, but led primarily by @wsdewitt.github.io, Tatsuya Araki, and Ashni Vora, in a very close wet-dry collaboration with @matsen.bsky.social’s lab at the Hutch

www.biorxiv.org/content/10.1...

05.06.2025 14:28 — 👍 66    🔁 28    💬 1    📌 5
