It was suggested that the audience may not appreciate/understand :(
03.10.2025 06:22
@jmschreiber91.bsky.social
Studying genomics, machine learning, and fruit. My code is like our genomes -- most of it is junk. Assistant Professor UMass Chan, Board of Directors NumFOCUS. Previously IMP Vienna, Stanford Genetics, UW CSE.
is it a good idea to wear a "join, or die!" hat to a big talk in europe? please say yes
30.09.2025 18:15
the greatest productivity hack is having a grant deadline. there's so much other stuff you can do when you're supposed to be working on a grant.
22.09.2025 14:05
I was delighted to have the unexpected opportunity to give a keynote at MLCB 2025 in NYC last week. I used it to explain how I view deep learning models in genomics not as "uninterpretable black boxes" but as indispensable tools for understanding genomics + designing the next gen of synthetic DNA.
19.09.2025 13:59
for some reason i thought being a professor would involve more mentoring and research and less filling out disclosures concerning whether plants and seeds were used in my computational study
09.09.2025 19:24
stocking up the new apartment with essentials
03.09.2025 14:24
In the genomics community, we have focused pretty heavily on achieving state-of-the-art predictive performance.
While undoubtedly important, how we *use* these models after training is potentially even more important.
tangermeme v1.0.0 is out now. Hope you find it useful!
For some reason, hitting "comment" on GitHub is significantly more responsive than a month ago and it freaks me out. Surely there are some important calculations that need to be done before letting my thoughts into the wild?
27.08.2025 21:43
Thanks! Let me know if you want me to stop in virtually, we can try to figure out a time.
27.08.2025 17:15
Hope you find tangermeme helpful in your work! Please reach out if you have any comments + questions.
27.08.2025 16:44
Because everything is automatic, we can probe models.
What motifs are driving model predictions? Calculate attributions, call + annotate seqlets, and count the annotations!
BPNet is relying on MYC, whereas Beluga is relying on many more TFs. Easy comparison now.
Frequently, people manually annotate seqlets and draw bars or boxes around these high-attribution characters themselves. This is not really a problem, but it's just slow and does not scale genome-wide.
In the above picture, everything is automatically done.
People *talk* about seqlets a lot, but tangermeme is the first package to offer complete functionality for working with them.
Here is a complete example of using tangermeme for attributions, seqlet calling + annotation, and plotting, to visualize what five models think of the same locus
Expanding past these implementations, tangermeme has a large focus on automatic seqlet calling and usage. Seqlets are short contiguous spans of high-attribution characters that usually correspond to the binding of a TF.
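To make the seqlet definition above concrete, here is a minimal sketch of a naive threshold-based caller in NumPy. It is not tangermeme's seqlet-calling algorithm, which this thread does not describe; the function name `call_seqlets_naive`, the threshold, and the minimum length are all hypothetical choices for illustration. It assumes attributions have already been reduced to one score per position.

```python
import numpy as np

def call_seqlets_naive(attr, threshold=0.5, min_len=3):
    """Return half-open (start, end) spans where |attr| >= threshold
    for at least `min_len` consecutive positions.

    Hypothetical illustration only; not tangermeme's implementation.
    """
    attr = np.asarray(attr, dtype=float)
    mask = np.abs(attr) >= threshold

    # Pad with False on both sides so every run has a visible start and end.
    padded = np.concatenate(([False], mask, [False]))
    diff = np.diff(padded.astype(int))

    starts = np.flatnonzero(diff == 1)   # first position of each run
    ends = np.flatnonzero(diff == -1)    # one past the last position of each run

    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_len]


# Toy per-position attribution track with two runs long enough to be called.
toy_attr = [0, 0, 0.9, 0.8, 0.7, 0, 0.6, 0, 1, 1, 1, 1]
seqlets = call_seqlets_naive(toy_attr)
```

On the toy track, the single high-scoring position at index 6 is dropped because it is shorter than `min_len`.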
27.08.2025 16:39
By considering attributions you can see how variants disrupt or change usage of motifs. Maybe you'll even find that a variant causes alternative binding by inducing a new motif or slightly changing competition! That would be challenging to see from the predictions alone.
27.08.2025 16:36
Past simply re-implementing algorithms people use (in a convenient repo), tangermeme offers flexibility not usually offered in other implementations.
As an example, instead of calculating variant effect as predictions before/after a substitution, why not look at attributions?
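A rough sketch of the contrast, under loud assumptions: the "model" here is a toy motif scanner for a hypothetical CACGTG motif, and attributions are computed with simple in-silico mutagenesis rather than DeepLIFT/SHAP. None of these names are tangermeme's API.

```python
import numpy as np

ALPHABET = "ACGT"

def one_hot(seq):
    """One-hot encode a DNA string into a (4, length) array."""
    idx = np.array([ALPHABET.index(c) for c in seq])
    X = np.zeros((4, len(seq)))
    X[idx, np.arange(len(seq))] = 1.0
    return X

# Toy stand-in for a trained model: best match to one hypothetical motif.
MOTIF = one_hot("CACGTG")

def predict(X):
    k = MOTIF.shape[1]
    scores = [np.sum(MOTIF * X[:, i:i + k]) for i in range(X.shape[1] - k + 1)]
    return float(max(scores))

def ism_attributions(X):
    """Per-position score: prediction minus the mean prediction
    over the three alternative bases at that position."""
    base = predict(X)
    attr = np.zeros(X.shape[1])
    for pos in range(X.shape[1]):
        alt_preds = []
        for b in range(4):
            if X[b, pos] == 1:
                continue  # skip the base already present
            X_mut = X.copy()
            X_mut[:, pos] = 0
            X_mut[b, pos] = 1
            alt_preds.append(predict(X_mut))
        attr[pos] = base - np.mean(alt_preds)
    return attr

ref = "TTTCACGTGTTT"
alt = "TTTCACATGTTT"  # G>A at 0-based position 6, inside the motif

X_ref, X_alt = one_hot(ref), one_hot(alt)

# Usual variant effect: a single number.
pred_delta = predict(X_alt) - predict(X_ref)

# Attribution-based view: a per-position track of what changed.
attr_delta = ism_attributions(X_alt) - ism_attributions(X_ref)
```

The prediction difference collapses the variant into one scalar, while the attribution difference keeps a per-position track, which is where motif disruption or creation would become visible with a real model.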
This care extends to each of our operations. For example, one-hot encoding the entirety of chr1 takes <2s on a single thread. This is significantly faster than other one-hot encoding methods out there, and is fast enough to enable real-time batch generation from FASTAs.
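For readers unfamiliar with the operation, a minimal sketch of vectorized one-hot encoding in NumPy follows. It is not tangermeme's implementation and makes no claim to match the speed quoted above; the function name `one_hot_encode` and the choice to encode unknown characters such as N as all-zero columns are assumptions for illustration.

```python
import numpy as np

def one_hot_encode(seq, alphabet="ACGT"):
    """One-hot encode a DNA string into a (len(alphabet), len(seq)) int8 array.

    Characters outside the alphabet (e.g. N) become all-zero columns.
    Hypothetical illustration only; not tangermeme's implementation.
    """
    # Lookup table from byte value to alphabet index, -1 if not in the alphabet.
    lookup = np.full(256, -1, dtype=np.int64)
    for i, char in enumerate(alphabet):
        lookup[ord(char)] = i

    # Convert the whole string to indices in one vectorized step.
    codes = lookup[np.frombuffer(seq.upper().encode("ascii"), dtype=np.uint8)]

    ohe = np.zeros((len(alphabet), len(seq)), dtype=np.int8)
    valid = np.flatnonzero(codes >= 0)
    ohe[codes[valid], valid] = 1
    return ohe


encoded = one_hot_encode("ACGTN")
```

Avoiding a Python-level loop over characters is what makes this kind of approach viable for chromosome-length strings.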
27.08.2025 16:33
Here is a (twitter) thread on the issue:
x.com/jmschreiber9...
By focusing in this manner, we can "delve" deeply into these downstream algorithms. For instance, we found a bug in many DeepLIFT/SHAP implementations that will cause them to silently fail when you don't register your operations. Didn't know you needed to do that? Same!
27.08.2025 16:29
This design choice is intentional. Because model components and training strategies are hugely variable and evolving quickly, I did not even want to touch those aspects.
You define and train your model however you want, and then use tangermeme to do genomic discovery with it.
tangermeme is a toolkit that implements "everything-but-the-model" for genomic machine learning.
This includes sequence manipulations, batched predictions, attributions, ablations, marginalizations, variant effect prediction, design, etc...
Preprint: biorxiv.org/content/10.1...
Installation: `pip install tangermeme`
It's about both. Conceptually, they are intertwined.
26.08.2025 18:38
An excellent post about the receptive range of convolution models.
"You might reasonably ask: "If I have 100 layers with W=1000, that's a theoretical receptive field of 100,000 tokens. Doesn't that matter?"
The answer is no, and here's why:"
guangxuanx.com/blog/stackin...
Multiple, in fact
25.08.2025 19:53
First week as faculty was successful: one grant submitted, one paper submitted, and revision requests back on one paper.
If we extrapolate, by the time I'm up for tenure I'll have 260 grants submitted (none awarded) and 260 papers submitted/reviewed (none accepted).
lgtm
Almost a year ago I submitted an NIH grant and federal funding collapsed. Continuing on that success, I'm proud to announce that I've just submitted a local grant...
20.08.2025 19:52
Very wide-ranging. I am impressed with it!
20.08.2025 12:46
now i get to be happy, right?
20.08.2025 12:25