Peter Koo

@pkoo562.bsky.social

AI4Science researcher. Associate Professor @CSHL. My lab advances AI for genomics and healthcare! http://koo-lab.github.io

3,171 Followers  |  1,272 Following  |  129 Posts  |  Joined: 04.12.2023

Posts by Peter Koo (@pkoo562.bsky.social)

90th Symposium: AI in Biology Cold Spring Harbor Laboratory Meetings & Courses -- a private, non-profit institution with research programs in cancer, neuroscience, plant biology, genomics, bioinformatics.

Save the date: The premier AI x Bio meeting of 2026 will be held at CSHL on May 26-31!

The program brings together 50+ invited leaders in genomics, transcriptomics, protein design, drug discovery, neuro-AI, pathology, agentic AI, and more!

Abstract due: March 26

meetings.cshl.edu/meetings.asp...

04.01.2026 00:03 | 👍 10 | 🔁 0 | 💬 0 | 📌 0
Decoding the regulatory genome with large-scale deep learning. Nature Reviews Genetics, published online 03 November 2025; doi:10.1038/s41576-025-00914-2. In this Journal Club, Peter Koo reflects on the 2021 publication of Enformer and its impact on the use of deep learning for modelling the regulatory genome.

New online! Decoding the regulatory genome with large-scale deep learning

03.11.2025 13:07 | 👍 6 | 🔁 2 | 💬 0 | 📌 0

Beware of LLM blindspots. #AI4Science

08.11.2025 21:24 | 👍 5 | 🔁 1 | 💬 0 | 📌 0

Yijie Kang (CSHL, Stony Brook) from @pkoo562.bsky.social Lab presented "Decoding the sequence basis of Pol II elongation with deep learning"

07.11.2025 15:05 | 👍 2 | 🔁 1 | 💬 1 | 📌 0

Exciting symposium on AI and Biology at EMBO | EMBL in Heidelberg on 10-13 March 2026!

Excellent lineup of invited speakers across various scales of biology!

Deadline for abstract submission is coming up: Dec 2.

🔗 www.embl.org/about/info/c...

#EESAIBio @EMBLEvents

07.11.2025 00:16 | 👍 9 | 🔁 7 | 💬 0 | 📌 1
Interpreting cis-regulatory mechanisms from genomic deep neural networks using surrogate models - Nature Machine Intelligence The intersection of genomics and deep learning shows promise for real impact on healthcare and biological research, but the lack of interpretability in terms of biological mechanisms is limiting utility and further development. As a potential solution, Koo et al. present SQUID, an interpretability framework built using domain-specific genomic surrogate models.

This was led by @EESeitz, a former postdoc who was jointly advised by me and Justin Kinney at CSHL, in collaboration with David McCandlish (@TheDMMcC).

It's a beautiful followup to SQUID, our surrogate modeling approach to interpret genomic DNNs!

www.nature.com/articles/s42...

09.10.2025 12:08 | 👍 0 | 🔁 0 | 💬 0 | 📌 0
Uncovering the Mechanistic Landscape of Regulatory DNA with Deep Learning The regulatory genome encodes the logic that governs gene expression, enabling cells to respond to developmental, environmental, and evolutionary cues. This logic arises from complex cis-regulatory me...

Check out the paper to find out more!

Code: github.com/evanseitz/se...
Paper: www.biorxiv.org/content/10.1...

09.10.2025 12:08 | 👍 3 | 🔁 0 | 💬 1 | 📌 0

We find regulatory DNA is readily reprogrammable with a few key mutations! We observed a similar phenomenon across all genomic DNNs we tested! 12/N

09.10.2025 12:08 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

SEAM is a powerful tool that helps to: 1) explore the high-evolvability landscape of regulatory sequences; 2) identify mutations that drive mechanistic changes; and 3) dissect motif syntax and context dependencies. 11/N

09.10.2025 12:08 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

We also tested the backgrounds across different ChromBPNet models independently trained on DNase-seq and ATAC-seq, and observed similar backgrounds! This suggests these mutagenesis-robust patterns are important context that reflects properties of the local sequence space. 10/N

09.10.2025 12:08 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

While previous analyses focused on differences in attr maps across clusters, a surprising observation was that there were also shared patterns. We disentangled the attribution signals that are sensitive versus robust to mutagenesis; we call them foreground and background. 9/N

09.10.2025 12:08 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

This analysis flagged 2 key mutations at positions 170 & 174 that created a new CAAT box. To test necessity & sufficiency, we mutated each individually and together, then examined attr maps + predictions:
- single mutations -> no change
- double mutation -> CAAT box + new Inr

8/N

09.10.2025 12:08 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
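
The single-versus-double mutant test described in 8/N can be sketched generically. This is a toy illustration, not the SEAM or CLIPNET code: `toy_predict` is a hypothetical stand-in for a trained model, and the 8-nt sequence and 0-based positions 3 and 5 are invented for the example (the thread gives positions 170 and 174 but not the bases).

```python
from itertools import combinations

def apply_mutations(seq, muts):
    """Return seq with each (0-based position, base) substitution applied."""
    s = list(seq)
    for pos, base in muts:
        s[pos] = base
    return "".join(s)

def score_mutation_subsets(seq, muts, predict):
    """Score wild type plus every non-empty subset of the candidate mutations."""
    results = {(): predict(seq)}
    for r in range(1, len(muts) + 1):
        for subset in combinations(muts, r):
            results[subset] = predict(apply_mutations(seq, subset))
    return results

def toy_predict(seq):
    """Hypothetical model: activity is 1.0 only if a CAAT box is present."""
    return 1.0 if "CAAT" in seq else 0.0

wt = "GGCGAGGG"                      # invented wild-type sequence
candidates = [(3, "A"), (5, "T")]    # invented mutations; both are needed to form CAAT
for subset, score in score_mutation_subsets(wt, candidates, toy_predict).items():
    print(subset, score)
```

If neither single mutant changes the prediction but the double mutant does, the two mutations are jointly required for the new mechanism.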

Applying SEAM to CLIPNET, which predicts transcriptional activity measured via PRO-cap, we find that many SNVs lead to new clusters in the PIK3R3 promoter. A few specific mutations can quantitatively tune gene expression and SEAM can find them! 7/N

09.10.2025 12:08 | 👍 1 | 🔁 0 | 💬 1 | 📌 0

Now, if we plot the percent mismatch of the nucleotides with respect to WT for each cluster, you can see yellow bars that reflect all sequences in the cluster share the same single nucleotide mutation. This analysis pinpoints the exact mutation that led to the new mechanism! 6/N

09.10.2025 12:08 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
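
The percent-mismatch analysis in 6/N amounts to a per-position comparison against wild type. The sketch below is a minimal version under assumptions: the function names, the 100% threshold, and the toy sequences are illustrative and not taken from the SEAM code.

```python
import numpy as np

def percent_mismatch(seqs, wt_seq):
    """Percent of sequences that differ from wild type at each position."""
    arr = np.array([list(s) for s in seqs])   # shape (n_sequences, length)
    wt = np.array(list(wt_seq))               # shape (length,)
    return 100.0 * (arr != wt).mean(axis=0)

def driver_positions(seqs, wt_seq, threshold=100.0):
    """Positions where at least `threshold` percent of the cluster is mutated."""
    return np.flatnonzero(percent_mismatch(seqs, wt_seq) >= threshold)

# Toy cluster: every member is mutated at position 2 (0-based).
cluster = ["ACTT", "AGTT", "ACTA"]
print(percent_mismatch(cluster, "ACGT"))
print(driver_positions(cluster, "ACGT"))  # [2]
```

Note that a 100% mismatch only says every sequence is mutated at that position; checking that they all carry the same base would need one more comparison.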

Low entropy means all the sequences share the same nucleotides, while high entropy means different mutations destroyed the motif. Sometimes, we see a motif-preserving signature outside the vertical bands. This represents a de novo motif that appeared within that cluster. 5/N

09.10.2025 12:08 | 👍 1 | 🔁 0 | 💬 1 | 📌 0

If we calculate the positional entropy of the sequences within each cluster, we get a cluster summary matrix. The vertical bands highlight the locations of the motifs in WT seq and entropy levels indicate whether the motif is present or not in the attr maps in each cluster. 4/N

09.10.2025 12:08 | 👍 1 | 🔁 0 | 💬 1 | 📌 0
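
The cluster summary matrix from 4/N can be illustrated with plain Shannon entropy per position. This is a sketch under assumptions, not the SEAM implementation: the function names are hypothetical, entropy is measured in bits, and cluster labels are assumed to come from an earlier clustering step.

```python
import numpy as np

def positional_entropy(seqs, alphabet="ACGT"):
    """Shannon entropy (bits) at each position across equal-length sequences."""
    arr = np.array([list(s) for s in seqs])
    # Frequency of each base at each position, shape (len(alphabet), length).
    freqs = np.stack([(arr == base).mean(axis=0) for base in alphabet])
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(freqs > 0, freqs * np.log2(freqs), 0.0)
    return -terms.sum(axis=0)

def cluster_summary_matrix(seqs, labels):
    """One row per cluster (sorted by label), one column per position."""
    seqs = np.asarray(seqs)
    labels = np.asarray(labels)
    return np.stack([positional_entropy(seqs[labels == k])
                     for k in np.unique(labels)])

seqs = ["ACGT", "ACGT", "ACGA", "ACGC"]
labels = [0, 0, 1, 1]
print(cluster_summary_matrix(seqs, labels))
```

Entropy is 0 where every sequence in a cluster agrees and reaches 2 bits where all four bases are equally frequent.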

Attr maps can sometimes be easy to interpret, and sometimes they're complex. SEAM's clustered attr maps are cleaner (think SmoothGrad) and they decompose complex mechanisms via partial random mutagenesis, which occasionally disrupts key binding sites. 3/N

09.10.2025 12:08 | 👍 1 | 🔁 0 | 💬 1 | 📌 0
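
One way to see why clustered attr maps look cleaner is to average the maps within each cluster, which is loosely analogous to how SmoothGrad averages over noisy inputs. The sketch below assumes that simple per-cluster mean; the thread does not say exactly how SEAM aggregates maps, and the function name and array shapes are hypothetical.

```python
import numpy as np

def cluster_average_maps(attr_maps, labels):
    """Mean attribution map per cluster.

    attr_maps: array of shape (n_sequences, length, 4)
    labels:    array of shape (n_sequences,) with cluster ids
    Returns a dict {cluster_id: (length, 4) array}.
    """
    attr_maps = np.asarray(attr_maps, dtype=float)
    labels = np.asarray(labels)
    return {int(k): attr_maps[labels == k].mean(axis=0)
            for k in np.unique(labels)}

# Toy data: a shared 4-position signal buried in per-sequence noise.
rng = np.random.default_rng(0)
signal = np.zeros((20, 4))
signal[5:9, 0] = 1.0
maps = signal + rng.normal(0.0, 1.0, size=(200, 20, 4))
avg = cluster_average_maps(maps, np.zeros(200, dtype=int))[0]
print(np.abs(maps[0] - signal).mean(), np.abs(avg - signal).mean())
```

The averaged map sits much closer to the shared signal than any single noisy map does.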

SEAM is conceptually simple. Starting from a reference sequence:
1) sample in a local region of sequence space via partial random mutagenesis
2) calculate attr maps to unveil the mechanisms
3) cluster attr maps based on shared mechanisms
4) cluster-based sequence analysis

2/N

09.10.2025 12:08 | 👍 1 | 🔁 0 | 💬 1 | 📌 0
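
Step 1 of the list in 2/N, partial random mutagenesis, can be sketched in a few lines. This is an illustration only, not SEAM's actual implementation: the function name, the 10% default mutation rate, and the example reference sequence are all assumptions.

```python
import numpy as np

ALPHABET = np.array(list("ACGT"))  # sorted, so searchsorted maps bases to 0..3

def partial_random_mutagenesis(ref_seq, n_samples, mut_rate=0.1, seed=0):
    """Sample variants of ref_seq; each position mutates with probability mut_rate."""
    rng = np.random.default_rng(seed)
    ref_idx = np.searchsorted(ALPHABET, np.array(list(ref_seq)))
    length = len(ref_seq)
    mutate = rng.random((n_samples, length)) < mut_rate
    # Shifting by 1..3 (mod 4) guarantees a mutated base differs from the reference.
    shift = rng.integers(1, 4, size=(n_samples, length))
    idx = np.where(mutate, (ref_idx + shift) % 4, ref_idx)
    return ["".join(row) for row in ALPHABET[idx]]

library = partial_random_mutagenesis("ACGTACGTAC", n_samples=1000, mut_rate=0.1)
print(library[:3])
```

The resulting library samples a local neighborhood of the reference; steps 2 to 4 would then compute an attribution map per sequence with a trained model and cluster those maps.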

Which mutations rewire function of regulatory DNA?

Excited to share SEAM: Systematic Explanation of Attribution-based Mechanisms. SEAM is an explainable AI method that dissects cis-regulatory mechanisms learned by seq2fun genomic deep learning models.

Led by @EESeitz

1/N 🧵👇

09.10.2025 12:02 | 👍 29 | 🔁 10 | 💬 1 | 📌 0

Congratulations to John Clarke, Michel Devoret and John Martinis on receiving the 2025 Nobel Prize in Physics!
www.nobelprize.org/prizes/physi...

I have fond memories of my time in the Clarke lab, where I did my Honors Thesis on ultra low-field MRI w/ SQUIDs as an undergrad at UC Berkeley!

07.10.2025 14:16 | 👍 9 | 🔁 1 | 💬 0 | 📌 0

Check out the Research Highlight on our work at @naturemethods by Lin Tang!

www.nature.com/articles/s41...

19.09.2025 16:36 | 👍 7 | 🔁 0 | 💬 0 | 📌 0

Richard Bonneau giving the last keynote on navigating the complexity of drug discovery and their lab-in-the-loop for molecule design! #MLCB

11.09.2025 17:40 | 👍 3 | 🔁 0 | 💬 0 | 📌 0

First talk: a (surprise) keynote by Jacob Schreiber from UMass Medical, talking about fruit-themed AI tools for understanding and designing regulatory DNA

11.09.2025 13:44 | 👍 3 | 🔁 0 | 💬 0 | 📌 0

2025 MLCB day 2 is starting now!

Streaming live now!
m.youtube.com/watch?v=PxlXNb...

11.09.2025 13:42 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

Now Barbara Engelhardt giving a keynote on characterizing behaviors of modified T cells in live cell imaging data using machine learning!

10.09.2025 17:58 | 👍 4 | 🔁 0 | 💬 0 | 📌 0

Next talk by Courtney Shearer who is talking about genomic language models for zero shot promoter indel effects!

10.09.2025 15:15 | 👍 1 | 🔁 0 | 💬 1 | 📌 0

Next talk by Alan Murphy and Masayuki (Moon) Nagai (from my lab!), who are talking about how naive fine-tuning of genomic DNNs leads to catastrophic forgetting, and who propose *iterative causal refinement* to improve learned associations toward a causal understanding of cis-regulatory biology!

10.09.2025 14:53 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

Next talk by Johannes Linder at Calico. Talking about expanding genomic seq2fun DNNs with RBP binding and RNA processing data to consider post-transcriptional regulation.

10.09.2025 14:38 | 👍 1 | 🔁 0 | 💬 1 | 📌 0

Some technical delays but we are all set!

First talk by Alexis Battle! @alexisbattle.bsky.social

10.09.2025 13:52 | 👍 6 | 🔁 0 | 💬 1 | 📌 0
Machine Learning in Computational Biology 2025. YouTube video by Machine Learning in Computational Biology.

Here is the YouTube live link:

www.youtube.com/live/19I7xTh...

Starts at 9:30a!

10.09.2025 13:05 | 👍 4 | 🔁 0 | 💬 0 | 📌 0