Elana Simon

@elanasimon.bsky.social

384 Followers  |  10 Following  |  9 Posts  |  Joined: 15.11.2024

Latest posts by elanasimon.bsky.social on Bluesky

InterPLM: Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders
Protein language models (PLMs) have demonstrated remarkable success in protein modeling and design, yet their internal mechanisms for predicting structure and function remain poorly understood. Here w...

For more information, check the preprint! (9/9)
www.biorxiv.org/content/10.1101/2024.11.14.623630v1

19.11.2024 19:34  |  👍 0  🔁 0  💬 1  📌 0

GitHub - ElanaPearl/InterPLM: Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders

๐Ÿ› ๏ธ Want to analyze your own protein models? (8/9)
- Code: github.com/ElanaPearl/interPLM
- Full framework for PLM interpretation
- Methods for training, analysis, and visualization

19.11.2024 19:34  |  👍 1  🔁 0  💬 1  📌 0

✨ Explore the features yourself! (7/9)
- Interactive visualization: interplm.ai
- Explore features from every layer of ESM-2-8M
- See how proteins activate different features
- Examine structural patterns

19.11.2024 19:34  |  👍 0  🔁 0  💬 1  📌 0

🧪 We can also steer model predictions by adjusting feature activations, demonstrating how understanding these representations could help guide protein design (6/9)

19.11.2024 19:34  |  👍 0  🔁 0  💬 1  📌 0
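
The post above does not say how the activations are adjusted. One common way to steer with a sparse autoencoder is to shift the embedding along a single feature's decoder direction. The sketch below assumes that approach; `steer`, `w_dec`, and the toy numbers are hypothetical and are not taken from the InterPLM code.

```python
def steer(x, features, w_dec, feature_idx, new_value):
    """Shift embedding x so that one SAE feature takes the value new_value.

    Assumption: w_dec has shape (d_model, d_dict), so column `feature_idx`
    is the direction that feature writes into the embedding.
    Only the change in that feature is added, so everything else in x
    (including whatever the SAE fails to reconstruct) is left untouched.
    """
    delta = new_value - features[feature_idx]
    return [xi + delta * row[feature_idx] for xi, row in zip(x, w_dec)]


# Toy example: a 2-dimensional embedding and a 3-feature dictionary.
x = [1.0, 2.0]
features = [0.5, 0.0, 0.0]          # current SAE activations for x
w_dec = [[1.0, 0.0, 2.0],
         [0.0, 1.0, -1.0]]          # hypothetical decoder weights

steered = steer(x, features, w_dec, feature_idx=2, new_value=3.0)
unchanged = steer(x, features, w_dec, feature_idx=0, new_value=0.5)
```

In a real setting the steered embedding would be passed through the remaining layers of the model to see how its predictions change.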

🎯 Beyond understanding PLMs, these features have practical applications (5/9):
- Finding missing annotations in protein databases
- Identifying potentially new biological motifs
- Suggesting locations of binding sites and functional regions

19.11.2024 19:34  |  👍 0  🔁 0  💬 1  📌 0

🤖 We showed LLMs can generate meaningful descriptions of many features - and these descriptions can be validated by successfully predicting which proteins would activate each feature! (4/9)

19.11.2024 19:34  |  👍 0  🔁 0  💬 1  📌 0

📊 We identified up to 2,548 interpretable features per layer that match known biological concept annotations - compared to just 46 from individual neurons.

This suggests PLMs store biological information in superposition - multiple concepts sharing the same neurons! (3/9)

19.11.2024 19:34  |  👍 1  🔁 0  💬 1  📌 0
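
For readers unfamiliar with sparse autoencoders (SAEs), the idea behind the comparison above can be sketched as follows: an embedding is expanded into a much larger dictionary of non-negative features, and a sparsity penalty pushes each feature to fire for only a few inputs. This is the generic textbook form with hypothetical sizes and random weights, not the architecture or hyperparameters used in InterPLM.

```python
import random


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Return (features, reconstruction) for one embedding vector x."""
    pre = [h + b for h, b in zip(matvec(w_enc, x), b_enc)]
    features = [max(0.0, p) for p in pre]            # ReLU keeps features >= 0
    recon = [r + b for r, b in zip(matvec(w_dec, features), b_dec)]
    return features, recon


def sae_loss(x, features, recon, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparsity."""
    mse = sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)
    return mse + l1_coeff * sum(abs(f) for f in features)


random.seed(0)
d_model, d_dict = 4, 16   # hypothetical: dictionary 4x wider than the embedding
w_enc = [[random.gauss(0, 0.5) for _ in range(d_model)] for _ in range(d_dict)]
w_dec = [[random.gauss(0, 0.5) for _ in range(d_dict)] for _ in range(d_model)]
b_enc, b_dec = [0.0] * d_dict, [0.0] * d_model

x = [random.gauss(0, 1) for _ in range(d_model)]     # stand-in for a PLM embedding
features, recon = sae_forward(x, w_enc, b_enc, w_dec, b_dec)
loss = sae_loss(x, features, recon)
```

Because the dictionary is wider than the embedding, concepts that share neurons in the original model can each get their own feature, which is one way to read the gap between 2,548 features and 46 neurons.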

๐Ÿ” Using InterPLM, we identified features in ESM-2 that detect various biological properties, from local motifs to complex structural patterns (2/9)
- Catalytic sites
- Zinc fingers
- Targeting sequences
- Post-translational modifications
- Structural elements and many more!

19.11.2024 19:34  |  👍 1  🔁 0  💬 1  📌 0

🧬 What are protein language models (PLMs) actually learning about biology? Our paper introduces InterPLM - a framework that reveals interpretable features in PLMs using sparse autoencoders, giving us a window into how these models represent protein structure and function.
🧵 (1/9)

19.11.2024 19:34  |  👍 12  🔁 2  💬 1  📌 1

www.biorxiv.org/content/10.1...

InterPLM: Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders

Code: github.com/ElanaPearl/I...

Interactive site: interplm.ai

Nice work by Elana Simon from the James Zou lab

19.11.2024 02:39  |  👍 41  🔁 6  💬 2  📌 0

Overview of SAE methodology and representative SAE features revealed through automated activation pattern analysis

Using mechanistic interpretability to steer generations

SAE feature analysis and visualizations reveal features with diverse and consistent activation patterns

Mechanistic interpretability on a protein language model

www.biorxiv.org/content/10.1...

18.11.2024 22:17  |  👍 48  🔁 15  💬 1  📌 0