Laura Kopf

@lkopf.bsky.social

PhD student in Interpretable Machine Learning at TU Berlin & BIFOLD

305 Followers  |  367 Following  |  14 Posts  |  Joined: 06.12.2024

Latest posts by lkopf.bsky.social on Bluesky

Grateful to the institutions that supported this work:
@tuberlin.bsky.social
@bifold.berlin
UMI Lab
@fraunhoferhhi.bsky.social
@unipotsdam.bsky.social
@leibnizatb.bsky.social

(7/7)

19.06.2025 15:18 | 👍 1    🔁 0    💬 0    📌 0

Many thanks to my amazing co-authors:
@nfel.bsky.social
@kirillbykov.bsky.social
@philinelb.bsky.social
Anna Hedström
Marina M.-C. Höhne
@eberleoliver.bsky.social

(6/7)

19.06.2025 15:18 | 👍 3    🔁 0    💬 1    📌 0

Our results highlight that the PRISM framework not only provides multiple human-interpretable descriptions for neurons but also aligns with the human interpretation of polysemanticity. (5/7)

19.06.2025 15:18 | 👍 1    🔁 0    💬 1    📌 0

In exploring the concept space, we use PRISM to characterize more complex components, finding and interpreting patterns that specific attention heads or groups of neurons respond to. (4/7)

19.06.2025 15:18 | 👍 1    🔁 0    💬 1    📌 0

We benchmark PRISM across layers and architectures, showing how polysemanticity and interpretability shift through the model. (3/7)

19.06.2025 15:18 | 👍 1    🔁 0    💬 1    📌 0

PRISM samples sentences from the top percentile activation distribution, clusters them in embedding space, and uses an LLM to generate labels for each concept cluster. (2/7)

19.06.2025 15:18 | 👍 1    🔁 0    💬 1    📌 0
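
The three steps described in (2/7) can be illustrated with a rough sketch. This is not the PRISM implementation: the nearest-rank percentile cutoff, the tiny k-means with first-k initialisation, and the `label_fn` callback standing in for the LLM call are all assumptions made for illustration, as are the function and parameter names.

```python
import math
from typing import Callable, List, Sequence, Tuple


def top_percentile_indices(activations: Sequence[float], percentile: float = 99.0) -> List[int]:
    """Indices of samples at or above the given activation percentile (nearest-rank)."""
    if not activations:
        return []
    ranked = sorted(activations)
    rank = max(1, math.ceil(percentile / 100.0 * len(ranked)))
    threshold = ranked[rank - 1]
    return [i for i, a in enumerate(activations) if a >= threshold]


def kmeans(points: Sequence[Sequence[float]], k: int, iters: int = 20) -> List[int]:
    """Very small k-means; centroids start at the first k points (assumed, deterministic)."""
    k = min(k, len(points))
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid (squared Euclidean distance).
        assign = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def describe_feature(
    sentences: Sequence[str],
    activations: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    k: int,
    label_fn: Callable[[List[str]], str],  # hypothetical stand-in for an LLM labelling call
    percentile: float = 99.0,
) -> List[Tuple[str, List[str]]]:
    """Return one (label, sentences) pair per concept cluster of highly activating sentences."""
    top = top_percentile_indices(activations, percentile)
    assign = kmeans([embeddings[i] for i in top], k)
    clusters: dict = {}
    for idx, c in zip(top, assign):
        clusters.setdefault(c, []).append(sentences[idx])
    return [(label_fn(members), members) for _, members in sorted(clusters.items())]
```

In practice `label_fn` would wrap a prompt to a language model; here any function that maps a list of sentences to a string will do, which keeps the sketch runnable offline.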

🔍 When do neurons encode multiple concepts?

We introduce PRISM, a framework for extracting multi-concept feature descriptions to better understand polysemanticity.

📄 Capturing Polysemanticity with PRISM: A Multi-Concept Feature Description Framework
arxiv.org/abs/2506.15538

🧵 (1/7)

19.06.2025 15:18 | 👍 34    🔁 12    💬 1    📌 3

Huge thanks to my incredible supervisor @kirillbykov.bsky.social, who laid the foundation for this project and provided brilliant guidance 🙏, and to @philinelb.bsky.social and Sebastian Lapuschkin, who unfortunately couldn’t be there.

13.12.2024 02:48 | 👍 5    🔁 0    💬 0    📌 0

Still overwhelmed by the amazing response to our poster session at @neuripsconf.bsky.social with Anna Hedström and Marina Höhne! It was incredible to have such lively and inspiring discussions with brilliant people whose work I admire. ✨

13.12.2024 02:48 | 👍 10    🔁 2    💬 1    📌 0

Thanks for putting together this amazing list, Margaret! I would love to be added if you still have space :)

12.12.2024 08:24 | 👍 2    🔁 0    💬 1    📌 0

Want to know more about CoSy?
📄 Paper: arxiv.org/abs/2405.20331
💻 Code: github.com/lkopf/cosy
🔗 Poster: neurips.cc/virtual/2024...

#NeurIPS2024 #MechInterp #ExplainableAI #Interpretability

11.12.2024 06:43 | 👍 0    🔁 0    💬 0    📌 0

Special thanks to our supporting institutions: UMI Lab, @xtraexer.bsky.social, @tuberlin.bsky.social, Uni Potsdam, ATB Potsdam, and Fraunhofer Heinrich-Hertz-Institut.

11.12.2024 06:43 | 👍 0    🔁 0    💬 1    📌 0

My co-authors Anna Hedström and Marina Höhne will also be at @neuripsconf.bsky.social. A big thank you to my other co-authors @kirillbykov.bsky.social, @philinelb.bsky.social and Sebastian Lapuschkin, who unfortunately couldn’t be there.

11.12.2024 06:43 | 👍 0    🔁 0    💬 1    📌 0

I’ll be presenting our work at @neuripsconf.bsky.social in Vancouver! 🎉
Join me this Thursday, December 12th, in East Exhibit Hall A-C, Poster #3107, from 11 a.m. PST to 2 p.m. PST. I’ll be discussing our paper “CoSy: Evaluating Textual Explanations of Neurons.”

11.12.2024 06:43 | 👍 10    🔁 1    💬 1    📌 0
