Sarah Wiegreffe

@sarah-nlp.bsky.social

Research in NLP (mostly LM interpretability & explainability). Assistant prof at UMD CS + CLIP. Previously @ai2.bsky.social @uwnlp.bsky.social Views my own. sarahwie.github.io

1,234 Followers  |  205 Following  |  30 Posts  |  Joined: 18.11.2024  |  2.1519

Latest posts by sarah-nlp.bsky.social on Bluesky

If you're at #ICML2025, chat with me, @sarah-nlp.bsky.social, Atticus, and others at our poster 11am - 1:30pm at East #1205! We're establishing a 𝗠echanistic 𝗜nterpretability 𝗕enchmark.

We're planning to keep this a living benchmark; come by and share your ideas/hot takes!

17.07.2025 17:45 - 👍 13    🔁 3    💬 0    📌 0

I am also recruiting PhD students @univofmaryland.bsky.social for fall 2026 with interests in (causal/mechanistic) LM interpretability and its practical applications (steering, efficient adaptation, model editing, textual explanations for users, etc.).

16.07.2025 23:09 - 👍 2    🔁 0    💬 0    📌 0

I am at #ICML2025! 🇨🇦🏞️
Catch me:

1️⃣ Presenting this paper👇 tomorrow 11am-1:30pm at East #1205

2️⃣ At the Actionable Interpretability @actinterp.bsky.social workshop on Saturday in East Ballroom A (I'm an organizer!)

16.07.2025 23:09 - 👍 3    🔁 1    💬 1    📌 0

This week is #ICML in Vancouver, and a number of our researchers are participating. Here's the full list of Ai2's conference engagements; we look forward to connecting with fellow attendees. 👋

14.07.2025 19:30 - 👍 3    🔁 2    💬 0    📌 0

Thank you! Look forward to being colleagues.

26.06.2025 07:24 - 👍 0    🔁 0    💬 0    📌 0

Thank you!

26.06.2025 07:24 - 👍 0    🔁 0    💬 0    📌 0

Thank you!

26.06.2025 07:24 - 👍 0    🔁 0    💬 0    📌 0

Thanks :))

26.06.2025 07:24 - 👍 0    🔁 0    💬 0    📌 0

Thanks so much for all your support ☺️🥰

16.06.2025 22:11 - 👍 1    🔁 0    💬 0    📌 0

Thank you!

16.06.2025 04:11 - 👍 0    🔁 0    💬 0    📌 0

Thank you 😄

16.06.2025 04:11 - 👍 0    🔁 0    💬 0    📌 0

☺️ come visit!

16.06.2025 04:11 - 👍 1    🔁 0    💬 0    📌 0

A bit late to announce, but I'm excited to share that I'll be starting as an assistant professor at UMD CS @univofmaryland.bsky.social this August.

I'll be recruiting PhD students this upcoming cycle for fall 2026. (And if you're a UMD grad student, sign up for my fall seminar!)

13.06.2025 18:20 - 👍 65    🔁 3    💬 13    📌 1

Congrats Kristina!

30.05.2025 18:11 - 👍 1    🔁 0    💬 1    📌 0
An image with the Vancouver skyline and the words "sign up to review". At the top are the logos of both the Actionable Interpretability workshop (a magnifying glass) and the ICML conference (a brain).

🚨 We're looking for more reviewers for the workshop!
📆 Review period: May 24-June 7

If you're passionate about making interpretability useful and want to help shape the conversation, we'd love your input.

💡🔍 Self-nominate here:
docs.google.com/forms/d/e/1F...

20.05.2025 00:05 - 👍 6    🔁 5    💬 0    📌 0

🤖: "Great review, but it could be improved by doing [exact thing I wrote in subsequent sentences]"

25.04.2025 02:37 - 👍 3    🔁 0    💬 0    📌 0

Where is version control and shared editing for keynote files?! 🤦‍♀️

25.04.2025 02:36 - 👍 2    🔁 0    💬 0    📌 0

We are quite excited about the leaderboard and release, and are open to feedback to help this remain a living benchmark.

25.04.2025 02:24 - 👍 1    🔁 0    💬 0    📌 0

Check out our new preprint/project, which has been over a year in the making! This has been a very fun collaboration (and one of the biggest I've personally participated in).

@amuuueller.bsky.social @boknilev.bsky.social and other co-authors are around #ICLR2025 if you want to find out more. 😊

25.04.2025 02:24 - 👍 9    🔁 0    💬 1    📌 0

See Yanai's thread for more info:
bsky.app/profile/yana...

25.04.2025 02:21 - 👍 0    🔁 0    💬 0    📌 0

2) On the connection between linear relational embeddings in LMs and frequency of relations in pretraining data
- Led by @jackmerullo.bsky.social w/ @nlpnoah.bsky.social @yanai.bsky.social
- arxiv.org/abs/2504.12459
- Yanai is presenting the poster tomorrow 04/26 10am-12:30pm (Hall 3+Hall 2B #236)!

25.04.2025 02:20 - 👍 2    🔁 1    💬 1    📌 1

I'm not at #ICLR2025, but have 2 works being presented:

1) Understanding how LMs answer multiple-choice questions
- arxiv.org/abs/2407.15018
- @boknilev.bsky.social is presenting the poster *now* until 12:30 (Hall 3+Hall 2B #207)
- & w/ @oyvind-t.bsky.social @hanna-nlp.bsky.social Ashish Sabharwal

25.04.2025 02:18 - 👍 6    🔁 1    💬 1    📌 1

I'm in Singapore for ICLR to present this paper:
Tomorrow, April 26th, 10-12:30 in Hall 3+2B #236
Come check it out!

arxiv.org/abs/2504.12459

25.04.2025 01:55 - 👍 3    🔁 2    💬 0    📌 0

💡 New ICLR paper! 💡
"On Linear Representations and Pretraining Data Frequency in Language Models":

We provide an explanation for when & why linear representations form in large (or small) language models.

Led by @jackmerullo.bsky.social, w/ @nlpnoah.bsky.social & @sarah-nlp.bsky.social

25.04.2025 01:55 - 👍 42    🔁 12    💬 3    📌 3

Have work on the actionable impact of interpretability findings? Consider submitting to our Actionable Interpretability workshop at ICML! See below for more info.

Website: actionable-interpretability.github.io
Deadline: May 9

03.04.2025 17:58 - 👍 20    🔁 10    💬 0    📌 0

📢 Open PhD Position in Interpretable Natural Language Processing at the Department of Computer Science, UCPH!

🗓 Application deadline is 15 January 2025.

Find more information about the position and apply here 👉 di.ku.dk/english/abou...

@apepa.bsky.social @iaugenstein.bsky.social

08.01.2025 10:43 - 👍 9    🔁 2    💬 0    📌 0

🤩

11.12.2024 21:17 - 👍 4    🔁 0    💬 0    📌 0

Enjoying my FOMO coffee this morning

08.12.2024 18:39 - 👍 36    🔁 0    💬 3    📌 0
Definitely Maybe (30th Anniversary Deluxe Edition) Oasis · Album · 2024 · 27 songs

V old-school, but Oasis (album: Definitely Maybe)
open.spotify.com/album/4ltq9C...

26.11.2024 19:11 - 👍 3    🔁 0    💬 0    📌 0

Ok one thing I sorely need here is bookmarks

26.11.2024 08:06 - 👍 21    🔁 0    💬 1    📌 0

@sarah-nlp is following 20 prominent accounts