Valentina Pyatkin

@valentinapy.bsky.social

Postdoc in AI at the Allen Institute for AI & the University of Washington. 🌐 https://valentinapy.github.io

5,669 Followers  |  568 Following  |  74 Posts  |  Joined: 08.09.2023

Latest posts by valentinapy.bsky.social on Bluesky

Now accepted to #neurips25 datasets & benchmarks!
See you in San Diego! 🥳

20.09.2025 06:56 - 👍 8    🔁 0    💬 0    📌 0

🚀 Can open science beat closed AI? Tülu 3 makes a powerful case. In our new #WiAIRpodcast, we speak with Valentina Pyatkin (@valentinapy.bsky.social) of @ai2.bsky.social and the University of Washington about a fully open post-training recipe: models, data, code, evals, and infra. #WomenInAI 1/8🧵

19.09.2025 16:13 - 👍 5    🔁 1    💬 1    📌 0

"LLM Post-training: Open Science That Powers Progress" 🎙️

On Sept 17, the #WiAIRpodcast speaks with @valentinapy.bsky.social (@ai2.bsky.social & University of Washington) about open science, post-training, mentorship, and visibility

#WiAIR #NLProc

12.09.2025 15:00 - 👍 6    🔁 1    💬 0    📌 0

With fresh support of $75M from NSF and $77M from NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡

14.08.2025 12:16 - 👍 45    🔁 7    💬 1    📌 6

On my way to Oxford: Looking forward to speaking at OxML 2025

10.08.2025 08:09 - 👍 8    🔁 0    💬 0    📌 0
Opinion Abstracts: SoLaR Workshop @ COLM 2025 Beyond the two main tracks we are inviting short opinion abstracts (500 words maximum) on new perspectives for what socially responsible language modeling research might look like. The SoLaR organizi...

The submission deadline is August 26, 2025 (AoE), and decisions will be sent out on September 2, 2025.

Submit your abstracts here:
docs.google.com/forms/d/e/1F...

08.08.2025 12:40 - 👍 0    🔁 0    💬 0    📌 0

🔈 For the SoLaR workshop @COLM_conf we are soliciting opinion abstracts to encourage new perspectives and opinions on responsible language modeling, 1-2 of which will be selected to be presented at the workshop.

Please use the Google form below to submit your opinion abstract ⬇️

08.08.2025 12:40 - 👍 8    🔁 4    💬 1    📌 0

I had a lot of fun contemplating memorization questions at the @l2m2workshop.bsky.social panel yesterday together with Niloofar Mireshghallah and Reza Shokri, moderated by @pietrolesci.bsky.social, who did a fantastic job!
#ACL2025

02.08.2025 15:04 - 👍 12    🔁 2    💬 1    📌 1

I'll be at #ACL2025 🇦🇹!!
Would love to chat about all things pragmatics 🧠, redefining "helpfulness" 🤔 and enabling better cross-cultural capabilities 🗺️ 🫶

Presenting our work on culturally offensive nonverbal gestures 👇
🕛 Wed @ Poster Session 4
📍 Hall 4/5, 11:00-12:30

26.07.2025 02:46 - 👍 4    🔁 1    💬 0    📌 0

I did! very very good!!

19.07.2025 05:19 - 👍 1    🔁 0    💬 0    📌 0

🔥 tokenization panel!

18.07.2025 22:45 - 👍 7    🔁 0    💬 0    📌 0

why is vancouver sushi so good? 🤤 (vancouver food in general actually)

18.07.2025 20:27 - 👍 9    🔁 0    💬 3    📌 0

This week is #ICML in Vancouver, and a number of our researchers are participating. Here's the full list of Ai2's conference engagements. We look forward to connecting with fellow attendees. 👋

14.07.2025 19:30 - 👍 3    🔁 2    💬 0    📌 0
ICML chat | Valentina Pyatkin | Cal.com ICML chat

book a slot for a chat on my cal:

cal.com/valentinap/i...

11.07.2025 15:44 - 👍 1    🔁 0    💬 0    📌 0

Let me know if you want to meet up! Always happy to chat!

11.07.2025 14:09 - 👍 0    🔁 0    💬 1    📌 0
ICML Poster Diverging Preferences: When do Annotators Disagree and do Models Know? ICML 2025

07/17, Poster: Diverging Preferences: When do Annotators Disagree and do Models Know? icml.cc/virtual/2025...

07/16, Poster: SafetyAnalyst: Interpretable, transparent, and steerable safety moderation for AI behavior
icml.cc/virtual/2025...

11.07.2025 14:09 - 👍 3    🔁 0    💬 1    📌 0

I'll be at ICML in Vancouver next week! #ICML2025
You can find me at the following:

- giving an invited talk at the "Models of Human Feedback for AI Alignment" workshop

- giving an invited talk at the "AI for Math" workshop

I'll also present these two papers ⬇️

11.07.2025 14:09 - 👍 9    🔁 2    💬 1    📌 0

In Geneva 🇨🇭 to attend the International Open-Source LLM Builders Summit and present OLMo and Tülu!

06.07.2025 17:23 - 👍 10    🔁 0    💬 0    📌 0

And I can't forget to thank my amazing co-authors! In particular @saumyamalik.bsky.social and Victoria Graf, with whom I looked through so many constraints 😄
And @natolambert.bsky.social @hanna-nlp.bsky.social @hamishivi.bsky.social @pdasigi.bsky.social @vwxyzjn.bsky.social

03.07.2025 21:06 - 👍 2    🔁 0    💬 0    📌 0

We further discuss what happens when you over-optimize on IF-RLVR: the models tend to prioritize the constraint over the actual instruction! And we suggest possible solutions to this problem.

📝 Paper: buff.ly/1qSA9Pq
💻 Code: github.com/allenai/IFBe...

03.07.2025 21:06 - 👍 1    🔁 0    💬 1    📌 0

Additionally, we wrote new training constraints and verifier functions and suggest a good recipe for IF-RLVR training for improved generalization.
We find that IF-RLVR generalization works best on base models and when you train on multiple constraints per instruction!

03.07.2025 21:06 - 👍 1    🔁 0    💬 1    📌 0

💡 Beyond math/code, instruction following with verifiable constraints is suitable to be learned with RLVR.
But the set of constraints and verifier functions is limited and most models overfit on IFEval.
We introduce IFBench to measure model generalization to unseen constraints.

03.07.2025 21:06 - 👍 29    🔁 5    💬 1    📌 1
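
To make "verifiable constraints" and "verifier functions" concrete, here is a minimal Python sketch of how such a check could feed a binary RLVR-style reward. It is an illustration only, not the IFBench or IF-RLVR code: the names `max_words`, `includes_keyword`, and `verifiable_reward`, as well as the two example constraints, are hypothetical.

```python
import re
from typing import Callable, List

# Hypothetical verifier type: takes a model response and returns True
# if the constraint is satisfied.
Verifier = Callable[[str], bool]


def max_words(limit: int) -> Verifier:
    """Constraint: the response uses at most `limit` whitespace-separated words."""
    return lambda response: len(response.split()) <= limit


def includes_keyword(keyword: str) -> Verifier:
    """Constraint: the response contains `keyword` as a whole word (case-insensitive)."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return lambda response: pattern.search(response) is not None


def verifiable_reward(response: str, verifiers: List[Verifier]) -> float:
    """Binary reward (an assumed design): 1.0 only if every constraint holds."""
    return 1.0 if all(verify(response) for verify in verifiers) else 0.0


# Several constraints attached to a single instruction.
constraints = [max_words(10), includes_keyword("sushi")]
reward = verifiable_reward("Vancouver sushi is great.", constraints)
```

Because each check is a plain function of the response text, the reward needs no learned judge. Note that a reward like this only checks the constraints, not whether the response actually addresses the instruction.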

plus, some fun RL experiments

03.07.2025 18:14 - 👍 4    🔁 1    💬 1    📌 0

This new benchmark created by @valentinapy.bsky.social should be the new default replacing IFEval. Some of the best frontier models get <50% and it comes with separate training prompts so people don’t effectively train on test.

Wild gap from o3 > Gemini 2.5 pro of like 30 points.

03.07.2025 18:14 - 👍 18    🔁 3    💬 2    📌 0

Introducing IFBench, a benchmark to measure how well AI models follow new, challenging, and diverse verifiable instructions. Top models like Gemini 2.5 Pro or Claude 4 Sonnet are only able to score up to 50%, presenting an open frontier for post-training. 🧵

03.07.2025 18:01 - 👍 18    🔁 1    💬 1    📌 3

Check out our take on Chain-of-Thought.
I really like this paper as a survey on the current literature on what CoT is, but more importantly on what it's not.
It also serves as a cautionary tale to the (apparently quite common) misuse of CoT as an interpretable method.

01.07.2025 17:45 - 👍 14    🔁 4    💬 1    📌 1

🚨 Submission deadline extended to June 27th AoE! 🚨

Our reviewer interest form is also open!

See below for more details 👇

24.06.2025 18:02 - 👍 5    🔁 1    💬 0    📌 0
Third Workshop on Socially Responsible Language Modelling Research (SoLaR) 2025 COLM 2025 in-person Workshop, October 10th at the Palais des Congrès in Montreal, Canada

Submit your paper or sign up to review by June 23, 2025

➡️ CFP and workshop info: solar-colm.github.io

➡️ Reviewer sign up: docs.google.com/forms/d/e/1F...

17.06.2025 17:46 - 👍 4    🔁 0    💬 0    📌 0

Interested in shaping the progress of responsible AI and meeting leading researchers in the field? SoLaR@COLM 2025 is looking for paper submissions and reviewers!

🤖 ML track: algorithms, math, computation
📚 Socio-technical track: policy, ethics, human participant research

17.06.2025 17:46 - 👍 8    🔁 1    💬 1    📌 1

Congrats again!!!

14.06.2025 00:01 - 👍 1    🔁 0    💬 0    📌 0