Afra Amini @afraamn - Bluesky Profile

Latest posts by afraamn.bsky.social on Bluesky

✨Be sure to check out our paper for a detailed discussion of variance reduction techniques applied to KL divergence estimation between language models!

06.05.2025 14:59 — 👍 0 🔁 0 💬 0 📌 0

Finally, we plot the reward–KL Pareto frontier across various KL regularization settings. We find that the RB estimator more effectively constrains the KL divergence, and models trained with it appear significantly more often on the Pareto front:

06.05.2025 14:59 — 👍 0 🔁 0 💬 1 📌 0

In RLHF training, using our RB estimator yields more stable runs compared to the MC estimator. It achieves high rewards while reliably preventing the KL divergence from increasing beyond an acceptable range:

06.05.2025 14:59 — 👍 0 🔁 0 💬 1 📌 0

Notably, the widely used CV(α=1) estimator—also known as the k3 estimator—can suffer from very high variance. It's a special case of control variates, a classic variance reduction method that requires proper choice of α; otherwise, as with CV(α=1), it can increase variance

06.05.2025 14:59 — 👍 0 🔁 0 💬 1 📌 0
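To make the control-variate point in the post above concrete, here is a minimal sketch, not the paper's code. It assumes the common formulation in which samples y come from p, the ratio is r = q(y)/p(y), and the estimator is CV(α) = -log r + α(r - 1), so that α = 0 is the plain MC estimator and α = 1 is k3. The function names and the two toy distributions are hypothetical, chosen only for illustration. Moments are computed exactly by enumeration, so no sampling is involved.

```python
import math

# Hypothetical toy distributions over three outcomes (illustration only).
P_EXAMPLE = [0.9, 0.09, 0.01]
Q_EXAMPLE = [0.1, 0.2, 0.7]


def cv_moments(p, q, alpha):
    """Exact mean and variance of the single-sample CV(alpha) estimator of KL(p || q).

    Assumed form: -log r + alpha * (r - 1) with r = q(y) / p(y) and y ~ p.
    """
    values = []
    for pv, qv in zip(p, q):
        r = qv / pv
        values.append(-math.log(r) + alpha * (r - 1.0))
    mean = sum(pv * v for pv, v in zip(p, values))
    var = sum(pv * (v - mean) ** 2 for pv, v in zip(p, values))
    return mean, var


def optimal_alpha(p, q):
    """Variance-minimizing coefficient: alpha* = -Cov(f, g) / Var(g)."""
    f = [-math.log(qv / pv) for pv, qv in zip(p, q)]  # plain MC term
    g = [qv / pv - 1.0 for pv, qv in zip(p, q)]       # control variate, zero mean under p
    mean_f = sum(pv * fv for pv, fv in zip(p, f))
    mean_g = sum(pv * gv for pv, gv in zip(p, g))
    cov = sum(pv * (fv - mean_f) * (gv - mean_g) for pv, fv, gv in zip(p, f, g))
    var_g = sum(pv * (gv - mean_g) ** 2 for pv, gv in zip(p, g))
    return -cov / var_g


if __name__ == "__main__":
    alpha_star = optimal_alpha(P_EXAMPLE, Q_EXAMPLE)
    for name, alpha in [("MC (alpha=0)", 0.0), ("k3 (alpha=1)", 1.0), ("CV (alpha*)", alpha_star)]:
        mean, var = cv_moments(P_EXAMPLE, Q_EXAMPLE, alpha)
        print(f"{name}: mean={mean:.4f} variance={var:.4f}")
```

All three choices of α have the same mean, because the control variate r - 1 has zero expectation under p. On this toy pair, where q puts a lot of mass on an outcome that is rare under p, α = 1 has a much larger variance than the plain MC estimator, while the fitted α* cannot do worse than α = 0.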

When evaluating the KL divergence between the language model before and after preference alignment, our estimator (RB) consistently yields lower standard deviation across all prompts compared to every other estimator available in public RLHF libraries:

06.05.2025 14:59 — 👍 0 🔁 0 💬 1 📌 0

All it took was applying Rao–Blackwellization—a classic variance reduction trick—to the Monte Carlo (MC) estimator, and carefully adapting it for LMs. The result is simple: condition on prefixes and replace the MC estimate with its conditional expectation:

06.05.2025 14:59 — 👍 0 🔁 0 💬 1 📌 0
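The idea in the post above (condition on each sampled prefix and replace the per-token log-ratio with its conditional expectation) can be sketched on a toy problem. This is an illustrative sketch under stated assumptions, not the authors' implementation: the "language models" are hypothetical random bigram tables over a three-token vocabulary with a fixed sequence length, and every function name is made up for this example. For real LMs, the inner sum would run over the full next-token distribution that the model already computes at each step.

```python
import math
import random
import statistics

VOCAB_SIZE = 3  # toy vocabulary (assumption)
SEQ_LEN = 4     # fixed sequence length (assumption)


def make_toy_model(seed):
    """Hypothetical bigram 'LM': maps the previous token to next-token probabilities."""
    rng = random.Random(seed)
    table = {}
    for prev in [None, *range(VOCAB_SIZE)]:
        weights = [rng.random() + 0.1 for _ in range(VOCAB_SIZE)]
        total = sum(weights)
        table[prev] = [w / total for w in weights]
    return table


def next_token_probs(model, prefix):
    return model[prefix[-1] if prefix else None]


def step_kl(p_probs, q_probs):
    """Exact KL between two next-token distributions for one fixed prefix."""
    return sum(pv * (math.log(pv) - math.log(qv)) for pv, qv in zip(p_probs, q_probs))


def sample_estimates(p, q, rng):
    """Draw one sequence from p; return its (MC, RB) estimates of KL(p || q)."""
    prefix, mc, rb = [], 0.0, 0.0
    for _ in range(SEQ_LEN):
        p_probs = next_token_probs(p, prefix)
        q_probs = next_token_probs(q, prefix)
        rb += step_kl(p_probs, q_probs)  # conditional expectation given the prefix
        token = rng.choices(range(VOCAB_SIZE), weights=p_probs)[0]
        mc += math.log(p_probs[token]) - math.log(q_probs[token])  # sampled log-ratio
        prefix.append(token)
    return mc, rb


def exact_kl(p, q, prefix=()):
    """Exact sequence-level KL by enumerating all continuations (toy sizes only)."""
    if len(prefix) == SEQ_LEN:
        return 0.0
    p_probs = next_token_probs(p, prefix)
    q_probs = next_token_probs(q, prefix)
    return sum(
        pv * (math.log(pv) - math.log(qv) + exact_kl(p, q, prefix + (v,)))
        for v, (pv, qv) in enumerate(zip(p_probs, q_probs))
    )


def compare(num_samples=20000, seed=0):
    p, q = make_toy_model(1), make_toy_model(2)
    rng = random.Random(seed)
    mc, rb = zip(*(sample_estimates(p, q, rng) for _ in range(num_samples)))
    return {
        "exact": exact_kl(p, q),
        "mc_mean": statistics.fmean(mc), "mc_var": statistics.pvariance(mc),
        "rb_mean": statistics.fmean(rb), "rb_var": statistics.pvariance(rb),
        "rb_min": min(rb),
    }


if __name__ == "__main__":
    for key, value in compare().items():
        print(f"{key}: {value:.4f}")
```

Both estimators use the same sampled sequences and target the same quantity. Each per-step term of the RB estimate is itself a KL divergence, so an RB estimate cannot be negative, whereas a single MC estimate can be.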

Current KL estimation practices in RLHF can generate high variance and even negative values! We propose a provably better estimator that only takes a few lines of code to implement.🧵👇
w/ @xtimv.bsky.social and Ryan Cotterell
paper: arxiv.org/pdf/2504.10637
code: github.com/rycolab/kl-rb

06.05.2025 14:59 — 👍 7 🔁 3 💬 1 📌 0

@afraamn is following 20 prominent accounts

Tim Vieira
@xtimv

http://timvieira.github.io/blog

Maria Antoniak
@mariaa

asst prof of computer science at cu boulder nlp, cultural analytics, narratives, communities books, bikes, games, art https://maria-antoniak.github.io

Emily M. Bender
@emilymbender

Book: https://thecon.ai Web: https://faculty.washington.edu/ebender

Christopher Manning
@chrmanning

Stanford Linguistics and Computer Science. Director, Stanford AI Lab. Founder of @stanfordnlp.bsky.social . #NLP https://nlp.stanford.edu/~manning/

Margaret Mitchell
@mmitchell

Researcher trying to shape AI towards positive outcomes. ML & Ethics +birds. Generally trying to do the right thing. TIME 100 | TED speaker | Senate testimony provider | Navigating public life as a recluse. Former: Google, Microsoft; Current: Hugging Face

David Bamman
@dbamman

Associate Professor, School of Information, UC Berkeley. NLP, computational social science, digital humanities.

David Smith
@dasmiq

Associate professor of computer science at Northeastern University. Natural language processing, digital humanities, OCR, computational bibliography, and computational social sciences. Artificial intelligence is an archival science.

David Jurgens
@davidjurgens

Associate prof at @UMich in SI and CSE working in computational social science and natural language processing. PI of the Blablablab blablablab.si.umich.edu

David Mimno
@dmimno

He teaches information science at Cornell. http://mimno.infosci.cornell.edu

Luca Soldaini 🎀
@soldaini.net

I like tokens! Lead for OLMo data at @ai2.bsky.social (Dolma 🍇) w @kylelo.bsky.social. Open source is fun 🤖☕️🍕🏳️‍🌈 Opinions are sampled from my own stochastic parrot more at https://soldaini.net

Kyle Lo
@kylelo

language model pretraining @ai2.bsky.social, co-lead of data research w/ @soldaini.net, statistics @uw, open science, tabletop, seattle, he/him,🧋 kyleclo.com

Lucy Lu Wang
@lucylw

Asst Prof @uwischool.bsky.social; #NLP #healthinformatics #accessibility #scholcomm 🚴🏔️🍄❄️⛷️🧶⚫️⚪️📚🍸in Seattle; llwang.net; she/her

Prithviraj "Raj" Ammanabrolu
@rajammanabrolu

AI, RL, NLP, Games Asst Prof at UCSD Research Scientist at Nvidia Lab: http://pearls.ucsd.edu Personal: prithvirajva.com

Jack Hessel
@jmhessel

jmhessel.com @Anthropic. Seattle bike lane enjoyer. Opinions my own.

karpathy
@karpathy

AI @ OpenAI, Tesla, Stanford

Yuval Pinter
@uvp

Karaoke enthusiast 🇮🇱 en/he/him

Pedro Rodriguez
@pedro-rodriguez

Naomi Saphra
@nsaphra

Waiting on a robot body. All opinions are universal and held by both employers and family. ML/NLP professor. nsaphra.net

Lucy Li
@lucy3

Postdoc at UW NLP 🏔️. #NLProc, computational social science, cultural analytics, responsible AI. she/her. Previously at Berkeley, Ai2, MSR, Stanford. Incoming assistant prof at Wisconsin CS. lucy3.github.io

Swabha
@swabhs

Assistant Professor of CS, University of Southern California. NLP / ML.