Elinor🎗️ @ COLM 🍁

@elinorpd.bsky.social

MIT // researching fairness, equity, & pluralistic alignment in LLMs
previously @ MIT media lab, mila / mcgill
i like language and dogs and plants and ultimate frisbee and baking and sunsets
https://elinorp-d.github.io

1,283 Followers  |  404 Following  |  178 Posts  |  Joined: 13.11.2024  |  2.1055

Latest posts by elinorpd.bsky.social on Bluesky

Evaluation reasoning interpretability rl in context benchmark alignment synthetic data

COLM word cloud. Yoav says it’s the year of reasoning, but evaluation is also huge.

07.10.2025 12:55  |  👍 25  |  🔁 5  |  💬 3  |  📌 0

bsky.app/profile/elin...

07.10.2025 02:44  |  👍 1  |  🔁 0  |  💬 0  |  📌 0

I’m at #COLM2025! Would love to chat about anything related to pluralistic alignment, fairness evaluations, societal impacts of LLMs, etc 😊

You can also find me at the NLP4Democracy workshop giving a talk about my work analyzing democratic deliberation with LLMs Oct 10th!

07.10.2025 02:43  |  👍 2  |  🔁 0  |  💬 1  |  📌 0

Alright the evening sky, you’re utterly wondrous and fantastical, we get it, geez

06.10.2025 18:35  |  👍 220  |  🔁 11  |  💬 4  |  📌 0

Here’s a #COLM2025 feed!

Pin it 📌 to follow along with the conference this week!

06.10.2025 20:26  |  👍 22  |  🔁 15  |  💬 2  |  📌 1

NLP 4 Democracy - COLM 2025

I will be at #COLM2025 this week, and would love to connect with folks interested in applications (and critiques) of language modeling in social science research!

And join us for the NLP4Democracy workshop on Friday!

sites.google.com/andrew.cmu.e...

#NLP #NLProc #LLM #ComputationalSocialScience

06.10.2025 19:31  |  👍 13  |  🔁 4  |  💬 0  |  📌 0

Codebook LLMs: Evaluating LLMs as Measurement Tools for Political Science Concepts | Political Analysis | Cambridge Core

Very excited that my paper with @katakeith.bsky.social is now out in @polanalysis.bsky.social. We investigate whether LLMs actually follow the instructions/definitions provided in codebooks, propose some diagnostics, and release a new evaluation dataset.
www.cambridge.org/core/journal...

19.09.2025 13:45  |  👍 31  |  🔁 14  |  💬 0  |  📌 2

I wish students understood that in most empirical AI research there’s a huge scientific advantage from being constitutionally excited by math vs intimidated, but very little additional gain from being actually “good” at math. Maybe they’d be less intimidated if they didn’t feel they had to be “good”.

19.09.2025 19:00  |  👍 51  |  🔁 4  |  💬 7  |  📌 1

An AI-Powered Framework for Analyzing Collective Idea Evolution in Deliberative Assemblies
In an era of increasing societal fragmentation, political polarization, and erosion of public trust in institutions, representative deliberative assemblies are emerging as a promising democratic forum...

This work is part of my master’s thesis at @mit.edu @medialab.bsky.social, supervised by Deb Roy and with the help of Jad Kabbara @jad-kabbara.bsky.social.

🔗 arxiv.org/abs/2509.12577

17.09.2025 17:40  |  👍 1  |  🔁 0  |  💬 0  |  📌 0

Beyond research, this paves the way for:
✨ Tools supporting live assemblies in real time
✨ Increasing transparency & communicating critical insights to decision-makers
✨ Enabling richer cross-assembly analysis to advance research on deliberative best practices

17.09.2025 17:40  |  👍 0  |  🔁 0  |  💬 1  |  📌 0

In the tech-enhanced assembly, our framework revealed:
🔹 How deliberation surfaced, refined, or discarded ideas
🔹 *Missing* viable ideas
🔹 How opinion shifts & rec edits shaped outcomes
🔹 Underlying values & trade-offs invisible to decision-makers

17.09.2025 17:40  |  👍 0  |  🔁 0  |  💬 1  |  📌 0

We develop an LLM-based framework to:
✅ Map how suggestions transform into concrete recommendations
✅ Reconstruct individuals’ evolving perspectives
✅ Detect why votes shift across deliberation

17.09.2025 17:40  |  👍 0  |  🔁 0  |  💬 1  |  📌 0

Despite their promise, we still lack tools to empirically trace:
• how ideas evolve into recommendations
• how deliberation shapes perspectives & votes

At MIT CCC, we hosted our own tech-enhanced assembly to explore how AI can help!

sustainabilityassembly.portal.cortico.ai

17.09.2025 17:40  |  👍 0  |  🔁 0  |  💬 1  |  📌 0

Deliberative assemblies bring together everyday citizens selected by lottery. Through deliberation 💬 & learning, they collectively form policy recommendations 💡 for decision-makers.

They’ve proven successful worldwide, helping rebuild trust & strengthen democracy 🤝.

17.09.2025 17:40  |  👍 1  |  🔁 0  |  💬 1  |  📌 0

🚨 New preprint! 🚨
Excited to share my work: An AI-Powered Framework for Analyzing Collective Idea Evolution in Deliberative Assemblies 🤖🗳️

I’ll be presenting this at @colmweb.org in the NLP4Democracy workshop!

🔗 arxiv.org/abs/2509.12577

17.09.2025 17:40  |  👍 6  |  🔁 1  |  💬 1  |  📌 1

"This suggests that LLM benchmark behavior may generalize less and less to non-benchmark settings, raising new concerns about ecological validity."

super interesting

15.09.2025 17:25  |  👍 1  |  🔁 0  |  💬 0  |  📌 0

We must build AI for people; not to be a person

This paper yields the same conclusion as what @mustafasuleymanai.bsky.social recently posted on the danger of 'seemingly conscious AI'.

mustafa-suleyman.ai/seemingly-co...

15.09.2025 15:03  |  👍 15  |  🔁 4  |  💬 0  |  📌 1

Thread inspired by having to review 6(!!) papers for AAAI and most of them having no line numbers. And for one particularly great paper, I want to show the authors exactly how much I enjoyed it via my annotation drawings (>20 check marks, ~10 exclamations, and even 2 hearts!)

14.09.2025 19:57  |  👍 3  |  🔁 0  |  💬 0  |  📌 0

Typing out the (sometimes extreme) number of typo corrections is tedious, time-consuming, and especially frustrating when there are no line numbers on the PDF! It would honestly be faster for me to edit it myself 🙄

14.09.2025 19:53  |  👍 2  |  🔁 0  |  💬 1  |  📌 0

When reading papers, especially when reviewing, I like to print and annotate as I read. I wish I could upload these annotations to OpenReview so authors can see smaller suggestions (typos, formatting errors) as well as smaller positive notes, e.g., things I appreciated or found useful/interesting

14.09.2025 19:53  |  👍 4  |  🔁 0  |  💬 1  |  📌 0

!!!!! me too

14.09.2025 19:49  |  👍 0  |  🔁 0  |  💬 0  |  📌 0

i'm curious to hear your thoughts on their piloting of AI assisted reviewing: bsky.app/profile/aaai...

14.09.2025 17:36  |  👍 3  |  🔁 0  |  💬 1  |  📌 0

AAAI-26 Review Process Update: Scale, Integrity Measures, and Pathways to Sustainability - AAAI

totally agree! the peer review system is already overburdened and there needs to be an intermediate step for AI-generated work. for example, AAAI received *double* the normal number of submissions this year even after desk rejections aaai.org/conference/a....

14.09.2025 17:36  |  👍 8  |  🔁 1  |  💬 1  |  📌 0

First crochet project done! Super proud and excited for the next one

13.09.2025 07:31  |  👍 7  |  🔁 0  |  💬 0  |  📌 0

Center for the Alignment of AI Alignment Centers
We align the aligners

This new center strikes the right tone in approaching the AI alignment problem. alignmentalignment.ai

11.09.2025 20:47  |  👍 58  |  🔁 14  |  💬 4  |  📌 4

Yeah absolutely. I wasn’t sure how much of the modelling work you were aiming for, but thought to share just in case :))
It would be interesting also to check if cultural alignment work fully “aligns” w the goals of cultural analytics researchers too

08.09.2025 21:06  |  👍 2  |  🔁 0  |  💬 0  |  📌 0

Benchmarking Vision Language Models for Cultural Understanding
Foundation models and vision-language pre-training have notably advanced Vision Language Models (VLMs), enabling multimodal processing of visual and linguistic data. However, their performance has bee...

maybe this one is relevant: "Benchmarking Vision Language Models for Cultural Understanding"

arxiv.org/abs/2407.10920

08.09.2025 17:25  |  👍 0  |  🔁 0  |  💬 1  |  📌 0

It never ceases to amaze me that some folks can just continue on with their evenings and not stop everything to gaze at the magnificence of the post-rain sunset (and ofc take a billion photos)!

08.09.2025 05:11  |  👍 3  |  🔁 0  |  💬 0  |  📌 0

This is a cool paper that suggests that AI agents can indeed be used for social science experiments, but that just using a chatbot isn't good enough; instead, prompts developed based on social & game theory make AI agent actions predictive of real human outcomes. benjaminmanning.io/files/optimi...

04.09.2025 02:13  |  👍 73  |  🔁 9  |  💬 2  |  📌 6

im literally doing this rn, with the submission deadline being tomorrow 😅

03.09.2025 21:24  |  👍 0  |  🔁 0  |  💬 2  |  📌 0

@elinorpd is following 20 prominent accounts