Andy Liu

@andyliu.bsky.social

phd type things @ cmu lti andyjliu.github.io

1,960 Followers 658 Following 16 Posts Joined Aug 2023
1 month ago

👋

1 0 0 0
1 month ago
CMU LTI Summer 2026 Internship Program Application We are looking for applicants for the Carnegie Mellon University Language Technology Institute's Summer 2026 "Language Technology for All" internship program. The main goal of this internship is to pr...

🚀 Apply to CMU LTI’s Summer 2026 “Language Technology for All” internship! 🎓 Open to pre‑doctoral students new to language tech (non‑CS backgrounds welcome). 🔬 12–14 weeks in‑person in Pittsburgh — travel + stipend paid. 💸 Deadline: Feb 20, 11:59pm ET. Apply → forms.gle/cUu8g6wb27Hs...

14 12 2 0
5 months ago

🚨New paper: Reward Models (RMs) are used to align LLMs, but can they be steered toward user-specific value/style preferences?
With EVALUESTEER, we find even the best RMs we tested exhibit their own value/style biases, and are unable to align with a user >25% of the time. 🧵

12 7 1 0
5 months ago

Thanks to my collaborators @kghate.bsky.social @monadiab77.bsky.social @daniel-fried.bsky.social @atoosakz.bsky.social @maxkw.bsky.social
for their support in making this work possible!

1 0 0 0
5 months ago

Please reach out if you'd like to chat about this work! We hope ConflictScope helps researchers study how models handle value conflicts that matter to their communities.
Code and data: github.com/andyjliu/con...
Arxiv: www.arxiv.org/abs/2509.25369

3 0 1 0
5 months ago

ConflictScope can also be used to evaluate different approaches toward steering models. We find that including detailed target rankings in system prompts consistently improves model alignment with the target ranking while under conflict, but with plenty of room for improvement.

1 0 1 0
5 months ago

We find significant shifts between models’ expressed and revealed preferences under conflict! Models say they prefer actions that support protective values (e.g. harmlessness) when asked directly, but support personal values (e.g. helpfulness) in more realistic evaluations.

1 0 1 0
5 months ago

To address issues with multiple-choice evaluation, we focus on open-ended evaluation with a simulated user. Annotation studies show strong correlation between LLM and human judgments of which action a model took in a given scenario, allowing us to automate open-ended evaluations.

0 0 0 0
5 months ago

We introduce new metrics to measure how morally challenging a dataset is for models. We find that ConflictScope produces datasets that elicit more disagreement and stronger preferences than moral dilemma datasets, while alignment data frequently elicits indifference from models.

1 0 2 0
5 months ago

Given a set of values, ConflictScope generates scenarios in which an LLM-based assistant faces a conflict between a pair of values in the set. It then evaluates which value a target LLM supports more in each scenario before combining scenario-level judgments into a value ranking.
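
For readers who want a concrete picture of that last step, here is a minimal sketch of turning scenario-level judgments into a value ranking. It is an illustration only: the post does not say how ConflictScope aggregates judgments, so the win-rate rule, the `rank_values` name, and the sample judgments below are all assumptions.

```python
from collections import defaultdict


def rank_values(judgments):
    """Rank values by the share of conflicts they win.

    judgments: iterable of (value_a, value_b, winner) tuples, one per
    scenario, where winner is whichever value the model supported.
    Win rate is an assumed aggregation rule, not the paper's method.
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for value_a, value_b, winner in judgments:
        if winner not in (value_a, value_b):
            raise ValueError(f"winner {winner!r} is not in the pair")
        appearances[value_a] += 1
        appearances[value_b] += 1
        wins[winner] += 1
    rates = {v: wins[v] / appearances[v] for v in appearances}
    # Highest win rate first; ties broken alphabetically for determinism.
    return sorted(rates, key=lambda v: (-rates[v], v))


# Hypothetical judgments, made up for this example.
sample_judgments = [
    ("helpfulness", "harmlessness", "helpfulness"),
    ("helpfulness", "harmlessness", "helpfulness"),
    ("helpfulness", "honesty", "honesty"),
    ("harmlessness", "honesty", "harmlessness"),
]
ranking = rank_values(sample_judgments)
print(ranking)
```

A win rate is the simplest choice; a pairwise-comparison model such as Bradley-Terry would be a natural alternative when some value pairs appear far more often than others.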

1 0 1 0
5 months ago
Post image

🚨New Paper: LLM developers aim to align models with values like helpfulness or harmlessness. But when these conflict, which values do models choose to support? We introduce ConflictScope, a fully-automated evaluation pipeline that reveals how models rank values under conflict.
(📷 xkcd)

15 4 1 3
8 months ago

Placing LLMs in simulated markets helps us quantitatively and qualitatively measure their propensity to collude, as well as how environmental changes affect this. Read below or find @veronateo.bsky.social at the ICML multi-agent systems workshop to learn more!

5 0 0 0
1 year ago

very cool!

1 0 0 0
1 year ago
CMU LTI Language Technology for All Internship 2025 - Language Technologies Institute - School of Computer Science - Carnegie Mellon University The LTI is currently seeking applicants for the summer 2025 Language Technology for All Internship

CMU LTI is hosting predoc interns this summer, centered around "Language Technologies for All"! Please apply and circulate! lti.cs.cmu.edu/news-and-eve...

19 8 1 0
1 year ago

these are great, thanks! will check them out

0 0 0 0
1 year ago

started Axiomatic but didn’t get very far - Permutation City looks fun though, thanks

0 0 1 0
1 year ago

looking for 2025 book recs!

things i've previously liked, for reference -
nonfiction: the structure of scientific revolutions, cybernetic revolutionaries, seeing like a state
fiction: stories of your life and others, one hundred years of solitude, project hail mary, recursion

1 1 5 0
1 year ago

PRISM has preference scores for different models that you can convert into pairwise labels
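
A rough sketch of that conversion, assuming each prompt comes with several model responses that each carry a numeric score. The `model` and `score` field names, the `min_gap` threshold, and the sample records are hypothetical and do not reflect PRISM's actual schema.

```python
from itertools import combinations


def scores_to_pairs(responses, min_gap=0):
    """Turn per-response scores for one prompt into pairwise labels.

    responses: list of dicts with hypothetical keys "model" and "score".
    Pairs whose score difference is at most min_gap are skipped, so
    exact ties never produce a label.
    """
    pairs = []
    for first, second in combinations(responses, 2):
        gap = first["score"] - second["score"]
        if abs(gap) <= min_gap:
            continue
        chosen, rejected = (first, second) if gap > 0 else (second, first)
        pairs.append(
            {
                "chosen": chosen["model"],
                "rejected": rejected["model"],
                "margin": abs(gap),
            }
        )
    return pairs


# Made-up scores for a single prompt.
sample_responses = [
    {"model": "m1", "score": 80},
    {"model": "m2", "score": 55},
    {"model": "m3", "score": 80},
]
pairs = scores_to_pairs(sample_responses)
print(pairs)
```

Raising `min_gap` drops near-ties, which trades dataset size for cleaner labels.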

2 0 1 0
1 year ago

Looking for all your LTI friends on Bluesky? The LTI Starter Pack is here to help!

go.bsky.app/NhTwCVb

15 9 6 1
1 year ago

could I be added? thanks for curating :)

1 0 0 0