Hongli Zhan ✈️ ICML

@hongli-zhan.bsky.social

http://honglizhan.github.io PhD Candidate 🤘@UTAustin | previously @IBMResearch @sjtu1896 | NLP for social good

108 Followers  |  241 Following  |  13 Posts  |  Joined: 19.11.2024  |  1.826

Latest posts by hongli-zhan.bsky.social on Bluesky

ICML Expo Talk Panel: Situating principles in context for synthetic data (ICML 2025)

I will also be on IBM's Expo Talk Panel on Monday, Jul 14, to discuss how SPRI can be used in industry to generate high-quality synthetic data. You can find more details of the talk here: icml.cc/virtual/2025...

08.07.2025 15:13 | 👍 0 | 🔁 0 | 💬 0 | 📌 0
ICML Poster: SPRI: Aligning Large Language Models with Context-Situated Principles (ICML 2025)

📜Link to the paper: icml.cc/virtual/2025...
👨🏻‍💻Code and data: github.com/honglizhan/S...

Shout out to an amazing team @jessyjli.bsky.social, @m-yurochkin.bsky.social, Muneeza Azmat & Raya Horesh! Also super grateful to the reviewers for their invaluable feedback!

#ICML2025 #LLMAlignment

08.07.2025 15:13 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

1️⃣SPRI generates principles that are as effective as psychologist-written ones for improving users’ well-being

2️⃣SPRI enables tailored rubrics for LLM-judges, matching human-crafted rubrics (e.g., BiGGen-Bench)

3️⃣SPRI-generated synthetic data boosts Llama/Mistral/Gemma (7-9B) on TruthfulQA, with no loss on other benchmarks

08.07.2025 15:11 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

🎯Motivation: Constitutional AI works great for aligning LLMs, but the principles can be too generic to apply. Can we guide responses with context-situated principles instead?

πŸ’‘SPRI tackles this and proves to rival human oracle guidance in the three real-world use cases we tested on πŸ‘‡

08.07.2025 15:06 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I'll be at #ICML to present SPRI next week! Come by our poster on Tuesday, July 15, 4:30pm, and let’s catch up on LLM alignment! 😃

🚀TL;DR: We introduce Situated-PRInciples (SPRI), a framework that automatically generates input-specific principles to align responses, with minimal human effort.

🧵

08.07.2025 15:05 | 👍 3 | 🔁 1 | 💬 1 | 📌 1

🎯Motivation: Constitutional AI works great for aligning LLMs, but the principles can be too generic to apply. Can we guide responses with context-situated principles instead?

πŸ’‘SPRI tackles this and proves to rival human oracle guidance in the three real-world use cases we tested on πŸ‘‡

08.07.2025 15:03 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I’m excited to share that our paper has been accepted at #ICML2025! 🎉🥳🎊

This work was done during my internship at IBM Research, and it wouldn’t have been possible without a top-notch team and my amazing advisor.

02.05.2025 21:27 | 👍 4 | 🔁 1 | 💬 0 | 📌 0

To appear at #ICML2025!! 🎉

02.05.2025 12:34 | 👍 4 | 🔁 1 | 💬 0 | 📌 0

I definitely agree :) I think SPRI can help generate SFT data for their constitutional classifiers that extrapolate *beyond* the "chemical weapons" context that they show in Sec 5 and Appendix B.

Thanks for sharing this!

07.02.2025 05:20 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

The principles that LLMs align with should be specific to the task at hand! Check out @hongli-zhan.bsky.social’s latest work 👇

06.02.2025 22:52 | 👍 4 | 🔁 1 | 💬 0 | 📌 0

[5/5] Code and model generations: github.com/honglizhan/S...

This project was carried out during my internship at IBM Research, and I’d like to highlight the support and mentorship from my amazing hosts Muneeza Azmat, Raya Horesh, and @m-yurochkin.bsky.social, and my advisor @jessyjli.bsky.social!

06.02.2025 22:48 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

[4/5] In addition, when applying SPRI to generate SFT data for alignment, we observe substantial improvement on TruthfulQA.

06.02.2025 22:44 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

[3/5] We tested SPRI on 3 tasks: generating 1) cognitive reappraisals, 2) instance-specific rubrics for LLM-as-a-judge, and 3) SFT data for alignment.

SPRI turns out to work great for tasks that require complex principles, performing on par with expert-guided methods.

06.02.2025 22:44 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

[2/5] Short for Situated-PRInciples, SPRI involves 2 stages: 1) synthesizing context-situated principles, and 2) crafting principle-guided responses.

In each stage, a base model and a critic model create the principles or the response from scratch through a critique-and-refine loop.
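
As a rough illustration of the two stages, here is a minimal Python sketch. It is not the released SPRI code: the function names, prompts, the 1-5 style score, the acceptance threshold, and the round limit are all assumptions made for the example, and `base` and `critic` are placeholders for whatever models you plug in.

```python
from typing import Callable, Tuple

# Hypothetical interfaces (assumptions, not the SPRI API):
# a base model maps a prompt to text; a critic maps (task, draft)
# to a (score, feedback) pair.
BaseModel = Callable[[str], str]
CriticModel = Callable[[str, str], Tuple[float, str]]


def critique_refine(task: str, prompt: str, base: BaseModel,
                    critic: CriticModel, threshold: float = 4.0,
                    max_rounds: int = 3) -> str:
    """Draft with the base model, then revise until the critic is satisfied."""
    draft = base(prompt)
    for _ in range(max_rounds):
        score, feedback = critic(task, draft)
        if score >= threshold:  # assumed acceptance rule
            break
        # Ask the base model to revise using the critic's feedback.
        draft = base(
            f"{prompt}\n\nPrevious draft:\n{draft}\n\n"
            f"Feedback:\n{feedback}\n\nRevise the draft."
        )
    return draft


def spri_sketch(query: str, base: BaseModel,
                critic: CriticModel) -> Tuple[str, str]:
    """Stage 1: context-situated principles. Stage 2: guided response."""
    principles = critique_refine(
        query,
        f"Write guiding principles specific to this input:\n{query}",
        base, critic,
    )
    response = critique_refine(
        query,
        f"Input:\n{query}\n\nPrinciples:\n{principles}\n\n"
        "Respond to the input, following the principles.",
        base, critic,
    )
    return principles, response
```

Calling `spri_sketch(query, base, critic)` with any two callables that match these signatures returns the final principles and the final response; the same loop is reused for both stages, which mirrors the description above.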

06.02.2025 22:44 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

Constitutional AI works great for aligning LLMs, but the principles can be too generic to apply.

Can we guide responses with context-situated principles instead?

Introducing SPRI, a system that produces principles tailored to each query, with minimal to no human effort.

arxiv.org/pdf/2502.03397

06.02.2025 22:43 | 👍 4 | 🔁 2 | 💬 2 | 📌 3
