Tiancheng Hu

@tiancheng.bsky.social

PhD student @CambridgeLTL; Previously @DLAB @EPFL; Interested in NLP and CSS. Apple Scholar, Gates Scholar.

988 Followers  |  1,091 Following  |  36 Posts  |  Joined: 26.09.2023

Latest posts by tiancheng.bsky.social on Bluesky

SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors, SRW Oral, Monday, July 28, 14:00-15:30

26.07.2025 11:21 — 👍 3    🔁 1    💬 0    📌 2

I will be presenting:

iNews: A Multimodal Dataset for Modeling Personalized Affective Responses to News, Poster Session 1, Monday, July 28, 11:00-12:30; Also at LAW workshop

26.07.2025 11:21 — 👍 1    🔁 0    💬 1    📌 0

Heading to Vienna today to attend #ACL2025NLP! Let's chat if you are interested in LLM social simulation, personalization, character training and human-centered AI!

26.07.2025 11:21 — 👍 3    🔁 0    💬 1    📌 0
Picture of Matthias Orlikowski presenting a poster on the paper titled "Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals' Subjective Text Perceptions". The poster is similar to the one that will be presented at ACL 2025, showing a number of figures about the key results.

I will be at #acl2025 to present "Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals' Subjective Text Perceptions" ✨

Huge thank you to my collaborators Jiaxin Pei @paul-rottger.bsky.social Philipp Cimiano @davidjurgens.bsky.social @dirkhovy.bsky.social

more below

20.07.2025 15:23 — 👍 23    🔁 2    💬 1    📌 2

Centaur (a model of general cognition fine-tuned on data from 160 multi-step psychology experiments): www.nature.com/articles/s41586-025-09215-4
@marcelbinz.bsky.social
@ericschulz.bsky.social

09.07.2025 15:44 — 👍 0    🔁 0    💬 0    📌 0

This work complements other fantastic work and data:
Twin-2K-500 (2k individuals answering 500+ questions) arxiv.org/abs/2505.17479,
Generative Agent Simulations of 1,000 People (2-hour interviews as seeds for simulation) arxiv.org/abs/2411.10109
@joon-s-pk.bsky.social
@mbernst.bsky.social

09.07.2025 15:44 — 👍 0    🔁 0    💬 1    📌 0

Our unique focus: we're not replicating static profiles (like survey answers). We're simulating a cognitive process - how an individual processes new information and reacts emotionally.

09.07.2025 15:44 — 👍 0    🔁 0    💬 1    📌 0

Working on LLM social simulation and need data?
Excited to announce our iNews paper is accepted to #ACL2025! 🥳 It's a large-scale dataset for predicting individualized affective responses to real-world, multimodal news.

Paper: arxiv.org/abs/2503.03335

Data: huggingface.co/datasets/pit...

09.07.2025 15:44 — 👍 10    🔁 1    💬 1    📌 0
Image illustrating that ALM can enable Ensembling, Transfer to Bytes, and general Cross-Tokenizer Distillation.

We created Approximate Likelihood Matching, a principled (and very effective) method for *cross-tokenizer distillation*!

With ALM, you can create ensembles of models from different families, convert existing subword-level models to byte-level, and a bunch more 🧵

02.04.2025 06:36 — 👍 26    🔁 14    💬 1    📌 0

Might be of interest @diyiyang.bsky.social @manoelhortaribeiro.bsky.social @a-lauscher.bsky.social @dirkhovy.bsky.social @joon-s-pk.bsky.social @barbaraplank.bsky.social @maartensap.bsky.social @davidjurgens.bsky.social @dongng.bsky.social

10.03.2025 16:47 — 👍 5    🔁 0    💬 0    📌 0

iNews applications:
• LLM personalization
• Affective computing
• Human behavior simulation
• Social computing
• and many more! (8/8)
We are particularly grateful to @camlangsci.bsky.social for funding support and @Kiran Garimella

10.03.2025 16:47 — 👍 0    🔁 0    💬 1    📌 0
Few-Shot:
• "Early ascent phenomenon": performance dips with a few examples, then improves.
• Persona info consistently helps, even at 32-shot (reaching 44.4% accuracy).
• Image few-shot prompting scales worse than text, despite its zero-shot advantage. (7/8)

10.03.2025 16:47 — 👍 0    🔁 0    💬 1    📌 0
Zero-Shot LLM Prediction:
• Persona info boosts accuracy across models (up to a 7% gain!).
• Image inputs generally outperform text inputs in zero-shot settings.
• Gemini 1.5 Pro + image + persona = best zero-shot performance (still only 40% accuracy, though). (6/8)

10.03.2025 16:47 — 👍 0    🔁 0    💬 1    📌 0

These persona variables explain up to 15.2% of annotation variance, more than any existing subjective NLP dataset! Individual differences aren't noise; they're systematic patterns we can model. (5/8)

10.03.2025 16:47 — 👍 0    🔁 0    💬 1    📌 0

What makes iNews unique? We don't aggregate responses. We capture personal reactions AND collect comprehensive annotator characteristics (e.g., demographics, personality, media habits). (4/8)

10.03.2025 16:47 — 👍 0    🔁 0    💬 1    📌 0
We're introducing iNews: a large-scale dataset capturing the inherent subjectivity of how people respond emotionally to real news content. 2,899 Facebook posts (as screenshots, so multimodal!) × 291 diverse annotators = rich, subjective affective data. (3/8)

10.03.2025 16:47 — 👍 2    🔁 0    💬 1    📌 0

Current AI systems are often trained with the assumption that we all feel the same about content, but psychology shows we don't. Our emotions vary by age, gender, personality, politics & countless other factors. (2/8)

10.03.2025 16:47 — 👍 0    🔁 0    💬 1    📌 0
iNews: A Multimodal Dataset for Modeling Personalized Affective Responses to News Current approaches to emotion detection often overlook the inherent subjectivity of affective experiences, instead relying on aggregated labels that mask individual variations in emotional responses. ...

Ever notice how something that makes your blood boil barely registers with your friend? Our emotional reactions aren't universal at all; they're deeply personal. And AI needs to understand that. Excited to share our new paper: "iNews" 🧵 (1/8) arxiv.org/abs/2503.03335

10.03.2025 16:47 — 👍 10    🔁 1    💬 1    📌 1

Great work by @riverdong.bsky.social: we dug deep into existing datasets & algorithms and found some surprising things.

05.03.2025 16:08 — 👍 1    🔁 0    💬 0    📌 0
Large language models act as if they are part of a group - Nature Computational Science: An extensive audit of large language models reveals that numerous models mirror the 'us versus them' thinking seen in human behavior. These social prejudices are likely captured from the biased conten...

Happy to write this News & Views piece on the recent audit showing LLMs picking up "us versus them" biases: www.nature.com/articles/s43... (Read-only version: rdcu.be/d5ovo)

Check out the amazing (original) paper here: www.nature.com/articles/s43...

02.01.2025 14:11 — 👍 12    🔁 7    💬 0    📌 1

I can relate :) Honestly, I don't think nearly enough people in the West have a clear understanding of the process of getting an EU, U.S., etc. visa... sigh 😮‍💨

17.12.2024 14:17 — 👍 1    🔁 0    💬 1    📌 0

9/9: A round of applause 👏 for our stellar team: @tiancheng.bsky.social & @yarakyrychenko.bsky.social (co-leads), @steverathje.bsky.social, @nigelhcollier, @profsanderlinden.bsky.social, and @roozenbot. Special thanks to @cambridgeltl.bsky.social, @iislucas and @gatesfoundation.bsky.social.

12.12.2024 22:38 — 👍 2    🔁 0    💬 1    📌 0

8/9 As AI becomes increasingly woven into the fabric of our daily lives, its biases could either amplify or help heal our social divisions. We have an opportunity - and responsibility - to ensure they don't amplify the tribal divisions that already challenge our society.

12.12.2024 22:38 — 👍 2    🔁 0    💬 1    📌 0

7/9:
Key Takeaways:
1) LLMs mirror human-like social identity biases.
2) Even simple data curation can significantly reduce such biases; we should scrutinize pretraining data more!
3) We need both controlled tests & real-world interaction studies

12.12.2024 22:38 — 👍 2    🔁 0    💬 1    📌 0

6/9: How do LLMs fare in real conversations? We looked at WildChat and LMSYS data and found that both users and LLMs exhibit significant levels of ingroup and outgroup bias; in fact, users themselves displayed more bias than the models they interacted with!

12.12.2024 22:38 — 👍 3    🔁 0    💬 1    📌 0

5/9: But fear not: if we remove varying proportions of "biased" sentences and fine-tune the models again, we can in fact greatly reduce the ingroup and outgroup bias, even when fine-tuning on an otherwise partisan Twitter corpus.

12.12.2024 22:38 — 👍 2    🔁 0    💬 1    📌 0

4/9: 🤔 Curious about the origins of these biases in LLMs? Is it the training data's fault? After fine-tuning on a U.S. partisan Twitter corpus, we saw a large increase in ingroup solidarity and an even larger increase in outgroup hostility.

12.12.2024 22:38 — 👍 2    🔁 0    💬 1    📌 0
