Qingcheng Zeng @qcznlp - Bluesky Profile

Latest posts by qcznlp.bsky.social on Bluesky

This Friday’s my last day at Tencent AI Lab Seattle! I’ll be around till Sep 10, then flying to SF. This fall I’m doing another internship at Snowflake. Want to hang before I leave Seattle or after I land in the Bay? DM me!

28.08.2025 02:13 — 👍 3 🔁 0 💬 0 📌 0

4️⃣ Good Intentions Beyond ACL: Who Does NLP for Social Good, and Where?

Our first jump into the Science of Science! We systematically investigated the NLP4SG landscape and quantified the proportion of work addressing social good concerns both within and beyond the ACL community. Preprint coming soon!

20.08.2025 20:47 — 👍 2 🔁 0 💬 0 📌 0

MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation
Existing large language model (LLM) evaluation benchmarks primarily focus on English, while current multilingual tasks lack parallel questions that specifically assess cross-linguistic reasoning abili...

3️⃣ MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation

By far, the most comprehensive multilingual benchmark for evaluating LLMs. Qwen 3 2507 is using this benchmark to evaluate multilingual ability!

Paper 3️⃣: arxiv.org/abs/2503.10497

20.08.2025 20:47 — 👍 0 🔁 0 💬 1 📌 0

Thinking Out Loud: Do Reasoning Models Know When They're Right?
Large reasoning models (LRMs) have recently demonstrated impressive capabilities in complex reasoning tasks by leveraging increased test-time computation and exhibiting behaviors reminiscent of human-...

(3) Instruct models show much higher refusal rates than reasoning models. And reasoning models only show minimal accuracy in additional attempts.
(4) Thinking with images helps SO much in VLMs' calibration!

Paper1️⃣: arxiv.org/abs/2504.06564
Paper2️⃣: arxiv.org/abs/2505.20236

20.08.2025 20:47 — 👍 1 🔁 0 💬 1 📌 0

...whether reasoning models or vision language models express their confidence in a calibrated manner. Our findings are:
(1) SFT reasoning models usually show better calibration in in-distribution settings and worse calibration in OOD settings.
(2) RL could help improve (recover) calibration a bit.
...

20.08.2025 20:47 — 👍 0 🔁 0 💬 1 📌 0

Four papers accepted at the #EMNLP2025 main conference!
1️⃣ Thinking Out Loud: Do Reasoning Models Know When They’re Right?
2️⃣ Seeing is Believing, but How Much? A Comprehensive Analysis of Verbalized Calibration in Vision-Language Models

In these two papers, we look into...

20.08.2025 20:47 — 👍 10 🔁 3 💬 1 📌 0

My bsky "Discover" tab is full of kittens and puppies. That makes my day.

11.08.2025 19:45 — 👍 2 🔁 0 💬 0 📌 0

Our work on pragmatic competence in LLMs was accepted for PragLM@COLM 2025. Preprint: arxiv.org/abs/2505.18497. Hope we can have someone in Montreal to tell you how much we love this work!

29.07.2025 18:45 — 👍 4 🔁 1 💬 0 📌 0

Leveraging Human Production-Interpretation Asymmetries to Test LLM Cognitive Plausibility
Suet-Ying Lam, Qingcheng Zeng, Jingyi Wu, Rob Voigt.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2025.

I just gave a virtual presentation at ACL 2025 on our work about the production–interpretation asymmetry in reference processing in LLMs. If you’re into computational psycholinguistics or the LLMs x cognitive science space, give it a read!
aclanthology.org/2025.acl-sho...
@robvoigt.bsky.social

29.07.2025 16:21 — 👍 9 🔁 3 💬 0 📌 0

Thanks for reading!! Any feedback would be greatly appreciated, if you happen to have any 🫡

21.07.2025 03:16 — 👍 0 🔁 0 💬 0 📌 0

Fascinating work! If you're open to talking, I’d love to chat sometime about the broader potential of LLMs in social science.

07.05.2025 20:04 — 👍 1 🔁 0 💬 0 📌 0

@qcznlp is following 20 prominent accounts

Christian Ilbury
@christianilbury

Senior Lecturer in Sociolinguistics & Director of EDI for PPLS at University of Edinburgh. he/him LVC, digital (queer+youth+popular) cultures, 'MLE', & accent bias/linguistic discrimination - Edinburgh/London. https://cilbury.wordpress.com/

Tanise Ceron
@taniseceron

Postdoc @milanlp.bsky.social

Nina Beguš
@ninabegus

UC Berkeley - InterpretAI - Artificial Humanities Book (40% off code PREORDERS25): https://press.umich.edu/Books/A/Artificial-Humanities3

David Bau
@davidbau

Interpretable Deep Networks. http://baulab.info/ @davidbau

Sheridan Feucht @ COLM
@sfeucht

PhD student doing LLM interpretability with @davidbau.bsky.social and @byron.bsky.social. (they/them) https://sfeucht.github.io

Anne Lauscher
@a-lauscher

Professor of Data Science, Lead of @ds-hamburg.bsky.social, Researching Safe Generative AI

Jennifer Hu @ COLM (recruiting PhDs and postdocs!)
@jennhu

Asst Prof at Johns Hopkins Cognitive Science • Director of the Group for Language and Intelligence (GLINT) ✨• Interested in all things language, cognition, and AI jennhu.github.io

Mark Riedl
@markriedl

AI for storytelling, games, explainability, safety, ethics. Professor at Georgia Tech. Associate Director of ML Center at GT. Time travel expert. Geek. Dad. he/him

Shubhendu Trivedi
@shubhendu

Interests on bsky: ML research, applied math, and general mathematical and engineering miscellany. Also: Uncertainty, symmetry in ML, reliable deployment; applications in LLMs, computational chemistry/physics, and healthcare. https://shubhendu-trivedi.org

Jessica Hullman
@jessicahullman

Ginni Rometty Prof @NorthwesternCS | Fellow @NU_IPR | Uncertainty + decisions | Humans + AI/ML | Blog @statmodeling

Jamie Reilly 🦜
@reilly-coglab.com

Professor @ Temple University: neuro/psycholinguistics, semantic memory, dementia, neurorehabilitation, nlp, pupillometry, photography, art, horror

Ted Underwood
@tedunderwood.com

Uses machine learning to study literary imagination, and vice-versa. Likely to share news about AI & computational social science / Sozialwissenschaft / 社会科学 Information Sciences and English, UIUC. Distant Horizons (Chicago, 2019). tedunderwood.com

Kristina Gligoric
@gligoric

Assistant Professor of Computer Science @JohnsHopkins, CS Postdoc @Stanford, PHD @EPFL, Computational Social Science, NLP, AI & Society https://kristinagligoric.com/

Peter Henderson
@peterhenderson

Assistant Professor, the Polaris Lab @ Princeton (https://www.polarislab.org/); Researching: RL, Strategic Decision-Making+Exploration; AI+Law

Omar Shaikh
@oshaikh

Ph.D. Student at Stanford HCI/NLP oshaikh.com interested in: ice cream, drinking coffee / chai, complaining, (conversational) grounding, human-ai interaction

Marianne de Heer Kloots
@mdhk.net

Linguist in AI & CogSci 🧠👩‍💻🤖 PhD student @ ILLC, University of Amsterdam 🌐 https://mdhk.net/ 🐘 https://scholar.social/@mdhk 🐦 https://twitter.com/mariannedhk

Sekh (Sk) Mainul Islam
@sekh-copenlu

PhD Fellow at the CopeNLU Group, University of Copenhagen; working on explainable automatic fact-checking. Prev: NYU Abu Dhabi, IIT Kharagpur. https://mainuliitkgp.github.io/

Jon Brennan
@jonrbrennan

Linguistics and cognitive science nerd; faculty at the University of Michigan; author of "Language and the Brain"

Laura K. Nelson
@lauraknelson

Associate Professor @ UBC computational sociology machine learning is feminist You only have to look at the Medusa straight on to see her. And she’s not deadly. She’s beautiful and she’s laughing. www.lauraknelson.com

Antoine Bosselut
@abosselut

Helping machines make sense of the world. Asst Prof @icepfl.bsky.social; Before: @stanfordnlp.bsky.social @uwnlp.bsky.social AI2 #NLProc #AI Website: https://atcbosselut.github.io/