Huge thanks to all my collaborators @antonxue.bsky.social, @shreyahavaldar.bsky.social, @deliprao.bsky.social, @helenjin.bsky.social, Chris Callison-Burch, @riceric22.bsky.social
LLMs often make reasoning errors, and current LLM error detection methods often fail when earlier errors corrupt downstream judgments. We introduce Autoregressive Reasoning Entailment Stability (ARES), a framework for measuring reasoning soundness with stability guarantees.
I'll be presenting our work "Probabilistic Soundness Guarantees in LLM Reasoning Chains" at EMNLP 2025!
Today (Nov 5), 14:30-16:00, Hall C, 802-Main
Blog: debugml.github.io/ares
Paper: arxiv.org/abs/2507.12948
Code: github.com/fallcat/ares
Joint work with @profericwong.bsky.social, @antonxue.bsky.social, @shreyahavaldar.bsky.social, @helenjin.bsky.social, @deliprao.bsky.social, Chris Callison-Burch, Helen Qu, Marco Gatti, Bhuvnesh Jain
See our posters at the Actionable Interpretability workshop
East Ballroom A
Probabilistic Soundness Guarantees in LLM Reasoning Chains
arxiv.org/abs/2507.12948
10:40-11:40am
Sum-of-Parts: Self-Attributing Neural Networks with End-to-End Learning of Feature Groups
debugml.github.io/sum-of-parts
1-2pm