@chenhaotan.bsky.social

Associate professor at the University of Chicago. Working on human-centered AI, NLP, CSS. https://chenhaot.com, https://substack.com/@cichicago

4,085 Followers  |  310 Following  |  304 Posts  |  Joined: 08.11.2023

Posts by (@chenhaotan.bsky.social)

Local ballot measures are now on CivicChats! Local elections happen year-round, and 10+ states have measures coming up in the next few months. Check your ballot and think through what you'll be voting on → civicchats.org

25.02.2026 18:35 - 👍 2    🔁 1    💬 0    📌 0
CivicChats - Building AI to support voting behavior
CivicChats is a platform for exploring, debating, and thinking through upcoming ballot measures.

We have been developing automatic evaluation based on checklists. We are also planning to run a study at the same time. Learn more at the end of this blog: cichicago.substack.com/p/civicchats...

20.02.2026 00:33 - 👍 0    🔁 0    💬 0    📌 0

Check out our effort in thinking about how AI can help with democratic processes!

19.02.2026 21:48 - 👍 1    🔁 1    💬 1    📌 0

Can anyone help review an ACL submission on parameter-efficient fine-tuning today?

Sorry that the timeline is so tight.

16.02.2026 19:27 - 👍 0    🔁 1    💬 0    📌 0

📖 ≠ 🧪 The Story is Not the Science.
Code is submitted but rarely executed during peer review, an issue likely to worsen with research agents. 🧑‍🔬
We introduce MechEvalAgent, an execution-grounded evaluation of narrative + execution. Verify the science, not just the story.
1/n

10.02.2026 19:44 - 👍 8    🔁 4    💬 2    📌 0

Mark Yatskar will be speaking this Friday!

You can tune in either on

Zoom: uchicago.zoom.us/j/9897879984...
Youtube: www.youtube.com/@AIScientifi...

09.02.2026 21:09 - 👍 0    🔁 0    💬 0    📌 0

Hannes Stark will be speaking this Friday on BoltzGen!

You can tune in either on

Zoom: uchicago.zoom.us/j/9897879984...
Youtube: www.youtube.com/@AIScientifi...

02.02.2026 22:47 - 👍 0    🔁 0    💬 0    📌 1

Happening in three hours!

30.01.2026 14:03 - 👍 0    🔁 0    💬 0    📌 0

Microsoft Research NYC is hiring a researcher in the space of AI and society!

29.01.2026 23:27 - 👍 62    🔁 40    💬 2    📌 2

@profbuehlermit.bsky.social from MIT will be speaking this Friday!

You can tune in either on

Zoom: uchicago.zoom.us/j/9897879984...
Youtube: www.youtube.com/@AIScientifi...

26.01.2026 20:42 - 👍 5    🔁 1    💬 0    📌 1

Happening in two hours!

23.01.2026 15:03 - 👍 1    🔁 0    💬 0    📌 0

Peter Clark from @ai2.bsky.social will be speaking on Friday!

You can tune in either on

Zoom: uchicago.zoom.us/j/9897879984...
Youtube: www.youtube.com/@AIScientifi...

20.01.2026 19:17 - 👍 3    🔁 0    💬 0    📌 1
We study how radiologists use AI to diagnose pulmonary embolism (PE), tracking over 100,000
scans interpreted by nearly 400 radiologists during the staggered rollout of an FDA-approved
diagnostic platform. When AI flags PE, radiologists agree 84% of the time; when AI predicts no PE,
they agree 97%. Disagreement evolves substantially: radiologists initially reject AI-positive PEs in
30% of cases, dropping to 12% by year two. Despite a 16% increase in scan volume, diagnostic speed
remains stable while per-radiologist monthly volumes nearly double, with no change in patient
mortality, suggesting AI improves workflow without compromising outcomes. We document
significant heterogeneity in AI collaboration: some radiologists reject AI-flagged PEs half the time
while others accept nearly always; female radiologists are 6 percentage points less likely to override AI
than male radiologists. Moderate AI engagement is associated with the highest agreement, whereas
both low and high engagement show more disagreement. Follow-up imaging reveals that when
radiologists override AI to diagnose PE, 54% of subsequent scans show both agreeing on no PE
within 30 days.

Posted a very early stage draft with rock star collaborators.

Key question: when we actually roll out AI tools, how do people use them? Do they just defer completely? Does it improve productivity and ability?

We look at this in the medical setting of pulmonary embolisms:
paulgp.com/papers/Radio...

19.01.2026 20:16 - 👍 89    🔁 18    💬 4    📌 2

I've often joked that as faculty I program in a high-level language called "graduate student". Having tried out Claude Code this morning, I (i) feel extremely at home, (ii) am realizing that research-by-graduate-student is perhaps the original vibe-coding. 1/2

08.01.2026 12:24 - 👍 86    🔁 11    💬 7    📌 3

I've seen this message and similar echoes for other writing, and I want to strongly push back on this narrative. It's not that you shouldn't use ChatGPT but that you shouldn't *use ChatGPT to write it for you*. ChatGPT, and AI in general, is not a monolith. How you use it matters.

18.01.2026 16:55 - 👍 8    🔁 1    💬 1    📌 0

Very much enjoyed this talk by @yisongyue.bsky.social ! The measurement challenge deserves a lot more attention from the AI community!

16.01.2026 18:54 - 👍 2    🔁 0    💬 0    📌 0

Happening in two hours!

16.01.2026 14:43 - 👍 2    🔁 0    💬 0    📌 1
Title + abstract of the preprint

Excited to present a new preprint with @nkgarg.bsky.social: presenting usage statistics and observational findings from Paper Skygest in the first six months of deployment! ๐ŸŽ‰๐Ÿ“œ

arxiv.org/abs/2601.04253

14.01.2026 19:48 - 👍 147    🔁 45    💬 4    📌 4

Emergent misalignment made it into @nature.com! The key insight is that models fine-tuned to write insecure code exhibit a wide range of insecure behavior in other contexts.

15.01.2026 15:37 - 👍 5    🔁 0    💬 0    📌 0

I think it would be useful to attract researchers in industry to the platform as well.

13.01.2026 01:28 - 👍 3    🔁 0    💬 0    📌 0

Join us on Friday, January 16th for this week's AI & Scientific Discovery Online Seminar!

Featuring Yisong Yue, Professor of Computing and Mathematical Sciences at @caltech.edu

You can participate online or in-person at DSI. Learn more at ai-scientific-discovery.github.io

12.01.2026 20:15 - 👍 2    🔁 1    💬 0    📌 0

Professor @yisongyue.bsky.social will be speaking this Friday at the AI & Scientific Discovery seminar!

Zoom: uchicago.zoom.us/j/9897879984...
Youtube: www.youtube.com/@AIScientifi...

12.01.2026 18:16 - 👍 2    🔁 0    💬 0    📌 2
Archive - Ari or Chenhao?

chenhaot.com/ariorchenhao...

12.01.2026 15:24 - 👍 0    🔁 0    💬 0    📌 0

I certainly did not realize that this is such a controversial take:

Chenhao: "Finding the equilibrium of publishing will take at least a decade."
25% agree, 75% disagree

12.01.2026 15:24 - 👍 0    🔁 0    💬 1    📌 0

Let us all stand with Chairman Powell.

12.01.2026 02:36 - 👍 33408    🔁 8707    💬 985    📌 568
Screenshot of Chinese calligraphy reader web application

I can't read Chinese, but my family has old genealogy documents I've always wanted to understand. Claude and Gemini helped me build an interactive reader to explore the calligraphy character by character.

I can finally read my great-grandfather's epitaph. Try it:
davidbau.com/archives/202...

12.01.2026 03:12 - 👍 25    🔁 4    💬 0    📌 2

in general yes, but yonatan requested this not be recorded.

11.01.2026 20:50 - 👍 0    🔁 0    💬 1    📌 0

What is the name of the tool?

11.01.2026 15:11 - 👍 1    🔁 0    💬 2    📌 0

The recent days have been horrific. We can't become numb to repeated instances of illegal and unconstitutional action by government agencies. It's even worse when public officials are blatantly lying in ways that contradict dozens of pieces of video evidence.

09.01.2026 21:40 - 👍 197    🔁 25    💬 5    📌 0