jessica dai

@jessica.bsky.social

go bears!!! jessicad.ai kernelmag.io

525 Followers  |  109 Following  |  78 Posts  |  Joined: 24.01.2023

Posts by jessica dai (@jessica.bsky.social)

congrats!!!

09.09.2025 22:48 | 👍 1    🔁 0    💬 0    📌 0

so close! that's standard error ❤️

06.08.2025 17:12 | 👍 3    🔁 0    💬 0    📌 0

i'm still pissed about this like the difference is literally too small to have been distinguishable with swe bench (500 samples) lmaoooo

06.08.2025 03:54 | 👍 8    🔁 0    💬 0    📌 0
Post image

hey wasn't this the same company that made a beautiful shiny "research" post about how AI evals should include error bars or something like that. or did they decide the CLT didn't apply here

06.08.2025 03:20 | 👍 39    🔁 3    💬 5    📌 1

I will be at ICML in a few weeks & would love to chat about how to make this real - I am a critic at heart and also hate self-promo so that’s how you know I really believe in this 🥲

01.07.2025 23:39 | 👍 1    🔁 0    💬 0    📌 0

various ways to read more 😀

blog post- argmin.net/p/individual...
position paper- arxiv.org/abs/2506.18133
fairness-oriented instantiation- arxiv.org/abs/2502.08166

& many thanks to brilliant collaborators
@rajiinio.bsky.social @irenetrampoline.bsky.social @beenwrekt.bsky.social & paula gradu !!

01.07.2025 23:39 | 👍 4    🔁 0    💬 1    📌 0

lots of other stuff I won’t get into rn (e.g., I think this is a prereq to any serious attempt at “democratic” AI!), and there’s also a ton of open research questions (stats, econ/ml, empirical methods, hci, …)

01.07.2025 23:38 | 👍 1    🔁 0    💬 1    📌 0

the core concept is individual reporting as a means to build collective knowledge. if one person has a bad experience, that doesn’t necessarily mean that there’s something wrong with the system - but if lots of people start reporting similar things, maybe we should pay attention.

01.07.2025 23:38 | 👍 2    🔁 0    💬 1    📌 0

we’ve already seen this informally with the chatgpt sycophancy debacle - a few days of twitter virality resulted in action and statements from openai - but what other, subtler, patterns are happening? what could we discover if we had better ways to listen to the public?

01.07.2025 23:38 | 👍 2    🔁 1    💬 1    📌 0
Post image Post image

individual reporting for post-deployment evals - a little manifesto (& new preprints!)

tldr: end users have unique insights about how deployed systems are failing; we should figure out how to translate their experiences into formal evaluations of those systems.

01.07.2025 23:38 | 👍 20    🔁 3    💬 1    📌 0
Individual experiences and collective evidence: Jessica Dai on theory for the world as it could be

@jessica.bsky.social on individual reporting as a means to build collective knowledge.

24.06.2025 14:46 | 👍 8    🔁 2    💬 1    📌 0

right but one would hope that the date of doom _does_ get further away as safety research improves

bsky.app/profile/jess...

08.05.2025 21:25 | 👍 0    🔁 0    💬 1    📌 0
Post image

help ..

02.05.2025 19:55 | 👍 3    🔁 0    💬 0    📌 0
Post image

where are the bullshit "x% of experts believe" polls when you need them lol

24.04.2025 17:31 | 👍 6    🔁 0    💬 0    📌 0

well probably, but i wanna know how folks who do believe in that happening think about the field

19.04.2025 02:09 | 👍 0    🔁 0    💬 1    📌 0

or is it a secret third thing idk. scared to ask this on Real Twitter but genuinely curious how people think about the role of this field

18.04.2025 05:45 | 👍 0    🔁 0    💬 0    📌 0

like is it that the field has been ineffective (studied the wrong problems, advocated for the wrong positions, etc) or is it that every step of safety progress has been matched by 2 steps of capabilities progress (in which case, what are the best examples of safety work concretely reducing harm?)

18.04.2025 05:44 | 👍 3    🔁 0    💬 1    📌 1

perhaps this is a stupid question but given that ai safety has been a pretty vibrant (+ well funded) field for the last 5-10 years... how should we be thinking about the concern that (ai) catastrophe still is, allegedly, imminent

18.04.2025 05:42 | 👍 6    🔁 0    💬 2    📌 0
Sam Altman on X: "we trained a new model that is good at creative writing (not sure yet how/when it will get released). this is the first time i have been really struck by something written by AI; it got the vibe of metafiction so right. PROMPT: Please write a metafictional literary short story"

in middle school we were asked to write a short story in the style of edgar allan poe. as you might expect, all of our little pieces (even, especially, the ones the students thought were "good") were hilariously bad. anyway, i had forgotten about that homework until now

x.com/sama/status/...

11.03.2025 20:09 | 👍 3    🔁 0    💬 0    📌 0

back on bluesky to be mean about ai discourse

10.02.2025 18:04 | 👍 4    🔁 0    💬 0    📌 0
Post image

im ngl i think this kinda just means u are stupid

10.02.2025 18:03 | 👍 3    🔁 0    💬 1    📌 0
how to have a good time in a phd (not real advice)

happy new year

letterstomyfriends.substack.com/p/how-to-hav...

01.01.2025 19:04 | 👍 4    🔁 0    💬 1    📌 0

i don't work well under deadline pressure but i also don't work well without it. therefore,

18.12.2024 01:01 | 👍 4    🔁 0    💬 0    📌 0
Post image

... didn't we just talk about this ...

16.12.2024 23:46 | 👍 1    🔁 0    💬 0    📌 0

ill read it

14.12.2024 18:23 | 👍 1    🔁 0    💬 0    📌 0
Post image

The plan: Post your dissertation abstract online to rekindle a decades-long controversy about the utility of the humanities, turning your paper into the most-read publication in the history of your field

03.12.2024 23:28 | 👍 10    🔁 1    💬 1    📌 0
Post image Post image Post image

were you born yesterday

02.12.2024 00:50 | 👍 3    🔁 0    💬 0    📌 0

wait is that your house lmaooo

01.12.2024 07:25 | 👍 0    🔁 0    💬 1    📌 0

where

01.12.2024 07:25 | 👍 0    🔁 0    💬 1    📌 0
Post image Post image

i just know these people would have been the biggest fans of japanese internment

01.12.2024 02:23 | 👍 9    🔁 0    💬 3    📌 0