Emma Pierson @emmapierson - Bluesky Profile

Thanks - super-interesting, and actually very relevant to some other work we're doing as well. Will pass along!

29.11.2025 16:53 — 👍 2 🔁 0 💬 0 📌 0

Testing for racial bias using inconsistent perceptions of race

We're excited about applications of our test to other datasets that have 1) perceptions of race, gender, etc and 2) multiple observations of the same person.

This work is led by the wonderful Nora Gera, in a great start to her PhD!

Full paper: www.science.org/doi/epdf/10....

24.11.2025 18:14 — 👍 4 🔁 0 💬 1 📌 0

See the paper for many robustness checks and discussion of nuances! Our finding persists when using alternate outcomes, statistical models, subsets of the data, and controls satisfying the criteria above.

5/

24.11.2025 18:14 — 👍 1 🔁 0 💬 1 📌 0

A benefit of our test is that it doesn't require us to control for all factors legitimately influencing searches. We only have to control for things that influence both searches and perceived race, vary for the same person across stops, and don't themselves suggest bias.

4/

24.11.2025 18:14 — 👍 2 🔁 0 💬 1 📌 0

9% of drivers stopped multiple times have inconsistently perceived race across different stops - most perceived as both white + Hispanic.

When perceived as Hispanic, the same driver is likelier to be searched/arrested. This gap is substantial (24% of the overall search rate).

3/

24.11.2025 18:14 — 👍 1 🔁 0 💬 1 📌 0

Tests for racial bias often compare how two people of different races are treated.

But two people typically differ in many ways besides race.

So instead of comparing two different people, we study the *same person over time*, as perceptions of their race change.
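
A minimal sketch of the same-person comparison described above, under stated assumptions: the records, field layout, and the function name `within_driver_gap` are hypothetical, and this simple average of per-driver differences is an illustration of the idea, not the statistical model used in the paper.

```python
from collections import defaultdict

# Hypothetical stop records: (driver_id, perceived_race, searched).
# These values are made up for illustration only.
stops = [
    ("d1", "white", False),
    ("d1", "hispanic", True),
    ("d2", "white", False),
    ("d2", "white", False),
    ("d3", "hispanic", True),
    ("d3", "white", False),
    ("d3", "hispanic", False),
    ("d4", "hispanic", False),
]


def within_driver_gap(stops):
    """Average, over drivers perceived as both white and Hispanic, of
    (search rate when perceived Hispanic) - (search rate when perceived white).
    Returns None if no driver was perceived both ways."""
    by_driver = defaultdict(list)
    for driver_id, race, searched in stops:
        by_driver[driver_id].append((race, searched))

    gaps = []
    for records in by_driver.values():
        races = {race for race, _ in records}
        # Keep only drivers whose perceived race differs across stops.
        if not {"white", "hispanic"} <= races:
            continue
        rate = {}
        for group in ("white", "hispanic"):
            outcomes = [s for r, s in records if r == group]
            rate[group] = sum(outcomes) / len(outcomes)
        gaps.append(rate["hispanic"] - rate["white"])

    return sum(gaps) / len(gaps) if gaps else None


gap = within_driver_gap(stops)
print(f"within-driver search-rate gap: {gap:.2f}")
```

Because each difference is computed within one driver, anything about that driver that stays fixed across stops cancels out of the comparison; factors that change between stops would still need separate handling.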

2/

24.11.2025 18:14 — 👍 2 🔁 0 💬 1 📌 0

We have a new paper in Science Advances proposing a simple test for bias:

Is the same person treated differently when their race is perceived differently?

Specifically, we study: is the same driver likelier to be searched by police when they are perceived as Hispanic rather than white?

1/

24.11.2025 18:14 — 👍 44 🔁 17 💬 2 📌 1

New #NeurIPS2025 paper: how should we evaluate machine learning models without a large, labeled dataset? We introduce Semi-Supervised Model Evaluation (SSME), which uses labeled and unlabeled data to estimate performance! We find SSME is far more accurate than standard methods.

17.10.2025 16:29 — 👍 22 🔁 7 💬 1 📌 4

selfishly i wish we could keep divya in our lab forever but i guess it would be a disservice to the rest of the world 😅 she’s been such a wonderful mentor to me—i’ve learned a lot from how thoughtful, creative, and knowledgeable she is about everything. she’s also super funny and amazing at baking 🤭

14.10.2025 17:14 — 👍 6 🔁 1 💬 1 📌 0

Meeting Divya 5 years ago was one of the biggest strokes of luck in my faculty career - she is a brilliant scientist who has been foundational to so many of our lab's projects, and any institution would be lucky to hire her.

14.10.2025 16:01 — 👍 6 🔁 0 💬 1 📌 0

Postdoctoral Employee - Artificial Intelligence - Electrical Engineering and Computer Sciences Department University of California, Berkeley is hiring. Apply now!

Apply here - aprecruit.berkeley.edu/JPF05028 by 11/15, but review of applications is ongoing so sooner is better! (The application deadline currently says 9/15 but will be extended.)

22.08.2025 14:11 — 👍 3 🔁 0 💬 0 📌 0

Sparse Autoencoders for Hypothesis Generation We describe HypotheSAEs, a general method to hypothesize interpretable relationships between text data (e.g., headlines) and a target variable (e.g., clicks). HypotheSAEs has three steps: (1) train a ...

Broad project areas include:

1) language modeling methods for scientific discovery (building on our recent work - arxiv.org/abs/2502.04382)

2) using language models to support equity (ai.nejm.org/doi/full/10....)

both in collaboration with health+social scientists.

2/3

22.08.2025 14:11 — 👍 4 🔁 0 💬 1 📌 0

🚨 New postdoc position in our lab at Berkeley EECS! 🚨

(please reshare)

We seek applicants with experience in language modeling who are excited about high-impact applications in the health and social sciences!

More info in thread

1/3

22.08.2025 14:11 — 👍 21 🔁 12 💬 1 📌 3

📢New POSITION PAPER: Use Sparse Autoencoders to Discover Unknown Concepts, Not to Act on Known Concepts

Despite recent results, SAEs aren't dead! They can still be useful to mech interp, and also much more broadly: across FAccT, computational social science, and ML4H. 🧵

05.08.2025 16:31 — 👍 40 🔁 4 💬 1 📌 3

SF fog coming up to swallow us in time lapse.

07.07.2025 03:19 — 👍 7 🔁 0 💬 0 📌 0

Honored to win a #CHIL2025 best paper award for our work modeling inequality in disease progression, led by @ericachiang.bsky.social!

To the NIH: health inequality remains a vital topic to support the health of all Americans. As we prove, failing to account for it biases estimates for everyone.

27.06.2025 18:00 — 👍 10 🔁 1 💬 0 📌 0

For folks at @facct.bsky.social, our very own @cornellbowers.bsky.social student @emmharv.bsky.social will present the Best-Paper-Award-winning work she led, on Wednesday at 10:45 AM in the "Audit and Evaluation Approaches" session!

In the meantime, 🧵 below and 🔗 here: arxiv.org/abs/2506.04419 !

23.06.2025 14:49 — 👍 16 🔁 2 💬 1 📌 0

assassinations, handcuffing a senator at press conference, marines detaining a civilian, and a military parade for the president’s birthday. rough week for democracy.

14.06.2025 15:14 — 👍 8413 🔁 2195 💬 90 📌 51

and... here is the actual GIF 🙈

14.06.2025 17:04 — 👍 3 🔁 1 💬 0 📌 0

The first paper of @ericachiang.bsky.social's PhD, just accepted at #CHIL2025, proposes a model of disease progression which estimates and accounts for 3 types of health disparities to more accurately measure disease severity. See her full thread below!

01.05.2025 15:53 — 👍 11 🔁 0 💬 0 📌 0

Thanks, Megan!! This is kind :) hope you’re doing well.

26.04.2025 22:11 — 👍 2 🔁 0 💬 0 📌 0

My ‘woke DEI’ grant has been flagged for scrutiny. Where do I go from here? My work in making artificial intelligence fair has been noticed by US officials intent on ending ‘class warfare propaganda’.

The US government recently flagged my scientific grant in its "woke DEI database". Many people have asked me what I will do.

My answer today in Nature.

We will not be cowed. We will keep using AI to build a fairer, healthier world.

www.nature.com/articles/d41...

25.04.2025 17:19 — 👍 40 🔁 13 💬 1 📌 1

A pleasure to join the Tech Policy Press podcast with @natematias.bsky.social, @geomblog.bsky.social, and @justinhendrix.bsky.social to defend the consensus that AI bias is an important concern.

24.04.2025 16:20 — 👍 17 🔁 7 💬 0 📌 0

Lab had dogathon! Seminal dog discoveries ensued.

02.04.2025 15:10 — 👍 6 🔁 0 💬 0 📌 0

This work is led by @gsagostini.bsky.social, who gets more excited about geospatial data than anyone I've ever met, and with Rachel Young, Maria Fitzpatrick, and @nkgarg.bsky.social.

Paper: arxiv.org/abs/2503.20989
Website (and data): migrate.tech.cornell.edu
Thread: bsky.app/profile/gsag...

28.03.2025 16:04 — 👍 6 🔁 0 💬 0 📌 0

Migration data is critical in the health, environmental, and social sciences.

We're releasing a new dataset, MIGRATE: annual flows between 47 billion pairs of US Census areas. MIGRATE is:

- 4600x more granular than existing public data
- highly correlated with external ground-truth data

1/2

28.03.2025 16:04 — 👍 28 🔁 4 💬 2 📌 0

💡New preprint & Python package: We use sparse autoencoders to generate hypotheses from large text datasets.

Our method, HypotheSAEs, produces interpretable text features that predict a target variable, e.g. features in news headlines that predict engagement. 🧵1/

18.03.2025 15:17 — 👍 40 🔁 13 💬 1 📌 3

This work is led by the wonderful @rajmovva.bsky.social and @kennypeng.bsky.social with coauthors @nkgarg.bsky.social and Jon Kleinberg. See Raj’s full thread for details, Python package, and project website!

bsky.app/profile/rajm...

18.03.2025 18:26 — 👍 2 🔁 0 💬 1 📌 0

HypotheSAEs outperforms strong LLM baselines, generates new discoveries even on well-studied datasets, and comes with easy-to-use code.

We hope this will be helpful not just to CS folks, but to many in social/health sciences - please reshare to help reach them.

18.03.2025 18:26 — 👍 1 🔁 0 💬 1 📌 0

We have a new method, HypotheSAEs, for identifying *interpretable text features that predict a target variable* (aka hypothesis generation).

What features of a headline predict engagement?

What features of a clinical note predict whether a patient will develop cancer?

1/

18.03.2025 18:26 — 👍 8 🔁 1 💬 1 📌 0
