Aaron Schein

@aaronschein.bsky.social

Assistant Professor of Statistics & Data Science at UChicago. Topics: data-intensive social science, Bayesian statistics, causal inference, probabilistic ML. Proud “golden retriever” 🦮

1,463 Followers  |  382 Following  |  64 Posts  |  Joined: 12.11.2024  |  2.2611

Latest posts by aaronschein.bsky.social on Bluesky

Post image

I am delighted to share our new #PNAS paper, with @grvkamath.bsky.social @msonderegger.bsky.social and @sivareddyg.bsky.social, on whether age matters for the adoption of new meanings. That is, as words change meaning, does the rate of adoption vary across generations? www.pnas.org/doi/epdf/10....

29.07.2025 12:31 | 👍 48    🔁 13    💬 3    📌 1
Post image

Southwest Airlines appears to have rewritten the synopses of its inflight entertainment using AI. The Chinese vice-premier isn’t even a character in this movie!

20.07.2025 19:35 | 👍 3    🔁 0    💬 0    📌 1

Also might this be the first recorded instance of "overlapping communities" in social science? A good question for @azjacobs.bsky.social.

12.06.2025 19:56 | 👍 1    🔁 0    💬 0    📌 0
Post image

Before gesturing to the good ol' days when industry didn't suck up our greatest minds, consider this article I just came across in a 1929 issue of JASA which found that 70% of statisticians had "no other [scientific] affiliation [...] possibly because of [their interest] in business enterprises".

12.06.2025 19:53 | 👍 2    🔁 0    💬 1    📌 0

I'd like to see a revival of panache and artistry in scientific prose style. Since we have to read so many papers, they should be fun and beautiful. I would also argue that this serves the goal of communication: readers will be more likely to remember a striking phrase or image.

31.03.2025 11:07 | 👍 134    🔁 22    💬 13    📌 3
Paper screenshot. Title: Addressing discretization-induced bias in demographic prediction 


Abstract: Racial and other demographic imputation is necessary for many applications, especially in auditing disparities and outreach targeting in political campaigns. The canonical approach is to construct continuous predictions (e.g. based on name and geography) and then to often discretize the predictions by selecting the most likely class (argmax), potentially with a minimum threshold (thresholding). We study how this practice produces discretization bias. For example, we show that argmax labeling, as used by a prominent commercial voter file vendor to impute race/ethnicity, results in a substantial under-count of Black voters, e.g. by 28.2% points in North Carolina. This bias can have substantial implications in downstream tasks that use such labels. We then introduce a joint optimization approach (and a tractable data-driven threshold heuristic) that can eliminate this bias, with negligible individual-level accuracy loss. Finally, we theoretically analyze discretization bias, show that calibrated continuous models are insufficient to eliminate it, and that an approach such as ours is necessary. Broadly, we warn researchers and practitioners against discretizing continuous demographic predictions without considering downstream consequences.

Now online @pnasnexus.org! Many discrimination auditing and electoral tasks use ML to predict race/ethnicity – by discretizing continuous scores. Can the discretization process cause bias in labels and downstream tasks? Yes! Led by @evandyx.bsky.social

academic.oup.com/pnasnexus/ar...

27.02.2025 19:43 | 👍 28    🔁 5    💬 1    📌 0
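
A toy illustration of the discretization effect described in the abstract above. This is not the paper's method or data: the probabilities are made up, the setting is simplified to two classes, and `argmax_count` and `expected_count` are hypothetical helper names. The sketch assumes the scores are calibrated, and shows how counting argmax labels can under-count a group whose members mostly score below 0.5, while summing the continuous scores gives the expected group size.

```python
# Hypothetical calibrated probabilities that each of 10 voters belongs to group B
# (two-class setting, so argmax labels a voter as B only when p > 0.5).
probs_b = [0.45, 0.45, 0.45, 0.45, 0.40, 0.40, 0.40, 0.40, 0.90, 0.10]


def argmax_count(probs):
    """Number of voters labeled B after argmax discretization."""
    return sum(1 for p in probs if p > 0.5)


def expected_count(probs):
    """Expected number of B voters implied by the continuous scores."""
    return sum(probs)


if __name__ == "__main__":
    print("argmax count:  ", argmax_count(probs_b))
    print("expected count:", round(expected_count(probs_b), 2))
```

In this made-up example only one voter clears the 0.5 cutoff, even though the scores imply several group B voters in expectation.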
Screenshot of top half of first page of paper. The paper is titled: "When People are Floods: Analyzing Dehumanizing Metaphors in Immigration Discourse with Large Language Models". The authors are Julia Mendelsohn (University of Chicago) and Ceren Budak (University of Michigan). The top right corner contains a visual showing the sentence "They want immigrants to pour into and infest this country". The caption says: Figure 1: Dehumanizing sentence likening immigrants to the source domain concepts of Water and Vermin via the words "pour" and "infest". 

The abstract text on the left reads: Metaphor, discussing one concept in terms of another, is abundant in politics and can shape how people understand important issues. We develop a computational approach to measure metaphorical language, focusing on immigration discourse on social media. Grounded in qualitative social science research, we identify seven concepts evoked in immigration discourse (e.g. "water" or "vermin"). We propose and evaluate a novel technique that leverages both word-level and document-level signals to measure metaphor with respect to these concepts. We then study the relationship between metaphor, political ideology, and user engagement in 400K US tweets about immigration. While conservatives tend to use dehumanizing metaphors more than liberals, this effect varies widely across concepts. Moreover, creature-related metaphor is associated with more retweets, especially for liberal authors. Our work highlights the potential for computational methods to complement qualitative approaches in understanding subtle and implicit language in political discourse.

New preprint!
Metaphors shape how people understand politics, but measuring them (& their real-world effects) is hard.

We develop a new method to measure metaphor & use it to study dehumanizing metaphor in 400K immigration tweets Link: bit.ly/4i3PGm3

#NLP #NLProc #polisky #polcom #compsocialsci

20.02.2025 19:59 | 👍 180    🔁 64    💬 6    📌 11
Post image

Upon learning that yesterday would be my last day as a program officer at the National Science Foundation, I shared this parting message with my colleagues. The next few months will be frenetic and stressful for them. Here are some things that you can do to help them with the mission ahead. (1)

19.02.2025 19:08 | 👍 2429    🔁 830    💬 69    📌 70

That’s not true in my experience (I am a researcher in the area)

13.02.2025 23:35 | 👍 2    🔁 0    💬 0    📌 0

Another thing all these tech leaders share is a strong financial incentive to publicly endorse such a belief, regardless of whether their private information supports it.

13.02.2025 16:12 | 👍 5    🔁 0    💬 1    📌 0

Ah true!

25.01.2025 00:09 | 👍 0    🔁 0    💬 0    📌 0

I don’t think the US has formally declared war since WW2. The executive has extremely loose military power, regardless of Congress.

24.01.2025 23:40 | 👍 2    🔁 0    💬 1    📌 0

Whoa! What language? And were the dtypes of n and y different?

06.01.2025 22:48 | 👍 2    🔁 0    💬 1    📌 0
Title and Abstract from article in the American Political Science Review. 

Title: The Vietnam Draft Lottery and Whites’ Racial Attitudes: Evidence from the General Social Survey

by DONALD P. GREEN Columbia University, and OLIVER HYMAN-METZGER Columbia University

Abstract: The Vietnam Draft Lotteries, which randomly assigned men to military service, enable researchers to assess the long-term effects of interracial contact on racial attitudes. Using a new draft status indicator for respondents to the General Social Surveys 1978–2021, we show that white men who were selected for the draft subsequently expressed less negative attitudes toward Black people and toward policies designed to help them. These effects are apparent only for cohorts that were actually drafted into service, suggesting that interracial contact during military service led to attitude change. These findings have important implications for theories of political socialization and prejudice reduction.

New study looks at Vietnam Draft Lotteries to test effects of “interracial contact on racial attitudes.” Finds “white men who were selected for the draft subsequently expressed less negative attitudes toward Black people and toward policies designed to help them.” www.cambridge.org/core/service...

20.12.2024 16:54 | 👍 71    🔁 19    💬 2    📌 3

Cool! Did their use of “object-oriented” refer to the software or to the math? (Perhaps it is hard to disentangle those in this case…)

10.12.2024 22:36 | 👍 2    🔁 0    💬 0    📌 0

I really like the phrase “object-oriented statistics”, which I think @stat110.bsky.social may have coined. Similar to that is “modular statistics” which Matthew Stephens likes to say.

10.12.2024 19:34 | 👍 7    🔁 0    💬 2    📌 1

Come check out our posters at @neuripsconf.bsky.social this week!

Excited about two new works on optimization x fairness, read more below ⬇️

I won’t be there, but my co-authors will :)

09.12.2024 20:21 | 👍 6    🔁 1    💬 1    📌 0

There must be a joke here involving tails, but I seem to be memoryless at the moment and unable to supply one

06.12.2024 03:42 | 👍 2    🔁 0    💬 0    📌 0
Preview
FBI Warns iPhone And Android Users: Stop Sending Texts
US officials urge citizens to use encrypted messaging and calls wherever they can; here’s what you need to know.

Calling all polarization researchers…

“While messaging Android to Android or iPhone to iPhone is secure, messaging from one to the other is not.”

www.forbes.com/sites/zakdof...

06.12.2024 03:34 | 👍 2    🔁 0    💬 0    📌 0

Not keto friendly

06.12.2024 01:04 | 👍 0    🔁 0    💬 0    📌 0
Post image

Been a while since I made dinosaur sourdough #breadsky

05.12.2024 22:28 | 👍 18    🔁 0    💬 1    📌 0

hi

04.12.2024 16:46 | 👍 11    🔁 2    💬 3    📌 0

Thank you Nature and @anilananth.bsky.social for this great feature on LLMs and AGI (and for highlighting our work arxiv.org/abs/2406.03689)

04.12.2024 16:59 | 👍 10    🔁 4    💬 0    📌 0

🙋🏼‍♂️

04.12.2024 16:30 | 👍 0    🔁 0    💬 0    📌 0

Correct

02.12.2024 14:06 | 👍 1    🔁 0    💬 0    📌 0
An automated email that goes out every year from the machine learning conference NeurIPS, advising registrants to update their timezone on the conference website.

Happy holidays to all who celebrate “NeurIPS Update Your Timezone”

02.12.2024 13:39 | 👍 8    🔁 0    💬 1    📌 0
Post image

Save the Date!

It is our pleasure to share that the 2025 Midwest ML Symposium will be held at the University of Chicago, June 23-24, 2025!

Please stay tuned for further information about registration, accommodation, and transportation on the conference website midwest-ml.org/2025/

27.11.2024 14:58 | 👍 17    🔁 9    💬 0    📌 0

Seems like the biggest departure from assumptions is that there is no cost to setting up an account on both networks

24.11.2024 21:39 | 👍 0    🔁 0    💬 0    📌 0

Well it’s still nice, I’m not complaining

24.11.2024 03:56 | 👍 0    🔁 0    💬 0    📌 0
