
Becca Cohen

@beccacohen.bsky.social

IS PhD Student at UIUC studying digital humanities, language, cultural analytics and ethical AI

414 Followers  |  469 Following  |  14 Posts  |  Joined: 17.09.2023

Posts by Becca Cohen (@beccacohen.bsky.social)

Preview
Find your next read in this dataset of international bestsellers. In most situations when I say “I need the data,” I’m referring to gossip, and it’s less of a “need” than what some would call a “messy curiosity.” But recently, I came across a Substack post analyz…

Come “poke around” our data “to find new books and authors [you’ve] never heard of!” @jamesfolta.com makes great use of the Int'l Bestsellers dataset built by @sdileonardi.bsky.social, @beccacohen.bsky.social, & @dan-sinnamon.bsky.social on @literaryhub.bsky.social. lithub.com/find-your-ne...

24.02.2026 16:10 — 👍 24    🔁 11    💬 0    📌 3

Compiled by Sean DiLeonardi, Becca Cohen, and Dan Sinykin, the International Bestsellers Dataset gathers data about international bestselling books from 2013 to 2022.

21.02.2026 20:01 — 👍 15    🔁 6    💬 0    📌 0

Stoked to see @jamesfolta.com in @literaryhub.bsky.social talking up a dataset on int'l bestsellers built by @sdileonardi.bsky.social + @beccacohen.bsky.social (I helped). See who the world reads!

One nit: Folta sez "Sinykin is a great follow on Bluesky"—LIES lithub.com/find-your-ne...

17.02.2026 20:33 — 👍 18    🔁 5    💬 3    📌 3
Preview
Publishing flows (no domestic) A Flourish data visualization by cody

When we published international bestseller data with @post45data.bsky.social all I wanted was to make a Sankey diagram but never managed, now someone has. And it looks beautiful. Check it out!

Source: substack.com/home/post/p-...
public.flourish.studio/visualisatio...

11.02.2026 22:18 — 👍 9    🔁 4    💬 0    📌 0

Excited to see the dataset being used!!

11.02.2026 21:23 — 👍 7    🔁 3    💬 0    📌 0

"I want statisticians and data scientists to be more honest and explicit about the rhetorical role of statistical inference. No one in introductory statistics tells you that they are teaching a language of persuasion with numbers."

26.01.2026 17:57 — 👍 33    🔁 6    💬 5    📌 0

Ok this sounds cool books will be written by robots evaluated by robots and sold by robots to robots!

11.01.2026 19:01 — 👍 16    🔁 6    💬 1    📌 0
Screenshot of the top of the paper "The undervaluing of elite women in physics",
Abstract: "Elite women in physics wait longer than men for recognition. Once elected to the US National Academy of Sciences, however, their prominence surges — evidence that their work was undervalued all along."
Body: "Physics has changed dramatically over the past 50 years, both in the range of phenomena it studies and its demographic composition. For example, the share of publishing women physicists worldwide has more than doubled — from 4.5% in 1970 to 9.2% in 2020 — and continues to increase. Yet parity remains distant, raising the question: to what degree has the physics community progressed toward the meritocratic ideal of recognizing scientists purely on the basis of their contributions?

We address this question in two ways. First, we compare broad trends in the productivity and prominence of men and women physicists worldwide over the past 50 years. Second, we examine the careers of elite physicists, asking whether the high honour of being elected to the US National Academy of Sciences (NAS) impacts the careers of men and women in physics differently. Our analysis leverages a recently developed Bayesian network model to estimate individual measures of scientific productivity and prominence from large-scale bibliographic data. We apply this model to a global physics collaboration network constructed from first- and last-author pairs in 9.1 million physics journal articles (omitting single-author papers) published between 1950 and 2023 and recorded in the OpenAlex bibliographic database. Within this global network, we focus on 93,456 established physicists with at least 30 years between their first and most recent publication and with at least 10 first- or last-author publications..."


Out now in @natphys.nature.com "The undervaluing of elite women in physics", with @weihuali.bsky.social and H Zheng, we show how election into prestigious academic societies has markedly different effects on the research prominence of women and men physicists /1
www.nature.com/articles/s41...

12.12.2025 15:25 — 👍 39    🔁 17    💬 2    📌 3

Making a CLEAN, SHAREABLE dataset is fucking hard! I'm super proud, then, to publish this one, on a team led by @sdileonardi.bsky.social and @beccacohen.bsky.social, with @post45data.bsky.social. It has more than a decade of 21C int'l bestseller data, revealing how popular world lit circulates....

29.07.2025 16:24 — 👍 36    🔁 9    💬 1    📌 1

Hoo boy! I can’t believe it’s happening. We’ve been working on this for years. Thanks to @post45data.bsky.social you can now search our IB database, with a very cool interface!

Please share with anyone who might be interested

Special thanks to @ninasabak.bsky.social for aiding with the data source

29.07.2025 15:24 — 👍 21    🔁 8    💬 1    📌 0

A dataset long in the making!! Excited to see it finally available!

29.07.2025 16:11 — 👍 4    🔁 0    💬 0    📌 0

New dataset on bestsellers from 40+ countries, with consistent coverage for France, Germany, Spain, Italy, and the U.S.

Congrats to the authors @sdileonardi.bsky.social, @beccacohen.bsky.social, and @dan-sinnamon.bsky.social on this major contribution! 🎉

🔗: doi.org/10.18737/386...

29.07.2025 14:49 — 👍 40    🔁 23    💬 1    📌 9

Go Patrick! :)

29.07.2025 14:16 — 👍 1    🔁 0    💬 0    📌 0

🎉 New Benchmark Alert: KRISTEVA – Close‑Reading for LLMs📚

I’m excited to announce a new paper accepted to ACL 2025, in collaboration with Patrick Sui, Philippe Laban, and others!

27.07.2025 19:19 — 👍 21    🔁 11    💬 2    📌 1

Ominous?

14.05.2025 11:04 — 👍 0    🔁 0    💬 0    📌 0

“Let me out mom! I wanna sit right next to the window” -Kodiak

30.03.2025 18:59 — 👍 0    🔁 0    💬 0    📌 0

I’m going to need Mother Nature to knock it off with all these tornadoes.

How am I supposed to write my dissertation while I’m stressing out about corralling my cats into the basement?

30.03.2025 18:56 — 👍 3    🔁 1    💬 1    📌 0

Thanks Apple AI, was looking for somewhere I could get germs on sale (second image is the actual promotion notification it was summarizing)

16.02.2025 21:23 — 👍 4    🔁 0    💬 0    📌 0

I need to hear this right now, which means that others probably do too:

However you're getting through this, you're doing a good job. This month's been fucked up and intense and I know I'm not the only one who's overwhelmed. It's exhausting and cruel, and the important part is getting through 🩵

07.02.2025 17:50 — 👍 2215    🔁 498    💬 53    📌 14

10 Days to MLA! 12 Truisms About the Novel Debunked. Day 3: The author is a solitary creator. To find out what's NOT true, read @dan-sinnamon.bsky.social , @sdileonardi.bsky.social + @beccacohen.bsky.social in a special issue of Studies in the Novel: muse-jhu-edu.oregonstate.idm.oclc.org/issue/54034

30.12.2024 20:05 — 👍 27    🔁 8    💬 0    📌 1
Project MUSE - Studies in the Novel-Volume 56, Number 4, Winter 2024

& a link to the issue muse.jhu.edu/issue/54034

some contributors on bluesky who put the 🔥🔥🔥🔥🔥 in "firewall" include @krbarrett.bsky.social @3volumenovel.bsky.social @sdileonardi.bsky.social @beccacohen.bsky.social @dan-sinnamon.bsky.social @kavithaganesan76.bsky.social @dallasliddle.bsky.social

28.12.2024 23:50 — 👍 10    🔁 2    💬 0    📌 0

If Stephen King is a brand name, what’s the diff bw him and more recent authors who embrace self-branding? It was more difficult to answer this than I first thought

It was a pleasure getting here with my coauthors (& the ed.s & Post45 2023 crew). Please read and let us know what you think

19.12.2024 22:41 — 👍 11    🔁 4    💬 0    📌 0
Project MUSE - Brand Management: International Bestsellers and the Death of the Author, Again

With @sdileonardi.bsky.social and @beccacohen.bsky.social, I wrote a shortish essay on The Girl on the Train and how self-branding is changing authorship. For a fascinating special issue of Studies in the Novel ed by @megaplex.bsky.social and @sarahdallison.bsky.social muse.jhu.edu/pub/1/articl...

19.12.2024 18:48 — 👍 31    🔁 7    💬 0    📌 1

Idk you look identical to me 🤷‍♀️

07.12.2024 03:24 — 👍 2    🔁 0    💬 0    📌 0

Relatable

06.12.2024 18:26 — 👍 2    🔁 0    💬 0    📌 0

Maybe one of the reasons there are so many unhinged academics is that in order to keep your job, you’re constantly required to tout yourself as a Scholar of World-Historical Significance, even as your office’s ceiling tiles slowly dislodge themselves to fall on your head.

06.12.2024 15:12 — 👍 4841    🔁 684    💬 128    📌 105

Studies in the Novel!

04.12.2024 20:42 — 👍 1    🔁 0    💬 0    📌 0

Post the last sentence of your last article
OUT THIS MONTH! w @dan-sinnamon.bsky.social @beccacohen.bsky.social

The convergence of these two developments-Web 2.0 and globalized conglomerate publishing-with all their extra-literary ramifications, has transformed the author [into a brand manager].

04.12.2024 20:39 — 👍 5    🔁 3    💬 1    📌 0

Same here please! Also UIUC PhD at ischool doing digital humanities work :)

03.12.2024 19:42 — 👍 0    🔁 0    💬 0    📌 0
Abstract: Measures of textual similarity and divergence are increasingly used to study cultural change. But which measures align, in practice, with social evidence about change? We apply three different representations of text (topic models, document embeddings, and word-level perplexity) to three different corpora (literary studies, economics, and fiction). In every case, works by highly-cited authors and younger authors are textually ahead of the curve. We don't find clear evidence that one representation of text is to be preferred over the others. But alignment with social evidence is strongest when texts are represented through the top quartile of passages, suggesting that a text's impact may depend more on its most forward-looking moments than on sustaining a high level of innovation throughout.


There are many ways to identify texts that seem ahead of their time. Our CHR 2024 paper asks which measures of textual precocity align best with social evidence about influence and change.

26.11.2024 21:25 — 👍 40    🔁 7    💬 1    📌 2