Nir Grinberg

@nirg.bsky.social

Assistant prof. at BGU in the field of Computational Social Science.

179 Followers  |  174 Following  |  20 Posts  |  Joined: 25.10.2023  |  2.2489

Latest posts by nirg.bsky.social on Bluesky

Can social media detect economic shocks before official data does?
A new PNAS Nexus study led by @nirg.bsky.social and Samuel Fraiberger shows that AI models tracking job-loss disclosures on social media can predict U.S. unemployment insurance claims up to two weeks early,

09.01.2026 12:33 - 👍 0    🔁 1    💬 1    📌 0

Credit is also due to @davidlazer.bsky.social for prompting Sam & me to think about this problem 7(!) years ago ;)

12/fin

13.01.2026 12:49 - 👍 2    🔁 0    💬 0    📌 0

Kudos to my wonderful co-authors Do Lee linkedin.com/in/do-lee and
@manueltonneau.bsky.social (both on the job market!), Boris Sobol il.linkedin.com/in/boris-sobol and Sam Fraiberger samuelfraiberger.com.

11/N

13.01.2026 12:49 - 👍 0    🔁 0    💬 1    📌 0

Yet platform data-access policies increasingly block this potential. Whether platforms or regulators will enable change in the coming years is a core policy question.

10/N

13.01.2026 12:49 - 👍 1    🔁 0    💬 1    📌 0

There is clear public value here, potentially extending to other countries, especially where official statistical systems are under-developed.

9/N

13.01.2026 12:49 - 👍 0    🔁 0    💬 1    📌 0

Why does this matter?

Beyond forecasting, this approach can provide early warnings, surface local labor market stress hidden by national averages, and help flag measurement issues in real time.

8/N

13.01.2026 12:49 - 👍 1    🔁 0    💬 1    📌 0

Key finding 3:

This also works at the state and city (!) level, including "holdout cities" where official UI numbers are sparse or irregularly updated.

As expected, accuracy scales with platform penetration and unemployment shocks.

7/N

13.01.2026 12:49 - 👍 1    🔁 0    💬 1    📌 0

Key finding 2:

Our approach consistently outperforms industry consensus forecasts and can improve predictions of US UI claims up to two weeks ahead of official releases.

That’s two weeks of additional lead time for policymakers.

6/N

13.01.2026 12:49 - 👍 1    🔁 0    💬 1    📌 0

Key finding 1:

Capturing linguistic diversity matters.

Training LLMs with active learning lets us detect many more ways people talk about job loss, producing a far more representative sample of unemployed users than existing approaches.

5/N

13.01.2026 12:49 - 👍 1    🔁 0    💬 1    📌 0
Multilingual Detection of Personal Employment Status on Twitter Manuel Tonneau, Dhaval Adjodah, Joao Palotti, Nir Grinberg, Samuel Fraiberger. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

We combine JoblessBERT (an encoder LLM developed in previous work aclanthology.org/2022.acl-lon... which detects ~3× more employment-related content without sacrificing precision) with post-stratification using inferred demographics to correct for platform bias.

4/N

13.01.2026 12:49 - 👍 0    🔁 0    💬 1    📌 0
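
For readers unfamiliar with post-stratification, here is a minimal sketch of the general idea, not the paper's actual pipeline. It assumes each sampled user falls into one demographic cell and that the population share of each cell is known (for example from census data). The function name `poststratify`, the cell labels, and the numbers are all hypothetical.

```python
from collections import defaultdict


def poststratify(sample, population_shares):
    """Reweight per-cell sample means by known population shares.

    sample: iterable of (cell, outcome) pairs, e.g. ("young", 1).
    population_shares: dict mapping cell -> share of the target population.
    Cells with no sampled users are dropped and the remaining shares are
    renormalized (a simplifying assumption of this sketch).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell, outcome in sample:
        totals[cell] += outcome
        counts[cell] += 1

    covered = [cell for cell in population_shares if counts[cell] > 0]
    norm = sum(population_shares[cell] for cell in covered)
    if norm == 0:
        raise ValueError("no sampled cell overlaps the population cells")

    return sum(
        population_shares[cell] / norm * totals[cell] / counts[cell]
        for cell in covered
    )


# Hypothetical platform sample that over-represents young users.
# outcome = 1 means the user disclosed a job loss.
sample = [("young", 1), ("young", 1), ("young", 0), ("old", 0)]
shares = {"young": 0.5, "old": 0.5}

raw_rate = sum(outcome for _, outcome in sample) / len(sample)
adjusted_rate = poststratify(sample, shares)
print(f"raw: {raw_rate:.3f}, post-stratified: {adjusted_rate:.3f}")
```

In this toy sample the raw rate is pulled upward by the over-sampled group, and reweighting each cell to its population share corrects for that skew.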

So we ask a hard question economic actors and policymakers rightly worry about:

Can skewed social media data be turned into trustworthy indicators of unemployment?

Can we produce robust predictions across geography ✅, time ✅, demography ✅, and forecasting horizon ✅?

3/N

13.01.2026 12:49 - 👍 0    🔁 0    💬 1    📌 0

Why this matters:

In March 2020, weekly unemployment insurance claims jumped from 278K to nearly 6 million in two weeks.

As official data lagged, policymakers were flying blind about where the shock was hitting and who was being affected.

2/N

13.01.2026 12:49 - 👍 0    🔁 0    💬 1    📌 0
Can social media reliably estimate unemployment? Abstract. Digital trace data hold tremendous potential for measuring policy-relevant outcomes in real-time, yet its reliability is often questioned. Here,

New paper out in @pnasnexus.org:

We show how skewed social media data can still be used to reliably estimate unemployment, not just nationally but down to the city level. 📈

doi.org/10.1093/pnas...

1/N

13.01.2026 12:49 - 👍 11    🔁 4    💬 1    📌 0

Introducing “DomainDemo: a dataset of domain-sharing activities among different demographic groups on Twitter.”

Today, we release five derived metrics of over 129,000 domains, quantifying their characteristics such as geographical reach and audience partisanship.

1/3

17.01.2025 15:40 - 👍 15    🔁 5    💬 1    📌 4
Close to Human-Level Agreement: Tracing Journeys of Violent Speech in Incel Posts with GPT-4-Enhanced Annotations

Figure 1: Linear Regression between time and share of violent posts.

Figure 2: Linear Regression between time and category of directedness.

Incels (involuntary celibates) are increasingly using violent language, particularly non-directed violent language, in the largest incel forum, finds @danielmatter.bsky.social @miriamschirmer.bsky.social @nirg.bsky.social @jurgenpfeffer.bsky.social arxiv.org/abs/2401.02001

17.01.2024 17:34 - 👍 12    🔁 8    💬 0    📌 2

Awesome! We’d love to hear what you and your students think about it.

05.12.2023 15:43 - 👍 1    🔁 0    💬 0    📌 0

We are also grateful for comments received on earlier versions of this work from Diyi Liu, Eran Amsalem, @patyrossini.bsky.social, Alon Zoizner, and @orentsur.bsky.social & for funding from the European Research Council (ERC), the Israel Science Foundation (ISF), and BGU's Data Science Center.

05.12.2023 09:28 - 👍 4    🔁 1    💬 0    📌 0

Big shout-out to the people whose work enabled this research, including @sdmccabe.com @jongreen.bsky.social @davidlazer.bsky.social Magdalena Wojcieszak @jatucker.bsky.social Subhayan Mukerjee @ylelkes.bsky.social @kthorson.bsky.social @chriswells.bsky.social (pls tag others if missing).

6/

05.12.2023 09:08 - 👍 6    🔁 0    💬 1    📌 0
Sociodemographic characteristics among different political exposure types. Sample averages are marked in a gray dashed line. Ninety-five percent bootstrapped CIs are shown (mostly occluded due to their small size). CI = confidence interval.

Finally, looking at the demographic composition of consumption "types", we find that the media-oriented clusters (excluding superconsumers) have older individuals, more women, and more registered Democrats.

5/

05.12.2023 08:46 - 👍 2    🔁 1    💬 1    📌 0

Even when putting aside the more extreme "media superconsumers", the two media-oriented clusters (which are ~20% of the population) get half or more of their political content *directly* from media organizations and journalists, without any mediation from peers.

4/

05.12.2023 08:44 - 👍 2    🔁 1    💬 1    📌 0
The composition of political exposure across clusters. The share of politics curated by different actor types (y-axis) across clusters (x-axis). Darker-colored bars represent direct exposure to media organizations, journalists, politicians, OLs, and social peers. Lighter-colored bars represent indirect exposure to media organizations, journalists, politicians, or opinion leaders through social peers. OL = opinion leader.

Americans also vary in the breakdown of actors that populate their feeds, but interestingly, the bulk of the population gets half or more of their political exposure from *traditional sources*: media organizations, journalists, and politicians.

3/

05.12.2023 08:44 - 👍 2    🔁 1    💬 1    📌 0
Prototypical types of individual political exposure. Each point in panel (A) represents the political exposure of a single panel member, reduced to two dimensions using the UMAP algorithm, and colored by the cluster assignment obtained from HDBSCAN. Panel (B) shows the median number of political tweets available to individuals per day (left bars), and their percentage out of all tweets available to them on Twitter (right bars). Cluster labels and their share in the population are specified on the x-axis. Colors are consistent between the two figure panels. Ninety-five percent bootstrapped CIs are omitted from the figure due to their small magnitude, which are upper bounded by twenty-seven exposures to tweets and 0.28 percent, respectively. OL = opinion leader; CI = confidence interval; UMAP = Uniform Manifold Approximation and Projection.

People's political feeds mostly map onto 8 distinct types that vary in the amount of politics they get, both in absolute #'s and as a % of the feed as a whole. Still, for nearly 90% of the population, about 1 in 12 posts from their network are political. Quite an engaged public!

2/

05.12.2023 08:42 - 👍 3    🔁 1    💬 1    📌 0
🚨New paper🚨 out in the International Journal of Press/Politics w/ Assaf Shamir and @jenny-oser.bsky.social 🎉

Here's what we learned from studying the composition of political content available to 600k+ registered U.S. voters on Twitter during the 2020 election.

doi.org/10.1177/1940...
🧵👇

05.12.2023 08:38 - 👍 20    🔁 11    💬 1    📌 2