Johannes Wachs

@johanneswachs.bsky.social

Researching social computing, crowds, and networks at Corvinus University of Budapest and HUN-REN CERS. More at: https://johanneswachs.com/

202 Followers  |  275 Following  |  19 Posts  |  Joined: 25.09.2023  |  2.5249

Latest posts by johanneswachs.bsky.social on Bluesky

Will AI Choke Off the Supply of Knowledge? More people turn to ChatGPT and other large language models for answers, but they don't add to the stock of knowledge.

In a recent piece for @wsj.com, commentator @greg_ip cited a 2024 PNAS Nexus study by @johanneswachs.bsky.social + coauthors @maria-drc.bsky.social & N. Laurentsyeva showing that LLMs can be a potential substitute for human-generated data & knowledge 📚

www.wsj.com/tech/ai/will...

18.09.2025 06:44 | 👍 2 | 🔁 2 | 💬 0 | 📌 0
Heat, health, and habitats: analyzing the intersecting risks of climate and demographic shifts in Austrian districts (Scientific Reports)

In light of yet another scorching summer ☀️, new research by H. Schuster, A. Polleres, A. Anjomshoaa & @johanneswachs.bsky.social reveals how climate change 🌡️ + demographic aging 📈 intersect to shape health risks across Austrian districts.

Read the full story 👉 rdcu.be/eD5XV

04.09.2025 09:59 | 👍 3 | 🔁 1 | 💬 0 | 📌 0
Who is using AI to code? Global diffusion and impact of generative AI Generative coding tools promise big productivity gains, but uneven uptake could widen skill and income gaps. We train a neural classifier to spot AI-generated Python functions in 80 million GitHub com...

“Our conservative model finds that going from 0→30% AI share (US 2020-24) predicts 2.4% increase in commits. Using task & wage data on occupations, this implies genAI creates $9-14 billion/year in US software alone. Larger estimates of effects from RCTs imply $100 billion.”

arxiv.org/abs/2506.08945

12.06.2025 20:47 | 👍 2 | 🔁 1 | 💬 0 | 📌 0
Lots of neat stuff in this paper showing 30% of US Python commits use AI

As of the end of 2024: “the annual value of AI-assisted coding in the United States at $9.6-14.4 billion, rising to 64-96 billion if we assume higher estimates of productivity effects reported by randomized control trials”

12.06.2025 17:00 | 👍 47 | 🔁 13 | 💬 4 | 📌 2
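
A back-of-the-envelope reading of the quoted figures, sketched in Python. It assumes, and the posts do not say this, that the dollar value is simply a productivity effect multiplied by a fixed wage base. The variable names are illustrative, and the derived quantities are implications of that assumption, not numbers reported by the authors.

```python
# Figures quoted in the posts (value in billions of USD per year).
conservative_effect = 0.024          # 2.4% more commits at a 30% AI share
low_value, high_value = 9.6, 14.4    # conservative value range
rct_low, rct_high = 64.0, 96.0       # range under RCT-based productivity effects

# ASSUMPTION (not from the posts): value = productivity effect x wage base.
implied_base_low = low_value / conservative_effect
implied_base_high = high_value / conservative_effect

# Under the same assumption, the productivity effect implied by the larger range.
implied_rct_effect_low = rct_low / implied_base_low
implied_rct_effect_high = rct_high / implied_base_high

print(round(implied_base_low), round(implied_base_high))
print(round(implied_rct_effect_low, 3), round(implied_rct_effect_high, 3))
```

If the proportional assumption holds, both ends of the range point to the same RCT-based effect, which is a quick consistency check on the two quoted ranges.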
How much code now comes from AI? With @simonedaniotti.bsky.social, @xfeng.bsky.social & Frank Neffke we estimate that by end-2024 30% of Python functions pushed by US devs on GitHub are AI-generated. Adoption is rapid but diffusion lags globally. How did we do it? arxiv.org/abs/2506.08945

11.06.2025 20:23 | 👍 27 | 🔁 13 | 💬 1 | 📌 1
Thanks to my coauthors for an interesting collaboration. Here's the preprint again:
arxiv.org/abs/2506.08945
Comments welcome!

11.06.2025 20:23 | 👍 1 | 🔁 1 | 💬 1 | 📌 0

Our conservative model finds that going from 0→30% AI share (US 2020-24) predicts 2.4% increase in commits. Using task & wage data on occupations, this implies genAI creates $9-14 billion/year in US software alone. Larger estimates of effects from RCTs imply upwards of $100 billion in value / year.

11.06.2025 20:23 | 👍 3 | 🔁 0 | 💬 1 | 📌 0
Besides the adoption results, we find newer devs take up AI fastest. We see no gender gap. In fixed-effects models, higher user AI share predicts more commits, and the use of novel code libraries and library pairs. AI extends capabilities and supports exploration.

11.06.2025 20:23 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

The resulting classifier scores an out-of-sample AUC of 0.96. We applied it to 80 million commit snapshots from 2019-24, spanning tens of thousands of public repos and developers, to track how the share of AI-authored code evolves over time and across countries.

11.06.2025 20:23 | 👍 1 | 🔁 0 | 💬 1 | 📌 0
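
For readers unfamiliar with the metric: AUC is the probability that a randomly chosen positive example (here, AI-written code) gets a higher score than a randomly chosen negative one (human-written code). A minimal sketch of that definition follows. The labels and scores are made up for illustration, and this is not the authors' evaluation code.

```python
def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy held-out set: 1 = AI-written, 0 = human-written (scores are invented).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
toy_auc = auc(labels, scores)
print(toy_auc)
```

"Out-of-sample" means the scored examples were not used to fit the classifier. The pairwise loop is quadratic, so a rank-based formula would be preferable on large evaluation sets.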
First we built an AI-code detector & gathered data to train it. Human code came from 2018 Python functions & HumanEval 21/23. To create AI-written code examples we had one LLM describe each human example in English then a 2nd LLM coded that description.

11.06.2025 20:23 | 👍 3 | 🔁 1 | 💬 1 | 📌 0
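
The describe-then-regenerate step can be sketched as below. The function `build_training_pairs` and the two callables `describe` and `generate` are hypothetical stand-ins for the two LLMs. No real model API is called, and the stubs exist only so the sketch runs.

```python
from typing import Callable, List, Tuple


def build_training_pairs(
    human_functions: List[str],
    describe: Callable[[str], str],
    generate: Callable[[str], str],
) -> List[Tuple[str, int]]:
    """Return (code, label) pairs: label 0 = human-written, 1 = AI-written."""
    examples = []
    for source in human_functions:
        description = describe(source)     # first model: code -> English
        synthetic = generate(description)  # second model: English -> code
        examples.append((source, 0))
        examples.append((synthetic, 1))
    return examples


# Stand-in callables (hypothetical) so the sketch runs without any model.
def fake_describe(code: str) -> str:
    return "Describe: " + code.splitlines()[0]


def fake_generate(description: str) -> str:
    return "# generated from: " + description + "\npass"


pairs = build_training_pairs(
    ["def add(a, b):\n    return a + b"], fake_describe, fake_generate
)
print(pairs)
```

Routing each example through an English description means the synthetic function solves the same task as its human counterpart, so a detector trained on these pairs has less room to key on topic rather than authorship.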
The dynamics of leadership and success in software development teams - Nature Communications Understanding how team dynamics impact performance in collaborative environments remains an open question. Here, authors use fine-grained activity data from software projects to characterize team evol...

New paper out in Nature Communications 📌
@anetilab.bsky.social members @lgajo.bsky.social and @johanneswachs.bsky.social with @loreb92.bsky.social and Federico Battiston explore "The dynamics of leadership and success in software development teams" 👾

Link to paper ➡️ www.nature.com/articles/s41...

29.04.2025 11:42 | 👍 7 | 🔁 3 | 💬 0 | 📌 0
Happy to see this finally out in @natcomms.nature.com! 🎉

"The dynamics of leadership and success in software development teams"
www.nature.com/articles/s41...

Amazing collaboration with @lgajo.bsky.social, @johanneswachs.bsky.social and @fede7j.bsky.social

28.04.2025 11:07 | 👍 20 | 🔁 11 | 💬 0 | 📌 0
🎉 New publication in PNAS: Urban highways are barriers to social ties
www.pnas.org/doi/10.1073/...

We show in several ways that highways are physical barriers that cut opportunities for social connections in the 50 largest metropolitan areas in the US.

05.03.2025 10:21 | 👍 9 | 🔁 2 | 💬 1 | 📌 1
Map showing a highway section in red and social ties in space crossing the highway. Wherever a tie crosses the highway, there is a cross. There are 94 crosses.

🎉 New paper in PNAS: Urban highways are barriers to social ties
https://www.pnas.org/doi/10.1073/pnas.2408937122

Highways are barriers that cut opportunities for social ties. We quantify this effect by overlaying the US highway network with millions of social ties from Twitter.

05.03.2025 07:56 | 👍 219 | 🔁 126 | 💬 10 | 📌 15
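
The overlay idea, counting how many ties cross a highway, can be illustrated with a standard segment-intersection test. This is a minimal sketch on made-up planar coordinates, assuming ties are straight lines between two points and the highway is a polyline. It is not the paper's method, and all names are illustrative.

```python
def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1, 0, or -1."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def segments_cross(p1, p2, q1, q2):
    """True if segment p1-p2 properly crosses q1-q2 (mere touching is ignored)."""
    return (
        _orient(p1, p2, q1) * _orient(p1, p2, q2) < 0
        and _orient(q1, q2, p1) * _orient(q1, q2, p2) < 0
    )


def count_crossing_ties(ties, highway):
    """Count ties that cross at least one segment of the highway polyline."""
    segments = list(zip(highway, highway[1:]))
    return sum(
        any(segments_cross(a, b, h1, h2) for h1, h2 in segments)
        for a, b in ties
    )


# Invented coordinates: a two-segment highway and three ties.
highway = [(0, 0), (5, 0), (10, 2)]
ties = [((1, -1), (1, 1)), ((2, 1), (4, 3)), ((7, -1), (8, 3))]
crossings = count_crossing_ties(ties, highway)
print(crossings)
```

Real geographic data would first need projecting to planar coordinates, and a spatial index would be needed to make millions of ties tractable.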
Stylized map of Detroit (MI) showing the highway network, and the network of social connections between urban residents. The connections intersecting highways are sparser than elsewhere. Image credit Karo Berghuber (Insta: @kariot.lines)

"Urban Highways Are Barriers to Social Ties" out in PNAS!
The 1st large-scale measure of how highways weaken social connections between the communities they separate. This barrier effect is strong in the 50 largest US cities, especially for low-income Black communities.
www.pnas.org/doi/10.1073/...

05.03.2025 07:08 | 👍 114 | 🔁 40 | 💬 4 | 📌 4
3 photos of researchers

Urban highways are barriers to social connections, report by @itu.dk:
en.itu.dk/About-ITU/Pr...

05.03.2025 10:58 | 👍 17 | 🔁 2 | 💬 0 | 📌 1
The consequences of generative AI for online knowledge communities (Scientific Reports)

As for other platforms, this paper by @gburtch.bsky.social and colleagues looks at Reddit and SO. The SO results are similar to ours, but they find that activity on Reddit didn't change much.
www.nature.com/articles/s41...

14.02.2025 20:36 | 👍 0 | 🔁 0 | 💬 0 | 📌 0
The published version of that preprint has a slightly longer descriptive time series in the discussion, see below. We can't extend the counterfactual (comparing SO vs Russian and Chinese platforms) because other LLMs came out.

academic.oup.com/pnasnexus/ar...

14.02.2025 20:36 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
Large language models reduce public knowledge sharing on online Q&A platforms Abstract. Large language models (LLMs) are a potential substitute for human-generated data and knowledge resources. This substitution, however, can present

23/250 is Large language models reduce public knowledge sharing on online Q&A platforms

This makes me wonder if people are answering Stack Overflow questions with ChatGPT answers . . .

31.01.2025 22:09 | 👍 3 | 🔁 2 | 💬 1 | 📌 0

🧵🧪 New paper alert! We studied how firms consume information by analyzing online reading patterns across millions of organizations. Some fascinating patterns emerged... (1/7)

07.11.2024 10:18 | 👍 14 | 🔁 3 | 💬 1 | 📌 1
AI for science could be more impactful than chatbots. It is already helping win Nobel prizes and accelerating drug development and materials discovery.
Today we published an essay about it: why it matters, how it's happening and its implications. Here is a summary from an econ / social sci lens.

26.11.2024 10:39 | 👍 79 | 🔁 30 | 💬 2 | 📌 7
New preprint on innovation in OSS w/ Gabor Meszaros: arxiv.org/abs/2411.14894

We extract library import statements from Stack Overflow posts in 12 languages. These elementary building blocks of code appear at a slower rate as ecosystems grow. But novel combos of libraries grow linearly.

25.11.2024 12:20 | 👍 4 | 🔁 1 | 💬 1 | 📌 0
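
To make the measurement concrete, here is a minimal sketch of counting first-seen libraries and first-seen library pairs across a sequence of posts. It handles only simple Python `import` / `from ... import` lines, whereas the preprint covers 12 languages. The regex, function names, and sample posts are illustrative assumptions, not the authors' extraction code.

```python
import re
from itertools import combinations

# Matches the first top-level module on a line starting with "import" or "from".
# Limitation: "import a, b" yields only "a".
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_]\w*)", re.MULTILINE)


def libraries_in(post):
    """Top-level module names imported in a snippet of Python code."""
    return set(IMPORT_RE.findall(post))


def novelty_counts(posts):
    """Cumulative counts of first-seen libraries and first-seen pairs, per post."""
    seen_libs, seen_pairs, history = set(), set(), []
    for post in posts:
        libs = libraries_in(post)
        seen_libs |= libs
        seen_pairs |= set(combinations(sorted(libs), 2))
        history.append((len(seen_libs), len(seen_pairs)))
    return history


# Invented posts, in chronological order.
posts = [
    "import numpy as np\nimport pandas as pd",
    "from numpy import array\nimport scipy",
    "import pandas\nimport scipy",
]
history = novelty_counts(posts)
print(history)
```

In the toy data the third post introduces no new library but still adds a new pair, which is the kind of combinatorial novelty the post contrasts with the slowing arrival of new single libraries.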
We also find that most novel library imports and combinations are made by less-experienced users, suggesting how important new blood is for long-run ecosystem health.

Feedback warmly welcome!

25.11.2024 12:20 | 👍 2 | 🔁 0 | 💬 0 | 📌 0
[Mirrors results on over 200 years of novelties in US patents by Youn et al: royalsocietypublishing.org/doi/full/10.... ].

Two implications for maintenance:
- single libraries will be widely used as ecosystems grow (see plot)
- the many co-used libraries need to stay compatible with each other

25.11.2024 12:20 | 👍 1 | 🔁 0 | 💬 1 | 📌 0

@koren.mk @gaborbekes.bsky.social

18.11.2024 21:17 | 👍 5 | 🔁 0 | 💬 1 | 📌 0
Our paper on the effect of ChatGPT on activity on @stackoverflow.com.web.brid.gy is out: academic.oup.com/pnasnexus/ar...

@maria-drc.bsky.social, Nadzeya Laurentsyeva & I find a 25% decrease in activity on SO within 6 months of #ChatGPT's release vs counterfactuals.

Why does it matter?

15.11.2024 09:28 | 👍 54 | 🔁 24 | 💬 3 | 📌 3
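
One common way to express a change "vs counterfactuals" is a difference-in-differences on log activity, where a comparison platform stands in for what would have happened without ChatGPT. The sketch below assumes that framing. The post itself does not spell out the estimator, the function name is hypothetical, and the counts are invented so that the arithmetic lands on a 25% drop.

```python
import math


def did_relative_change(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences on log activity, returned as a relative change."""
    diff = (math.log(treated_post) - math.log(treated_pre)) - (
        math.log(control_post) - math.log(control_pre)
    )
    return math.exp(diff) - 1.0


# Made-up weekly post counts, chosen only to illustrate the calculation.
effect = did_relative_change(
    treated_pre=1000, treated_post=720, control_pre=500, control_post=480
)
print(round(effect, 4))
```

In this example the treated platform falls by 28% in raw terms, but the control also falls by 4%, so the estimated effect is smaller than the raw decline. The comparison is only valid while the control platforms remain unaffected.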

But the most important takeaway is that a major public source of data is rapidly shrinking. Ironically, future AI systems will miss a valuable source of data to learn from. We discuss this and implications for competition in AI and search - have a look!

15.11.2024 09:28 | 👍 4 | 🔁 0 | 💬 1 | 📌 0

In the paper we also find
- heterogeneities across programming languages
- no change in post quality
and we present supporting evidence from the SO User Survey.

We also observe further decline in posting past the point where our counterfactuals are valid.

15.11.2024 09:28 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

@johanneswachs is following 20 prominent accounts