Alexey Koshevoy

@alexeykoshevoy.bsky.social

PhD student at Laboratoire de Psychologie Cognitive, AMU & Institut Jean Nicod, ENS-PSL | Interested in how communication shapes languages https://alexeykosh.github.io

268 Followers  |  402 Following  |  40 Posts  |  Joined: 01.10.2023  |  2.262

Latest posts by alexeykoshevoy.bsky.social on Bluesky

thanks a lot Martin!

05.10.2025 16:41 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

thanks a lot Oleg!!

30.09.2025 13:09 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

thanks Mason!!

26.09.2025 11:13 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

thanks a lot Natalia!

25.09.2025 23:07 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

thanks Jakub! everyone has enjoyed our paper

25.09.2025 17:07 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

Huge congratulations to @alexeykoshevoy.bsky.social, who defended his PhD thesis today!! With his co-supervisor @sblldtrch.bsky.social and jury members @simonkirby.bsky.social @gboleda.bsky.social Paula Rubio Fernandez & Benjamin Spector.

25.09.2025 16:42 | 👍 17 | 🔁 2 | 💬 5 | 📌 0
We present our new preprint titled "Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation".
We quantify LLM hacking risk through systematic replication of 37 diverse computational social science annotation tasks.
For these tasks, we use a combined set of 2,361 realistic hypotheses that researchers might test using these annotations.
Then, we collect 13 million LLM annotations across plausible LLM configurations.
These annotations feed into 1.4 million regressions testing the hypotheses. 
For a hypothesis with no true effect (ground truth $p > 0.05$), different LLM configurations yield conflicting conclusions.
Checkmarks indicate correct statistical conclusions matching ground truth; crosses indicate LLM hacking -- incorrect conclusions due to annotation errors.
Across all experiments, LLM hacking occurs in 31-50% of cases even with highly capable models.
Since minor configuration changes can flip scientific conclusions, from correct to incorrect, LLM hacking can be exploited to present anything as statistically significant.

🚨 New paper alert 🚨 Using LLMs as data annotators, you can produce any scientific result you want. We call this **LLM Hacking**.

Paper: arxiv.org/pdf/2509.08825

12.09.2025 10:33 | 👍 259 | 🔁 94 | 💬 5 | 📌 19
Image of labubu doll labeled labubu next to image of spiky labubu doll labeled lakiki

::slowly stands while clapping::

10.09.2025 23:07 | 👍 1014 | 🔁 257 | 💬 6 | 📌 11

Congrats, well deserved!

04.09.2025 14:56 | 👍 1 | 🔁 0 | 💬 0 | 📌 0
A global database on blowguns with links to geography and language - Volume 7 | Evolutionary Human Sciences | Cambridge Core

New paper! ⚡ With Gabriel Aguirre and Marcelo Sánchez, looking at patterns of blowgun types and use across societies of the world. We find areal patterns, similarities mediated by cultural connections, and specific types characterizing distinct branches of the Austronesian language tree. 🎯

27.08.2025 21:37 | 👍 22 | 🔁 8 | 💬 0 | 📌 0

I am on a 6-hour train journey without air conditioning, but it's worth it because I am heading to #SLE2025! This is my first linguistics conference in a while.

25.08.2025 14:32 | 👍 5 | 🔁 0 | 💬 0 | 📌 0
Models as Prediction Machines: How to Convert Confusing Coefficients into Clear Quantities

Abstract
Psychological researchers usually make sense of regression models by interpreting coefficient estimates directly. This works well enough for simple linear models, but is more challenging for more complex models with, for example, categorical variables, interactions, non-linearities, and hierarchical structures. Here, we introduce an alternative approach to making sense of statistical models. The central idea is to abstract away from the mechanics of estimation, and to treat models as “counterfactual prediction machines,” which are subsequently queried to estimate quantities and conduct tests that matter substantively. This workflow is model-agnostic; it can be applied in a consistent fashion to draw causal or descriptive inference from a wide range of models. We illustrate how to implement this workflow with the marginaleffects package, which supports over 100 different classes of models in R and Python, and present two worked examples. These examples show how the workflow can be applied across designs (e.g., observational study, randomized experiment) to answer different research questions (e.g., associations, causal effects, effect heterogeneity) while facing various challenges (e.g., controlling for confounders in a flexible manner, modelling ordinal outcomes, and interpreting non-linear models).

Figure illustrating model predictions. On the X-axis the predictor, annual gross income in Euro. On the Y-axis the outcome, predicted life satisfaction. A solid line marks the curve of predictions on which individual data points are marked as model-implied outcomes at incomes of interest. Comparing two such predictions gives us a comparison. We can also fit a tangent to the line of predictions, which illustrates the slope at any given point of the curve.

A figure illustrating various ways to include age as a predictor in a model. On the x-axis age (predictor), on the y-axis the outcome (model-implied importance of friends, including confidence intervals).

Illustrated are 
1. age as a categorical predictor, resulting in predictions that bounce around a lot with wide confidence intervals,
2. age as a linear predictor, which forces a straight line through the data points that has a very tight confidence band, and
3. age splines, which lie somewhere in between: they smoothly follow the data but have more uncertainty than the straight line.

Ever stared at a table of regression coefficients & wondered what you're doing with your life?

Very excited to share this gentle introduction to another way of making sense of statistical models (w @vincentab.bsky.social)
Preprint: doi.org/10.31234/osf...
Website: j-rohrer.github.io/marginal-psy...

25.08.2025 11:49 | 👍 941 | 🔁 283 | 💬 49 | 📌 19

Congrats!

07.07.2025 14:59 | 👍 1 | 🔁 0 | 💬 0 | 📌 0
Experimentology cover: title and curves for distributions.

Experimentology is out today!!! A group of us wrote a free online textbook for experimental methods, available at experimentology.io - the idea was to integrate open science into all aspects of the experimental workflow from planning to design, analysis, and writing.

01.07.2025 18:25 | 👍 534 | 🔁 228 | 💬 10 | 📌 15
Want to easily scrape data from news media sites?

There's an R package for that!

paperboy

"paperboy offers writers of web scrap[ers] a clear path to publish their code & earn co-authorship on the package, while deliver[ing] news media data from many websites in a consistent format."

26.06.2025 13:46 | 👍 122 | 🔁 35 | 💬 6 | 📌 2

For some reason Bluesky doesn't work on Firefox anymore, even after I updated it. Is this the case for anyone else?

11.06.2025 09:50 | 👍 0 | 🔁 0 | 💬 0 | 📌 0
The pivot penalty in research - Nature An analysis of millions of scientific papers and patents reveals a ‘pivot penalty’ when researchers shift direction, with the impact of studies decreasing rapidly the further they move from their prev...

:: (not really) Trigger warning::

Stay in your lane, or pay a career penalty

www.nature.com/articles/s41...

31.05.2025 04:49 | 👍 49 | 🔁 14 | 💬 9 | 📌 4
Language depends on copying (e.g. of words, signs). And language in turn is needed for many other things.

When and why did our ancestors gain this ability to copy? Our (Ron, Elisa & me) archaeological reanalysis says: in the last million years. Just out:

dx.doi.org/10.1093/oxfo...

26.05.2025 10:32 | 👍 44 | 🔁 16 | 💬 2 | 📌 0
Corpus-based approaches to evolutionary dynamics in language. Abstract: Pragmatic-interactional aspects of present-day language use as well as historical language change have come to be regarded as an important source o

🚨 New publication alert!
"Corpus-based approaches to evolutionary dynamics in language" (w/ @stefanhartmann.bsky.social) out now in the Oxford Handbook of Approaches to Language Evolution. Kudos to @limorraviv.bsky.social & @cedricboeckx.bsky.social for putting this great volume together!

25.05.2025 15:56 | 👍 25 | 🔁 9 | 💬 2 | 📌 1
A Natural Language Interface to ggplot2 The ggplot2 package is the state-of-the-art toolbox for creating and formatting graphs. However, it is easy to forget how certain formatting commands are named and sometimes users find themselves aski...

you know all gen 1 pokémon, of course you could.
reminded of brandmaier.github.io/ggx/

23.05.2025 10:19 | 👍 12 | 🔁 1 | 💬 1 | 📌 0
How are humans able to make sense of time? Not with special biology but with “time tools”: ideas, practices, and artifacts that render time more concrete.

My new paper explores this vast, varied toolkit, one that makes use of knots, nuts, hands, flowers, mountains, shadows, and much more.

(link 👇)

02.05.2025 16:51 | 👍 89 | 🔁 34 | 💬 3 | 📌 1
Out now in @pnas.org! 🌹Is a rose by any other name still as roselike?🌹

We study the prevalence of iconicity (does a word look/sound like what it means?) and systematicity (are pronunciation/meaning relationships shared across multiple words?) in large datasets of ASL, English, and Spanish.

🧵1/N

23.04.2025 17:47 | 👍 18 | 🔁 9 | 💬 1 | 📌 3
Knowledge transmission, culture and the consequences of social disruption in wild elephants | Philosophical Transactions of the Royal Society B: Biological Sciences Cultural knowledge is widely presumed to be important for elephants. In all three elephant species, individuals tend to congregate around older conspecifics, creating opportunities for social transmission. However, direct evidence of social learning and ...

Knowledge transmission, culture and the consequences of social disruption in wild elephants royalsocietypublishing.org/doi/10.1098/...

05.05.2025 12:28 | 👍 17 | 🔁 11 | 💬 0 | 📌 1
Carbon majors and the scientific case for climate liability - Nature A transparent and reproducible scientific framework is introduced to formalize how trillions in economic losses are attributable to the extreme heat caused by emissions from fossil fuel companies, whi...

"Emissions linked to Chevron, the highest-emitting investor-owned company in our data, for example, very likely caused between US $791 billion and $3.6 trillion in heat-related losses over the period 1991–2020"

www.nature.com/articles/s41...

23.04.2025 19:37 | 👍 259 | 🔁 145 | 💬 6 | 📌 13

Apply for our PhD position in language acquisition / computational linguistics in Groningen until 24 April! Job ad is here:
www.rug.nl/about-ug/wor...

26.03.2025 17:22 | 👍 5 | 🔁 4 | 💬 0 | 📌 1
Psych-DS A specification for psychological datasets. JSON metadata, predictable directory structure, and machine-readable specifications for tabular datasets.

Psych-DS is (1) spellcheck for your datasets and (2) a pathway to standardizing data in our academic fields that *everyone* can learn.

And it's live RIGHT NOW!

psych-ds.github.io

(This is the announcement post I've been leading up to)

09.04.2025 19:37 | 👍 133 | 🔁 60 | 💬 9 | 📌 12
A Dataset on Linguistic Connectivity Across and Within Countries - Scientific Data

This looks useful -- A Dataset on Linguistic Connectivity Across and Within Countries
#linguistics #languages
www.nature.com/articles/s41...

08.04.2025 20:33 | 👍 18 | 🔁 5 | 💬 1 | 📌 0
New paper on misperceptions out in PNAS @pnas.org

www.pnas.org/doi/10.1073/...

Why do people overestimate the size of politically relevant groups (immigrant, LGBTQ, Jewish) and quantities (% of budget spent on foreign aid, % of refugees that are criminals)? 🧵👇

07.04.2025 12:00 | 👍 270 | 🔁 98 | 💬 12 | 📌 21
Extensive compositionality in the vocal system of bonobos Compositionality, the capacity to combine meaningful elements into larger meaningful structures, is a hallmark of human language. Compositionality can be trivial (the combination's meaning is the sum ...

Game changing study in @science.org by @berthetmelissa.bsky.social and co.

www.science.org/doi/10.1126/...

03.04.2025 21:50 | 👍 17 | 🔁 9 | 💬 1 | 📌 2

@alexeykoshevoy is following 20 prominent accounts