Ben Jonathan Wagner

@benjwagner.bsky.social

I'm a postdoc with Tobias Hauser @Developmental Computational Psychiatry lab and Peter Dayan @mpicybernetics.bsky.social #Dopamine #DecisionMaking #ReinforcementLearning #ActiveInference #IntertemporalChoice #BrainExplorerApp

236 Followers  |  546 Following  |  28 Posts  |  Joined: 03.10.2023

Latest posts by benjwagner.bsky.social on Bluesky

Interacting cortico-basal ganglia-thalamocortical loops shape behavioral control through cognitive maps and shortcuts Control of behavior is often explained in terms of a dichotomy, with distinct neural circuits underlying goal-directed and habitual control, yet accumulating evidence suggests these processes are deep...

I think this paper is a good read from a neuroscience perspective on how repetition and goal-directed reward learning are intertwined, resulting in short-cuts that correspond to decision biases as a function of task complexity. www.cell.com/trends/neuro...

12.12.2025 10:42 — 👍 3    🔁 0    💬 0    📌 0

I would agree that this is an argument. However, a "flat curve" could also be the equilibrium between repetition and exploration (because deciding between arbitrary symbols without feedback is quite boring and participants start to explore). But sure, one needs to investigate...

08.12.2025 10:44 — 👍 1    🔁 0    💬 0    📌 0

In those tasks the Range model is not qualitatively different from REL. But sure, one can look at this. In general the ABS model is not bad in some datasets but repetitions become more important as task complexity increases. I wouldn't say that the signature in (c) is captured well by ABS though.

08.12.2025 10:31 — 👍 1    🔁 0    💬 0    📌 0

I am not sure if I understand this point correctly, but we also tested the model on six datasets without transfer feedback.

07.12.2025 15:27 — 👍 0    🔁 0    💬 1    📌 0

I would argue that the preference mechanism is rather associative, likely due to some form of policy compression and/or WM. In free choice, this is related to agency; however, in observational learning, this can happen without agency (e.g., via strengthening an associative context-stimulus link).

07.12.2025 15:26 — 👍 0    🔁 0    💬 1    📌 0

Dear Stefano, I appreciate the discussion. We actually present true ex-novo simulations for some tasks in the supplement of the paper. Those simulations show that when we directly compare normalization vs. repetition (via task design), the results match our empirical findings very well.

07.12.2025 15:23 — 👍 0    🔁 0    💬 1    📌 0

Finally..., thank you :)

05.12.2025 12:30 — 👍 0    🔁 0    💬 0    📌 0

Out now in Translational Psychiatry! www.nature.com/articles/s41...

28.11.2025 14:53 — 👍 42    🔁 20    💬 0    📌 1

Many thanks also to @stepalminteri.bsky.social, @sophiebavard.bsky.social, and @gjocham.bsky.social for being helpful and for promptly answering all the questions I had.

27.11.2025 18:47 — 👍 2    🔁 0    💬 1    📌 0

As a side note, I would not interpret our results as showing that relative value learning (or specific forms thereof) does not exist, but rather that it may not be the primary force behind preference biases in such tasks.

27.11.2025 18:46 — 👍 0    🔁 0    💬 1    📌 0

We therefore believe that, in the end, repetition may be a more important factor in shaping choice than previously acknowledged.

27.11.2025 18:46 — 👍 0    🔁 0    💬 1    📌 0

Of course, the idea of repetition biases is not new in RL, but to our knowledge it has not yet been shown that such a mechanism can sufficiently and consistently account for such preference biases across a range of value-learning tasks.

27.11.2025 18:46 — 👍 1    🔁 0    💬 1    📌 0

Conceptually, I think this aligns very well with work on policy compression by @lucylai.bsky.social and @gershbrain.bsky.social and recent work by @annecollins.bsky.social.

27.11.2025 18:45 — 👍 1    🔁 0    💬 1    📌 0

Notably, even when Q-value differences or objective absolute or relative value differences were absent. Overall, the impact of this repetition mechanism is larger in more complex tasks.

27.11.2025 18:45 — 👍 3    🔁 0    💬 1    📌 0

This holds both in standard analyses and in hierarchical Bayesian modeling, and importantly in settings where the two accounts make divergent predictions. We also found that post-task valuation ratings show that participants rated stimuli higher when they had been chosen more often.

27.11.2025 18:43 — 👍 1    🔁 0    💬 1    📌 0
Action repetition biases choice in context-dependent decision-making - Communications Psychology This study shows that decision biases previously attributed to value normalization (e.g. relative value learning or range normalization) are better explained by action repetition. Repeating an action ...

Very happy that this is out www.nature.com/articles/s44.... Together with @stefankiebel.bsky.social we show that decision biases in context-dependent decision making, previously attributed to different forms of value normalization, are very well explained by habit-like action repetition.

27.11.2025 18:42 — 👍 39    🔁 12    💬 1    📌 2
A habit and working memory model as an alternative account of human reward-based learning Nature Human Behaviour - In this study, Collins proposes an alternative dual-process (working memory and habit) model of reinforcement learning in humans.

My paper is out!
Computational modeling of error patterns during reward-based learning shows evidence that habit learning (value free!) supplements working memory in 7 human data sets.
rdcu.be/eQjLN

17.11.2025 17:18 — 👍 132    🔁 49    💬 2    📌 3

A new preprint 📝 with @tobiasuhauser.bsky.social @kenzakdr.bsky.social @benjwagner.bsky.social and Andrew Webb accompanying our cpm-toolbox.net Python modelling library - including details about our motivations, toolbox features, framework and workflows!

👉 osf.io/preprints/ps...

16.09.2025 12:37 — 👍 8    🔁 4    💬 0    📌 2
Release v0.23.18 - New Prospect Models, improved parameter management, and a few bug fixes · DevComPsy/cpm Install You can install the new release straight from the PyPi repository: pip install cpm-toolbox Added Add input validation and error handling in all cpm.optimisation.minimise methods Add test u...

New models (three variants of Prospect Theory), new features (more ways to manage parameters, more model components to use), and of course bug fixes. If you want to make your computational modelling reproducible and robust, check out and install the new version of *cpm*:

github.com/DevComPsy/cp...

03.09.2025 08:32 — 👍 8    🔁 3    💬 0    📌 1

I'm wondering, do you use ChatGPT or other AI tools at all? Or do you use them in a "critical way"?

11.07.2025 17:32 — 👍 1    🔁 0    💬 1    📌 0

@stefankiebel.bsky.social

17.06.2025 08:35 — 👍 0    🔁 0    💬 0    📌 0
Cognitive computational model reveals repetition bias in a sequential decision-making task - Communications Psychology Using a sequential decision making task and cognitive modeling, we show that human decisions are best explained by a combination of repetition bias and goal directed reward-based behavior.

Using a sequential decision-making task and cognitive modelling, this study shows that human decisions are best explained by a combination of repetition bias and goal-directed reward-based behavior.
@benjwagner.bsky.social
www.nature.com/articles/s44...

16.06.2025 06:51 — 👍 12    🔁 4    💬 1    📌 0
Cognitive computational model reveals repetition bias in a sequential decision-making task - Communications Psychology Using a sequential decision making task and cognitive modeling, we show that human decisions are best explained by a combination of repetition bias and goal directed reward-based behavior.

www.nature.com/articles/s44...

15.06.2025 10:21 — 👍 3    🔁 0    💬 0    📌 0

Can you post everything over here? Thank you!

18.11.2024 20:08 — 👍 1    🔁 0    💬 0    📌 0

Just deactivated my X account.

16.11.2024 13:39 — 👍 2    🔁 0    💬 0    📌 0

Participants enhanced or inhibited their habitual responses based on whether they were congruent or incongruent with goal-directed behavior.
Using drift-diffusion modeling, we found that habitual and goal-directed response tendencies interact on the level of evidence accumulation (drift-rate).

11.10.2024 14:06 — 👍 1    🔁 0    💬 0    📌 0

We discovered that the influence of a habit isn’t static; it depends on the number of repetitions of an action sequence.
🧠 Approximately 60% of participants adaptively adjusted their habitual responses according to the task context.

11.10.2024 14:04 — 👍 1    🔁 0    💬 1    📌 0
Context-Dependent Interaction Between Goal-Directed and Habitual Control Under Time Pressure Habits are an important aspect of human behaviour. Habits are reflexive, inflexible, and fast, in contrast to goal-directed behaviour which is reflective, flexible, and slow. Current theories assume t...

In our new preprint with @saschafrolich.bsky.social, @MichaelSmolka & @StefanKiebel on how "habits interact with goal-directed behavior under time pressure", we found that habitual behavior varies as a function of context and repetition 🔄 www.biorxiv.org/content/10.1...

11.10.2024 13:58 — 👍 4    🔁 0    💬 1    📌 0
Chronic Deep Brain Stimulation of the Human Nucleus Accumbens Region Disrupts the Stability of Inter... When choosing between rewards that differ in temporal proximity (intertemporal choice), human preferences are typically stable, constituting a clinically relevant transdiagnostic trait. Here we show, ...

Oh, thank you. It seems it is not working. This one should: www.jneurosci.org/content/43/4...

11.11.2023 19:09 — 👍 1    🔁 0    💬 0    📌 0

4/4 However, further research is needed to clarify a causal link.⚡

06.11.2023 15:35 — 👍 0    🔁 0    💬 0    📌 0
