@maevalhotellier.bsky.social
Studying learning and decision-making in humans | HRL team - ENS Ulm |
‼️New preprint‼️
There does not seem to be an effect of ghrelin on risky decision-making in probability discounting. Not in behaviour, underlying computational processes, or neural activity.
More details ⬇️
Interoception vs. Exteroception: Cardiac interoception competes with tactile perception, yet also facilitates self-relevance encoding https://www.biorxiv.org/content/10.1101/2025.06.25.660685v1
Lucky for you, lazy people at #RLDM2025, two of the best posters have apparently been put side-by-side: go check @maevalhotellier.bsky.social and @constancedestais.bsky.social posters!
🧵 New preprint out!
📄 "Elucidating attentional mechanisms underlying value normalization in human reinforcement learning"
👁️ We show that visual attention during learning causally shapes how values are encoded
w/ @sgluth.bsky.social & @stepalminteri.bsky.social
🔗 doi.org/10.31234/osf...
🚨 New preprint on bioRxiv!
We investigated how the brain supports forward planning & structure learning during multi-step decision-making using fMRI 🧠
With A. Salvador, S. Hamroun, @mael-lebreton.bsky.social & @stepalminteri.bsky.social 
📄 Preprint: submit.biorxiv.org/submission/p...
@magdalenasabat.bsky.social used 🔌 ephys to show that ferret auditory cortex neurons integrate sounds within fixed temporal windows (~15–150 ms) whose durations increase in non-primary auditory cortex, independent of information rate.
▶️ www.biorxiv.org/content/10.1...
#Neuroscience
🎉 I'm excited to share that 2 of our papers got accepted to #RLDM2025!
📄 NORMARL: A multi-agent RL framework for adaptive social norms & sustainability.
📄 Selective Attention: When attention helps vs. hinders learning under uncertainty.
Grateful to my amazing co-authors! *-*
🚨 Finally out! My new @annualreviews.bsky.social in Psychology paper:
www.annualreviews.org/content/jour...
We unpack why psych theories of generalization keep cycling between rigid rule-based models and flexible similarity-based ones, culminating in Bayesian hybrids. Let's break it down 👉 🧵
Epistemic biases in human reinforcement learning: behavioral evidence, computational characterization, normative status and possible applications. 
A quite self-centered review, but with a broad introduction and conclusions and very cool figures. 
A few main takes will follow
osf.io/preprints/ps...
Link to the preprint:
osf.io/preprints/ps...
Questions or thoughts? Let’s discuss! 
Reach out — we’d love to hear from you! 🙌
Why does it matter? 🤔
Our work aims to bridge cognitive science and machine learning, showing how human-inspired principles like reward normalization can improve reinforcement learning AI systems!
What about Deep Decision Trees? 🌳
We further extend the RA model by integrating a temporal difference component into the dynamic range updates. With this extension, we demonstrate that the magnitude-invariance capabilities of the RA model persist in multi-step tasks.
With this enhanced model, we generalize the main findings to other bandit settings: the dynamic RA model outperforms the ABS model in several bandit tasks with noisy outcomes, non-stationary rewards, and even multiple options.
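For a concrete picture, here is a rough, hypothetical sketch of how range normalization can be combined with a temporal-difference update in a multi-step task. The preprint's exact rule for integrating the TD component into the range updates may differ; this is an illustration, not the paper's algorithm.

```python
import numpy as np

def ra_q_learning_step(q, s, a, reward, s_next, r_min, r_max,
                       alpha=0.3, alpha_r=0.1, gamma=0.9):
    """One Q-learning step on a range-normalized reward (illustrative sketch).

    q            : state-action value table, shape (n_states, n_actions)
    r_min, r_max : dynamically tracked estimates of the reward range
    """
    # Pull the nearest range bound toward rewards that fall outside the current range.
    if reward > r_max:
        r_max += alpha_r * (reward - r_max)
    if reward < r_min:
        r_min += alpha_r * (reward - r_min)
    # Rescale the reward by the estimated range, then apply the usual TD update.
    r_norm = (reward - r_min) / max(r_max - r_min, 1e-8)
    td_target = r_norm + gamma * np.max(q[s_next])
    q[s, a] += alpha * (td_target - q[s, a])
    return r_min, r_max
```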
Once these basic properties are demonstrated in a simplified setup, we enhance the RA model to successfully cope with stochastic and volatile environments by dynamically adjusting its internal range variables (Rmax / Rmin).
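One hypothetical way to implement such dynamic range adjustment (the preprint's actual update rule may differ): give Rmin and Rmax their own delta rules, so the bounds expand quickly toward surprising rewards and drift back slowly, letting the range re-contract after a change point.

```python
def update_range(r_min, r_max, reward, alpha_fast=0.3, alpha_slow=0.03):
    """Delta-rule tracking of the internal range variables (Rmin, Rmax) - a sketch.

    A reward outside the current range pulls the nearest bound quickly (alpha_fast);
    otherwise both bounds drift slowly toward the reward (alpha_slow), so the range
    can shrink again when the environment becomes less extreme or changes regime.
    """
    r_max += (alpha_fast if reward > r_max else alpha_slow) * (reward - r_max)
    r_min += (alpha_fast if reward < r_min else alpha_slow) * (reward - r_min)
    return r_min, r_max
```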
In contrast, the RA model, by constraining all rewards to a similar scale, efficiently balances exploration and exploitation without the need for task-specific adjustment!
Crucially, modifying the value of the temperature (𝛽) of the softmax function does not solve the standard model's problem: it simply shifts the peak performance along the magnitude axis.
Thus, to achieve high performance, the ABS model requires tuning the 𝛽 value to the magnitudes at stake.
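A minimal numerical illustration of this point (toy values, not the preprint's simulations): with a fixed 𝛽, rescaling the learned values rescales the effective inverse temperature, so choices go from near-random at small magnitudes to near-deterministic at large ones.

```python
import numpy as np

def softmax(q_values, beta):
    """Softmax choice probabilities with inverse temperature beta."""
    z = beta * (q_values - np.max(q_values))  # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Same relative structure (best option worth twice the other), different magnitudes.
for scale in (0.1, 1.0, 10.0):
    p_best = softmax(np.array([1.0, 2.0]) * scale, beta=3.0)[1]
    print(f"scale={scale:>4}: P(choose best) = {p_best:.3f}")
# scale=0.1: ~0.57 (over-exploration); scale=1.0: ~0.95; scale=10.0: ~1.00 (over-exploitation).
# Changing beta only moves where along the magnitude axis performance peaks.
```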
Agent-Level Insights: ABS performance drops to chance, due to over-exploration with small rewards and over-exploitation with large rewards.
In contrast, the RA model maintains a consistent, scale-invariant performance.
First, we simulate ABS and RA behavior in bandit tasks with various magnitude and discriminability levels.
As expected, the standard model is highly dependent on these task levels, while the RA model achieves high accuracy over the whole range of values tested!
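For intuition, a rough way to reproduce that qualitative pattern in simulation (hypothetical task parameters and a hard running min/max for the range; the preprint's tasks and models are more detailed):

```python
import numpy as np

def run_bandit(normalize, scale, n_trials=500, alpha=0.3, beta=5.0, seed=0):
    """Two-armed bandit where arm 1 pays twice as much on average.

    Returns the fraction of trials on which the better arm was chosen.
    normalize=False -> ABS (raw rewards); normalize=True -> RA (range-normalized).
    """
    rng = np.random.default_rng(seed)
    q = np.zeros(2)
    r_min, r_max = np.inf, -np.inf
    correct = 0
    for _ in range(n_trials):
        p = np.exp(beta * (q - q.max()))
        p /= p.sum()
        arm = rng.choice(2, p=p)
        reward = scale * (0.5, 1.0)[arm] + rng.normal(0.0, 0.1 * scale)
        if normalize:
            r_min, r_max = min(r_min, reward), max(r_max, reward)
            r = (reward - r_min) / max(r_max - r_min, 1e-8)
        else:
            r = reward
        q[arm] += alpha * (r - q[arm])
        correct += int(arm == 1)
    return correct / n_trials

def mean_accuracy(normalize, scale, n_runs=20):
    return np.mean([run_bandit(normalize, scale, seed=s) for s in range(n_runs)])

for scale in (0.01, 1.0, 100.0):
    print(f"scale={scale:>6}: ABS={mean_accuracy(False, scale):.2f}  "
          f"RA={mean_accuracy(True, scale):.2f}")
# Expected qualitative pattern: ABS accuracy peaks only where beta happens to match
# the reward scale, while RA stays roughly constant across scales.
```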
To avoid magnitude-dependence, we propose the Range-Adapted (RA) model: RA normalizes rewards, enabling consistent representation of subjective values within a constrained space, independent of reward magnitude.
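In code, the core contrast between the two update rules might look like this (a minimal sketch with made-up numbers; the preprint's formal definitions may differ, e.g. in how the range is initialized and updated):

```python
import numpy as np

def abs_update(q, arm, reward, alpha=0.3):
    """Standard (ABS) delta rule: learned values inherit the scale of the raw rewards."""
    q[arm] += alpha * (reward - q[arm])

def ra_update(q, arm, reward, r_min, r_max, alpha=0.3):
    """Range-adapted (RA) rule: the reward is mapped into [0, 1] by the observed range."""
    r_min, r_max = min(r_min, reward), max(r_max, reward)
    r_norm = (reward - r_min) / max(r_max - r_min, 1e-8)
    q[arm] += alpha * (r_norm - q[arm])
    return r_min, r_max

# Same four outcomes for both learners, with rewards in the tens and hundreds.
q_abs, q_ra = np.zeros(2), np.zeros(2)
r_min, r_max = np.inf, -np.inf
for reward, arm in [(50.0, 0), (100.0, 1), (55.0, 0), (95.0, 1)]:
    abs_update(q_abs, arm, reward)
    r_min, r_max = ra_update(q_ra, arm, reward, r_min, r_max)
print(q_abs)  # ~[27.0, 49.5] -> values track the raw reward magnitude
print(q_ra)   # ~[0.03, 0.48] -> values stay bounded in [0, 1], scale-free
```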
Standard reinforcement learning algorithms encode rewards in an unbiased, absolute manner (ABS), which makes their performance magnitude-dependent.
This work was done in collaboration with Jérémy Pérez, under the supervision of @stepalminteri.bsky.social 👥
Let's now dive into the study!
New preprint! 🚨
The performance of standard reinforcement learning (RL) algorithms depends on the scale of the rewards they aim to maximize.
Inspired by human cognitive processes, we leverage a cognitive bias to develop scale-invariant RL algorithms: reward range normalization. 
Curious? Have a read!👇
🚨New preprint alert!🚨
Achieving Scale-Invariant Reinforcement Learning Performance with Reward Range Normalization. 
Where we show that things we discover in psychology can be useful for machine learning.  
By the amazing 
@maevalhotellier.bsky.social and Jeremy Perez. 
doi.org/10.31234/osf...