
Maëva L'Hôtellier

@maevalhotellier.bsky.social

Studying learning and decision-making in humans | HRL team - ENS Ulm

184 Followers  |  159 Following  |  14 Posts  |  Joined: 16.11.2024

Latest posts by maevalhotellier.bsky.social on Bluesky

‼️New preprint‼️
There does not seem to be an effect of ghrelin on risky decision-making in probability discounting. Not in behaviour, underlying computational processes, or neural activity.
More details ⬇️

22.10.2025 08:11 — 👍 7    🔁 3    💬 2    📌 1

Interoception vs. Exteroception: Cardiac interoception competes with tactile perception, yet also facilitates self-relevance encoding https://www.biorxiv.org/content/10.1101/2025.06.25.660685v1

28.06.2025 00:15 — 👍 15    🔁 11    💬 0    📌 0

Lucky for you, lazy people at #RLDM2025: two of the best posters have apparently been put side by side. Go check the posters by @maevalhotellier.bsky.social and @constancedestais.bsky.social!

11.06.2025 09:20 — 👍 31    🔁 4    💬 0    📌 0

🧵 New preprint out!
📄 "Elucidating attentional mechanisms underlying value normalization in human reinforcement learning"
👁️ We show that visual attention during learning causally shapes how values are encoded
w/ @sgluth.bsky.social & @stepalminteri.bsky.social
🔗 doi.org/10.31234/osf...

22.04.2025 16:57 — 👍 15    🔁 6    💬 1    📌 1

🚨 New preprint on bioRxiv!

We investigated how the brain supports forward planning & structure learning during multi-step decision-making using fMRI 🧠

With A. Salvador, S. Hamroun, @mael-lebreton.bsky.social & @stepalminteri.bsky.social

📄 Preprint: submit.biorxiv.org/submission/p...

16.04.2025 10:03 — 👍 23    🔁 9    💬 2    📌 3
Neurons in auditory cortex integrate information within constrained temporal windows that are invariant to the stimulus context and information rate
Much remains unknown about the computations that allow animals to flexibly integrate across multiple timescales in natural sounds. One key question is whether multiscale integration is accomplished by...

@magdalenasabat.bsky.social used 🔌 ephys to show that ferret auditory cortex neurons integrate sounds within fixed windows (~15–150 ms) that increase in non-primary auditory cortex, independent of information rate.
▶️ www.biorxiv.org/content/10.1...
#Neuroscience

17.02.2025 13:39 — 👍 2    🔁 2    💬 1    📌 1

🎉 I'm excited to share that 2 of our papers got accepted to #RLDM2025!

📄 NORMARL: A multi-agent RL framework for adaptive social norms & sustainability.
📄 Selective Attention: When attention helps vs. hinders learning under uncertainty.

Grateful to my amazing co-authors! *-*

16.02.2025 16:52 — 👍 15    🔁 7    💬 1    📌 0

🚨 Finally out! My new paper in the Annual Review of Psychology (@annualreviews.bsky.social):
www.annualreviews.org/content/jour...
We unpack why psychological theories of generalization keep cycling between rigid rule-based models and flexible similarity-based ones, before culminating in Bayesian hybrids. Let's break it down 👉 🧵

10.02.2025 14:46 — 👍 118    🔁 41    💬 4    📌 2

Epistemic biases in human reinforcement learning: behavioral evidence, computational characterization, normative status and possible applications.

A quite self-centered review, but with a broad introduction and conclusions and very cool figures.

A few main takes will follow.

osf.io/preprints/ps...

23.01.2025 15:47 — 👍 38    🔁 19    💬 1    📌 0

Link to the preprint:
osf.io/preprints/ps...

10.12.2024 18:14 — 👍 2    🔁 0    💬 0    📌 0

Questions or thoughts? Let’s discuss!
Reach out — we’d love to hear from you! 🙌

10.12.2024 18:02 — 👍 0    🔁 0    💬 1    📌 0

Why does it matter? 🤔
Our work aims to bridge cognitive science and machine learning, showing how human-inspired principles like reward normalization can improve AI reinforcement learning systems!

10.12.2024 18:02 — 👍 2    🔁 0    💬 1    📌 0

What about Deep Decision Trees? 🌳
We further extend the RA model by integrating a temporal-difference component into the dynamic range updates. With this extension, we show that the RA model's magnitude invariance persists in multi-step tasks.

10.12.2024 18:02 — 👍 1    🔁 0    💬 1    📌 0
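One way to picture this temporal-difference extension, as a minimal sketch (the exact update in the preprint may differ, and all names and parameter values here are illustrative): normalize the reward before it enters a standard TD(0) backup, so bootstrapped values stay on a fixed [0, 1] scale at every depth of the decision tree.

```python
import numpy as np

def ra_td_update(q, s, a, r, s_next, r_min, r_max, alpha=0.3, gamma=0.9):
    """TD(0) backup on a range-normalized reward (a sketch; the preprint's
    exact formulation may differ). Because the normalized reward lies in
    [0, 1], the bootstrapped Q-values keep the same scale at every step
    of a multi-step task, whatever the raw reward magnitude."""
    r_norm = (r - r_min) / max(r_max - r_min, 1e-6)  # range normalization
    target = r_norm + gamma * np.max(q[s_next])      # TD target on the normalized scale
    q[s, a] += alpha * (target - q[s, a])            # standard delta-rule update

# Usage on a toy 3-state, 2-action task (illustrative values):
q = np.zeros((3, 2))
ra_td_update(q, s=0, a=1, r=15.0, s_next=1, r_min=0.0, r_max=20.0)
```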

With this enhanced model, we generalize the main findings to other bandit settings: The dynamic RA model outperforms the ABS model in several bandit tasks with noisy outcomes, non-stationary rewards, and even multiple options.

10.12.2024 18:02 — 👍 1    🔁 0    💬 1    📌 0

Once these basic properties are demonstrated in a simplified set-up, we enhance the RA model to cope with stochastic and volatile environments by dynamically adjusting its internal range variables (Rmax/Rmin).

10.12.2024 18:02 — 👍 2    🔁 0    💬 1    📌 0
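One plausible delta-rule scheme for these dynamic range updates (an assumption for illustration; the preprint's exact rule may differ): each bound expands immediately when a reward falls outside the current range and otherwise drifts toward recent rewards, so the range can track volatile reward statistics.

```python
def update_range(r, r_min, r_max, alpha_r=0.05):
    # Expand a bound instantly when the reward escapes the current range;
    # otherwise let it drift toward the latest reward, so the range can
    # shrink back after the environment's reward scale changes.
    r_max = max(r, r_max + alpha_r * (r - r_max))
    r_min = min(r, r_min + alpha_r * (r - r_min))
    return r_min, r_max
```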

In contrast, the RA model, by constraining all rewards to a similar scale, efficiently balances exploration and exploitation without the need for task-specific adjustment!

10.12.2024 18:02 — 👍 2    🔁 1    💬 1    📌 0

Crucially, modifying the temperature (𝛽) of the softmax function does not solve the standard model's problem: it simply shifts the peak performance along the magnitude axis.
Thus, to achieve high performance, the ABS model requires tuning 𝛽 to the magnitudes at stake.

10.12.2024 18:02 — 👍 1    🔁 0    💬 1    📌 0
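A quick numeric illustration of this point, with made-up Q-values and 𝛽: choice probabilities depend only on the product 𝛽 · ΔQ, so a 𝛽 calibrated for one reward magnitude is miscalibrated for every other.

```python
import numpy as np

def softmax(q, beta):
    q = np.asarray(q, dtype=float)
    e = np.exp(beta * (q - q.max()))  # subtract the max for numerical stability
    return e / e.sum()

# The same two-option preference structure at two reward magnitudes:
print(softmax([0.1, 0.2], beta=5.0))    # ~[0.38, 0.62]: near-chance, over-exploration
print(softmax([10.0, 20.0], beta=5.0))  # ~[0.00, 1.00]: near-deterministic
# Raising beta to rescue the small-magnitude case would break the large one.
```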

Agent-Level Insights: ABS performance drops to chance, due to over-exploration with small rewards and over-exploitation with large rewards.
In contrast, the RA model maintains consistent, scale-invariant performance.

10.12.2024 18:02 — 👍 1    🔁 0    💬 1    📌 0

First, we simulate ABS and RA behavior in bandit tasks with various magnitude and discriminability levels.

As expected, the standard model is highly dependent on these task levels, while the RA model achieves high accuracy over the whole range of values tested!

10.12.2024 18:02 — 👍 1    🔁 0    💬 1    📌 0

To avoid magnitude-dependence, we propose the Range-Adapted (RA) model: RA normalizes rewards, enabling consistent representation of subjective values within a constrained space, independent of reward magnitude.

10.12.2024 18:02 — 👍 1    🔁 1    💬 1    📌 0
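A minimal sketch of such a range-adapted bandit agent (tracking the range as the observed reward extremes is an illustrative assumption, as are all parameter values; the preprint's exact model may differ):

```python
import numpy as np

def softmax(q, beta=5.0):
    e = np.exp(beta * (q - q.max()))  # stabilized softmax policy
    return e / e.sum()

def run_ra(reward_means, alpha=0.3, n_trials=500, seed=0):
    """Range-Adapted (RA) agent: rewards are rescaled into [0, 1] before
    the Q-update, so the policy sees the same value scale whatever the
    raw reward magnitude."""
    rng = np.random.default_rng(seed)
    q = np.zeros(len(reward_means))
    r_min, r_max = np.inf, -np.inf
    sd = 0.5 * (max(reward_means) - min(reward_means))  # constant discriminability across scales
    best = int(np.argmax(reward_means))
    hits = 0
    for _ in range(n_trials):
        a = rng.choice(len(q), p=softmax(q))
        r = rng.normal(reward_means[a], sd)
        r_min, r_max = min(r_min, r), max(r_max, r)      # track the observed range
        r_norm = (r - r_min) / max(r_max - r_min, 1e-6)  # range normalization
        q[a] += alpha * (r_norm - q[a])
        hits += a == best
    return hits / n_trials

# Comparable accuracy whatever the raw reward scale:
print(run_ra([0.1, 0.2]), run_ra([10.0, 20.0]))
```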

Standard reinforcement learning algorithms encode rewards in an unbiased, absolute manner (ABS), which makes their performance magnitude-dependent.

10.12.2024 18:02 — 👍 1    🔁 0    💬 1    📌 0
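To make "absolute encoding" concrete, a minimal sketch (Gaussian bandit rewards and all parameter values are illustrative assumptions, not the paper's simulation settings):

```python
import numpy as np

rng = np.random.default_rng(0)

def abs_update(q, a, r, alpha=0.3):
    # ABS: the raw reward enters the delta rule unscaled, so the learned
    # Q-values inherit whatever magnitude the task happens to use.
    q[a] += alpha * (r - q[a])

q_small, q_large = np.zeros(2), np.zeros(2)
for _ in range(500):
    a = rng.integers(2)  # sample both arms uniformly to estimate their values
    abs_update(q_small, a, rng.normal([0.1, 0.2][a], 0.05))
    abs_update(q_large, a, rng.normal([10.0, 20.0][a], 5.0))

# Same task structure, but the Q-value gap a softmax policy would see
# differs by two orders of magnitude, hence magnitude-dependent behaviour:
print(q_small[1] - q_small[0], q_large[1] - q_large[0])
```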

This work was done in collaboration with Jérémy Pérez, under the supervision of @stepalminteri.bsky.social 👥

Let's now dive into the study!

10.12.2024 18:02 — 👍 1    🔁 0    💬 1    📌 0

New preprint! 🚨

The performance of standard reinforcement learning (RL) algorithms depends on the scale of the rewards they aim to maximize.
Inspired by human cognitive processes, we leverage a cognitive bias to develop scale-invariant RL algorithms: reward range normalization.
Curious? Have a read!👇

10.12.2024 18:02 — 👍 14    🔁 7    💬 1    📌 1

🚨New preprint alert!🚨

Achieving Scale-Invariant Reinforcement Learning Performance with Reward Range Normalization.

Where we show that things we discover in psychology can be useful for machine learning.

By the amazing
@maevalhotellier.bsky.social and Jeremy Perez.
doi.org/10.31234/osf...

05.12.2024 15:22 — 👍 23    🔁 10    💬 1    📌 0
