Thanks for the kind mention @nclairis.bsky.social ! And congrats to the authors – really interesting to see mood being explored across species
9/9 🙌 Huge thanks to @sgluth.bsky.social & @stepalminteri.bsky.social!
📄 Read the full preprint: doi.org/10.31234/osf...
💬 Feedback and discussion welcome!
#reinforcementlearning #attention #computationalneuroscience #eyetracking
🧵/end
8/9 🎯 Conclusion
Attention doesn’t just follow value – it shapes it.
We showed that:
👁️ Gaze during learning causally biases value encoding
⏱️ Stimulus-directed attention, rather than outcome-directed attention, sets the stage for how outcomes are processed
🧩 Our model offers a mechanistic account of why mid-value options are undervalued in RL
7/9 ⏱️ When does attention matter?
🏆 Best fit? The stimulus-only model – consistent across experiments.
➕ Complemented by fine-grained gaze analysis:
👉 Value computation relies mainly on fixations during stimulus presentation, with minimal contribution from outcome-related fixations.
6/9 🧩 Modeling attention in value learning
We formalized these findings in an attentional range model, where visual fixations modulate absolute value before range normalization (rough sketch after the list below).
We tested 3 versions using:
• Only stimulus fixations
• Only outcome fixations
• A weighted combination
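Not from the thread itself: a minimal Python sketch of the general idea, assuming a simple delta-rule learner in which the share of gaze on an option scales its outcome before range normalization. Function names, the linear gaze weighting, and all parameters are illustrative assumptions, not the preprint's exact equations.

```python
def attentional_range_update(values, chosen, reward, gaze_share,
                             r_min, r_max, alpha=0.3):
    """Toy delta-rule update in which gaze modulates the absolute outcome
    before range normalization (illustrative assumption, not the paper's model)."""
    attended = gaze_share * reward                 # fixations weight the absolute outcome
    spread = r_max - r_min
    normalized = (attended - r_min) / spread if spread > 0 else 0.5  # range normalization
    values[chosen] += alpha * (normalized - values[chosen])          # standard RL update
    return values
```

The three versions listed above would differ only in which fixations feed `gaze_share` (stimulus fixations, outcome fixations, or a weighted mix).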
5/9 ⬆️ Exp 2 & 3: Bottom-up manipulation
We used saliency to guide gaze to the mid-value option:
• Exp 2: Salient stimulus → higher valuation in transfer
• Exp 3: Salient outcome → no significant effect
👉 Only attention to stimuli during learning influenced value formation
4/9 ⬇️ Exp 1: Top-down manipulation
We made the high-value option unavailable on some trials – forcing attention to the mid-value one.
Result: participants looked more at the mid-value option – and later valued it more.
👉 Attention during learning altered subjective valuation.
3/9 Design
We ran 3️⃣ eye-tracking RL experiments, combining:
• A 3-option learning phase with full feedback
• A transfer test probing generalization & subjective valuation
Crucially, we manipulated attention via:
• Top-down control (Exp 1)
• Bottom-up saliency (Exp 2 & 3)
2/9 💡 Hypothesis
Could attention be the missing piece?
Inspired by the work of @krajbichlab.bsky.social, @yaelniv.bsky.social, @thorstenpachur.bsky.social and others, we asked:
👉 Does where we look during learning causally shape how we encode value?
1/9 🎨 Background
When choosing, people don’t evaluate options in isolation – they normalize values to context.
This holds in RL... but in three-option settings, people undervalue the mid-value option – something prior models fail to explain (see @sophiebavard.bsky.social).
❓Why the distortion?
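For readers new to range normalization, here is a minimal illustration of the standard formulation, in which a value is rescaled by the best and worst outcomes of its local context. This is a generic sketch, not the specific model tested in the preprint.

```python
def range_normalize(value, v_min, v_max):
    # Generic range normalization: encode a value relative to the
    # best and worst outcomes available in its context.
    return (value - v_min) / (v_max - v_min)

# In a context with outcomes 0.1, 0.5 and 1.0, the mid option maps to
# (0.5 - 0.1) / (1.0 - 0.1) ≈ 0.44 rather than its absolute value 0.5.
range_normalize(0.5, v_min=0.1, v_max=1.0)
```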
🧵 New preprint out!
📄 "Elucidating attentional mechanisms underlying value normalization in human reinforcement learning"
👁️ We show that visual attention during learning causally shapes how values are encoded
w/ @sgluth.bsky.social & @stepalminteri.bsky.social
🔗 doi.org/10.31234/osf...
Epistemic biases in human reinforcement learning: behavioral evidence, computational characterization, normative status and possible applications. 
An admittedly self-centered review, but with a broad introduction and conclusions, and very cool figures. 
A few main takeaways will follow
osf.io/preprints/ps...
New preprint! 🚨
Performance of standard reinforcement learning (RL) algorithms depends on the scale of the rewards they aim to maximize. 
Inspired by human cognitive processes, we leverage a cognitive bias to develop scale-invariant RL algorithms: reward range normalization. 
Curious? Have a read!👇
🚨New preprint alert!🚨
Achieving Scale-Invariant Reinforcement Learning Performance with Reward Range Normalization. 
Where we show that things we discover in psychology can be useful for machine learning.  
By the amazing 
@maevalhotellier.bsky.social and Jeremy Perez. 
doi.org/10.31234/osf...
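A minimal sketch of what reward range normalization could look like in a bandit-style Q-learner: rewards are rescaled by the running observed minimum and maximum before the update, so learning no longer depends on the absolute reward scale. The class, parameter names, and epsilon-greedy policy are assumptions for illustration, not the preprint's implementation.

```python
import random

class RangeNormalizedBandit:
    """Toy epsilon-greedy learner with reward range normalization
    (illustrative sketch, not the preprint's algorithm)."""

    def __init__(self, n_actions, alpha=0.1, epsilon=0.1):
        self.q = [0.0] * n_actions
        self.alpha, self.epsilon = alpha, epsilon
        self.r_min, self.r_max = float("inf"), float("-inf")

    def act(self):
        # Epsilon-greedy choice over normalized value estimates.
        if random.random() < self.epsilon:
            return random.randrange(len(self.q))
        return max(range(len(self.q)), key=lambda a: self.q[a])

    def update(self, action, reward):
        # Track the empirical reward range seen so far.
        self.r_min = min(self.r_min, reward)
        self.r_max = max(self.r_max, reward)
        spread = self.r_max - self.r_min
        # Normalize the reward to [0, 1]; any linear rescaling of the
        # rewards leaves this learning signal unchanged.
        normalized = (reward - self.r_min) / spread if spread > 0 else 0.5
        self.q[action] += self.alpha * (normalized - self.q[action])
```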