
Pedro Santos

@pedrosantospps.bsky.social

PhD student at @istecnico.bsky.social working on sequential decision-making and reinforcement learning. https://ppsantos.github.io/

17 Followers  |  37 Following  |  6 Posts  |  Joined: 15.01.2025

Latest posts by pedrosantospps.bsky.social on Bluesky


Here are some photos of GAIPS member @pedrosantospps.bsky.social presenting his work at ICML 2025 in Vancouver and EWRL 2025 in Tübingen, Germany. His poster was selected as a "spotlight poster" (top 2.6% of the papers)! 🙌 Read his work here: icml.cc/virtual/2025...

03.10.2025 14:39 — 👍 1    🔁 1    💬 0    📌 0

Walking around posters at @icmlconf.bsky.social, I was happy to see some buzz around convex RL—a topic I’ve worked on and strongly believe in.

Thought I’d share a few ICML papers on this direction. Let’s dive in👇

But first… what is convex RL?

🧵

1/n

24.07.2025 13:09 — 👍 5    🔁 1    💬 1    📌 1

The paper can be found here: arxiv.org/pdf/2409.15128

03.05.2025 08:46 — 👍 0    🔁 0    💬 0    📌 0

We provide lower and upper bounds on the mismatch between the finite and infinite trials formulations for GUMDPs, as well as empirical results to support our claims, highlighting how the number of trajectories and the structure of the underlying GUMDP influence policy evaluation.

03.05.2025 08:34 — 👍 0    🔁 0    💬 1    📌 0

We show that the number of trials plays a key role in infinite-horizon GUMDPs, and the expected performance of a given policy depends, in general, on the number of trials.

03.05.2025 08:34 — 👍 0    🔁 0    💬 1    📌 0
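A minimal sketch of why the number of trials can matter, under assumptions that are not taken from the paper: a hypothetical three-state chain (start state `s0`, then absorbing state `A` or `B` with equal probability, a single action) and a hypothetical convex objective `f(d) = sum(d**2)`. The expected objective of the empirical occupancy built from `k` trajectories is computed exactly and compared with the objective of the expected occupancy.

```python
from math import comb


def f(d):
    """Hypothetical convex objective: squared L2 norm of the occupancy."""
    return sum(x * x for x in d)


def infinite_trials_objective(gamma, p_a=0.5):
    """f evaluated at the expected discounted occupancy over [s0, A, B]."""
    # s0 is visited only at t = 0, so it gets weight (1 - gamma).
    # The remaining mass gamma is split between A and B by p_a.
    d = [1 - gamma, gamma * p_a, gamma * (1 - p_a)]
    return f(d)


def finite_trials_objective(k, gamma, p_a=0.5):
    """Exact expectation of f at the empirical occupancy of k trajectories."""
    total = 0.0
    for n in range(k + 1):  # n = number of trajectories absorbed in A
        prob = comb(k, n) * p_a**n * (1 - p_a) ** (k - n)
        # Averaging k single-trajectory occupancies: each trajectory puts
        # (1 - gamma) on s0 and gamma on the absorbing state it reached.
        d_hat = [1 - gamma, gamma * n / k, gamma * (k - n) / k]
        total += prob * f(d_hat)
    return total


if __name__ == "__main__":
    gamma = 0.9
    reference = infinite_trials_objective(gamma)
    for k in (1, 2, 10, 100):
        gap = finite_trials_objective(k, gamma) - reference
        print(f"k={k:>3}  gap={gap:.6f}")
```

In this toy example the gap works out to `gamma**2 / (2 * k)`, so it shrinks as more trajectories are sampled but is nonzero for any finite `k`. With a linear `f`, as in a standard MDP, the gap would vanish.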

We contribute the first analysis of the impact of the number of trials, i.e., the number of randomly sampled trajectories, in infinite-horizon GUMDPs (considering both discounted and average formulations).

03.05.2025 08:34 — 👍 0    🔁 0    💬 1    📌 0

The general-utility Markov decision process (GUMDP) framework generalizes the MDP framework by considering objective functions that depend on the frequency of visitation of state-action pairs induced by a given policy.

03.05.2025 08:34 — 👍 1    🔁 0    💬 1    📌 0
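For readers new to the framework, the definition in the post above can be written out roughly as follows for the discounted case. The symbols ($d_\pi$, $f$, $\gamma$, $c$) follow common convention and are assumptions here, not notation quoted from the posts.

```latex
% Discounted state-action occupancy induced by a policy \pi
d_\pi(s,a) = (1-\gamma) \sum_{t=0}^{\infty} \gamma^{t} \, \Pr(s_t = s,\, a_t = a \mid \pi)

% GUMDP objective: optimize a (possibly nonlinear) function of the occupancy
\min_{\pi} \; f(d_\pi)

% A standard MDP with cost function c is the special case where f is linear
f(d) = \sum_{s,a} d(s,a) \, c(s,a)
```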

Happy to share that our paper "The Number of Trials Matters in Infinite-Horizon General-Utility Markov Decision Processes" got accepted as a spotlight poster at the International Conference on Machine Learning (ICML).

03.05.2025 08:34 — 👍 5    🔁 1    💬 2    📌 0
