Rushiv has been doing this neat work trying to understand how to bring language into multi-task learning.
09.10.2025 13:34

@rushivarora.bsky.social
ML Research Scientist at Dell AI by day, RL Researcher at night. https://rushivarora.github.io
Special thanks to @eugenevinitsky.bsky.social for being an amazing mentor!
08.10.2025 15:19

Paper Highlights:
1. Qualitative analysis of LEXPOL (end-to-end learning) and of frozen pre-trained single-task policies. LEXPOL successfully disentangles tasks into fundamental skills and learns to combine them without decomposing them into primitive actions.
2. LEXPOL's performance benchmarks against previous methods on MetaWorld.
3. A combination of LEXPOL with the earlier natural-language state-embedding algorithm, giving a joint method combining state and action factorization.
LEXPOL is inspired by our ability to combine different sub-skills to solve larger tasks based on context. It works by factorizing the complexity of multi-task reinforcement learning into smaller learnable pieces.
08.10.2025 15:19

This helps because multi-task RL is hard: a single monolithic policy must entangle many skills. LEXPOL instead factorizes control into smaller learnable pieces and uses language as the router for composition.
08.10.2025 15:19

The idea: give the agent a natural-language task description ("push the green button") and let a learned language gate blend or select among several sub-policies (skills) based on context. One shared state; multiple policies; a gating MLP guided by language embeddings chooses the action.
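A minimal sketch of this kind of language-gated mixture of sub-policies, assuming a soft blend and a frozen sentence encoder upstream (module names and sizes are my own, not the paper's exact architecture):

    import torch
    import torch.nn as nn

    class LanguageGatedPolicy(nn.Module):
        # K sub-policies read the shared state; a gating MLP reads the
        # language embedding and outputs mixture weights over the skills.
        def __init__(self, state_dim, action_dim, lang_dim, num_skills=4):
            super().__init__()
            self.sub_policies = nn.ModuleList([
                nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                              nn.Linear(64, action_dim))
                for _ in range(num_skills)])
            self.gate = nn.Sequential(nn.Linear(lang_dim, 64), nn.ReLU(),
                                      nn.Linear(64, num_skills))

        def forward(self, state, lang_emb):
            weights = torch.softmax(self.gate(lang_emb), dim=-1)   # (B, K)
            actions = torch.stack([p(state) for p in self.sub_policies],
                                  dim=1)                           # (B, K, A)
            return (weights.unsqueeze(-1) * actions).sum(dim=1)    # (B, A)

    # Usage (dims assumed, e.g. MetaWorld-like state/action sizes):
    policy = LanguageGatedPolicy(state_dim=39, action_dim=4, lang_dim=384)
    action = policy(torch.randn(8, 39), torch.randn(8, 384))

A hard gate (argmax over the weights) would select a single skill instead of blending.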
08.10.2025 15:19

I'm very excited to share my new paper introducing LEXical POLicy Networks (LEXPOL) for Multi-Task Reinforcement Learning!
In LEXPOL, language acts as a gate that routes among reusable sub-policies (skills) to solve diverse tasks.
Paper: arxiv.org/abs/2510.06138
Excited to share that an abstract of this paper was accepted at RLDM!
I'm genuinely excited to attend the conference; it is one of my favorite venues for interdisciplinary discussions!
This is a cause that is very close to my heart, and I am glad to have found the NOCC. They do a lot of important work for patients and their caregivers.
12.02.2025 03:21

My grandmother passed away from ovarian cancer in 1995. I never got the chance to meet her, but I have always felt extremely connected with her through the countless stories and family heirlooms my parents have shared with me.
12.02.2025 03:21

I am thrilled to announce that I am running the 2025 New York City Marathon for the National Ovarian Cancer Coalition.
I would be grateful if you would consider donating or sharing the message to help me fundraise for this very important cause! Every share counts!
p2p.onecause.com/nycmarathon2...
This paper is personally meaningful to me since it is my first solo-authored paper (and 7th overall!).

I got the idea in 2023, in the final months of my master's, but didn't have the chance to work on it until last year. I am thrilled to finally publish it!
13.01.2025 16:04

Read the paper to explore how H-UVFAs advance scalable and reusable skills in RL! #ReinforcementLearning #MachineLearning #AI
13.01.2025 16:04

Core Contributions:
- Hierarchical Embeddings: We show that hierarchical value functions can be broken down into their core elements by leveraging higher-order mathematical decomposition methods such as the Tucker decomposition (see the sketch below).
- Zero-shot generalization: H-UVFAs can extrapolate to new goals!
- Learning in both supervised and reinforcement learning contexts.
- Outperforming UVFAs: In hierarchical settings, H-UVFAs show better performance and generalization than UVFAs; in fact, UVFAs failed to learn in some settings.
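To make the Tucker idea concrete, here is a minimal sketch (my illustration, not code from the paper; the tensor axes and ranks are assumptions): treat a table of hierarchical values indexed by (state, goal, option) as a 3-way tensor and factor it into a small core plus one embedding matrix per axis.

    import numpy as np
    import tensorly as tl
    from tensorly.decomposition import tucker

    # Toy value tensor V[s, g, o]: 20 states x 10 goals x 5 options.
    V = tl.tensor(np.random.rand(20, 10, 5))

    # Tucker decomposition: V ~= core x1 S x2 G x3 O, yielding a
    # low-rank embedding matrix for states, goals, and options.
    core, (S, G, O) = tucker(V, rank=[4, 3, 2])

    # Reconstruct a single entry from the embeddings alone:
    s, g, o = 3, 7, 1
    approx = np.einsum('ijk,i,j,k->', core, S[s], G[g], O[o])
    print(float(approx), float(V[s, g, o]))

In an H-UVFA-style setup the embeddings would be produced by learned networks rather than read off a fixed table, which is what enables extrapolation to unseen goals.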
We extend Universal Value Function Approximators (UVFAs) to hierarchical RL, enabling zero-shot generalization across new goals in multi-task settings while retaining the benefits of temporal abstraction.
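For background: a UVFA (Schaul et al., 2015) approximates the goal-conditioned value function with separate state and goal embeddings. A hedged sketch of what a hierarchical, Tucker-style extension could look like, with an extra embedding for an option o and a core tensor C (my notation, not necessarily the paper's exact decomposition):

    V(s, g) \approx \phi(s)^{\top} \psi(g)                                      % standard UVFA
    Q(s, g, o) \approx \sum_{i,j,k} C_{ijk}\, \phi_i(s)\, \psi_j(g)\, \xi_k(o)  % hierarchical variant

Generalizing to a new goal g' then only requires computing its embedding \psi(g'); the other factors are reused.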
13.01.2025 16:04

First post on here and it's an exciting one! I am happy to share my new paper "Hierarchical Universal Value Function Approximators" (H-UVFAs): arxiv.org/abs/2410.08997
13.01.2025 16:04