
Rushiv Arora

@rushivarora.bsky.social

ML Research Scientist at Dell AI by day, RL Researcher at night https://rushivarora.github.io

145 Followers  |  83 Following  |  17 Posts  |  Joined: 13.01.2025

Latest posts by rushivarora.bsky.social on Bluesky

Rushiv has been doing this neat work trying to understand how to bring language into multi-task learning.

09.10.2025 13:34 — 👍 6    🔁 2    💬 0    📌 0

Special thanks to @eugenevinitsky.bsky.social for being an amazing mentor!

08.10.2025 15:19 — 👍 0    🔁 0    💬 0    📌 0

2. LEXPOL's performance benchmarks against previous methods on MetaWorld
3. A combination of LEXPOL with the previous natural-language-based state embedding algorithm, yielding a joint method combining state and action factorization

08.10.2025 15:19 — 👍 0    🔁 0    💬 1    📌 0

Paper Highlights:
1. Qualitative analysis of LEXPOL (end-to-end learning) and frozen pre-trained single-task policies. We note that LEXPOL successfully disentangles tasks into fundamental skills and learns to combine them without requiring a decomposition into primitive actions.

08.10.2025 15:19 — 👍 1    🔁 0    💬 1    📌 0

LEXPOL is inspired by our ability to combine multiple different sub-skills together to solve larger tasks based on context. It works by factorizing the complexity of multi-task reinforcement learning into smaller learnable pieces.

08.10.2025 15:19 — 👍 2    🔁 0    💬 1    📌 0

This matters because multi-task RL is hard: a single monolithic policy must entangle many skills. LEXPOL factorizes control into smaller learnable pieces and uses language as the router for composition.

08.10.2025 15:19 — 👍 1    🔁 0    💬 1    📌 0

The idea: give the agent a natural-language task description ("push the green button") and let a learned language gate blend or select among several sub-policies (skills) based on context. One shared state; multiple policies; a gating MLP guided by language embeddings chooses the action.

08.10.2025 15:19 — 👍 2    🔁 0    💬 1    📌 0
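The gating idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's architecture: the dimensions, the linear sub-policies, and the single-layer gate are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical sizes; the real architecture and dimensions come from the paper.
state_dim, lang_dim, act_dim, n_skills = 8, 16, 4, 3

# Each sub-policy ("skill") maps the shared state to an action; linear for brevity.
skills = [rng.normal(size=(act_dim, state_dim)) for _ in range(n_skills)]

# Language gate: a sentence embedding produces mixture weights over the skills.
W_gate = rng.normal(size=(n_skills, lang_dim))

def act(state, lang_embedding):
    weights = softmax(W_gate @ lang_embedding)       # language routes among skills
    actions = np.stack([W @ state for W in skills])  # each skill proposes an action
    return weights @ actions                         # blended action

state = rng.normal(size=state_dim)
lang = rng.normal(size=lang_dim)  # stand-in for an embedding of "push the green button"
print(act(state, lang).shape)     # (4,)
```

With a hard arg-max in place of the softmax blend, the same gate selects a single skill instead of mixing them.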
Multi-Task Reinforcement Learning with Language-Encoded Gated Policy Networks Multi-task reinforcement learning often relies on task metadata -- such as brief natural-language descriptions -- to guide behavior across diverse objectives. We present Lexical Policy Networks (LEXPO...

I'm very excited to share my new paper introducing LEXical POLicy Networks (LEXPOL) for Multi-Task Reinforcement Learning!



In LEXPOL, language acts as a gate that routes among reusable sub-policies (skills) to solve diverse tasks.
Paper: arxiv.org/abs/2510.06138

08.10.2025 15:19 — 👍 11    🔁 2    💬 2    📌 1

Excited to share that an abstract of this paper was accepted at RLDM!

I'm genuinely excited to attend the conference; it is one of my favorite venues for interdisciplinary discussions!

19.02.2025 03:31 — 👍 1    🔁 0    💬 0    📌 0

This is a cause that is very close to my heart, and I am glad to have found the NOCC. They do a lot of important work for patients and their caregivers.

12.02.2025 03:21 — 👍 0    🔁 0    💬 0    📌 0

My grandmother passed away from ovarian cancer in 1995. I never got the chance to meet her, but I have always felt extremely connected with her through the countless stories and family heirlooms my parents have shared with me.

12.02.2025 03:21 — 👍 0    🔁 0    💬 1    📌 0
2025 TCS New York City Marathon - Rushiv Arora My grandmother passed away from ovarian cancer in 1995. I never got the chance to meet her, but I have always felt extremely connected with her, and all my ancestors, through the countless stories and...

I am thrilled to announce that I am running the 2025 New York City Marathon for the National Ovarian Cancer Coalition.

I would be grateful if you would consider donating or sharing the message to help me fundraise for this very important cause! Every share counts!

p2p.onecause.com/nycmarathon2...

12.02.2025 03:21 — 👍 1    🔁 0    💬 1    📌 0

This paper is personally meaningful to me: it is my first solo-authored paper (and my 7th overall)!

I got the idea in 2023 in the final months of my master's but didn't have the chance to work on it until last year. I am thrilled to finally publish it!

13.01.2025 16:04 — 👍 0    🔁 0    💬 0    📌 0

Read the paper to explore how H-UVFAs advance scalable and reusable skills in RL! #ReinforcementLearning #MachineLearning #AI

13.01.2025 16:04 — 👍 0    🔁 0    💬 1    📌 0

- Outperforming UVFAs: In hierarchical settings, H-UVFAs achieve superior performance and generalization compared to UVFAs. In fact, UVFAs failed to learn in some settings.

- Learning in both supervised and reinforcement learning contexts.

13.01.2025 16:04 — 👍 1    🔁 0    💬 1    📌 0

Core Contributions:
- Hierarchical Embeddings: We show that it is possible to break hierarchical value functions down into their core elements by leveraging higher-order decomposition methods from mathematics, such as Tucker decompositions.

- Zero-shot generalization: H-UVFAs can extrapolate to new goals!

13.01.2025 16:04 — 👍 0    🔁 0    💬 1    📌 0
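The Tucker decomposition mentioned above can be illustrated with a plain-numpy HOSVD on a toy three-way tensor. The axis names and sizes here are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 3-way tensor; think of axes as (states, goals, options) purely for illustration.
V = rng.normal(size=(5, 4, 3))

def unfold(T, mode):
    """Matricize T along one axis."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T by matrix M along the given axis."""
    return np.moveaxis(np.tensordot(M, np.moveaxis(T, mode, 0), axes=1), 0, mode)

# HOSVD: one factor matrix per axis, from the left singular vectors of each unfolding.
factors = [np.linalg.svd(unfold(V, m), full_matrices=False)[0] for m in range(3)]

# Core tensor: project V onto each factor basis.
G = V
for m, U in enumerate(factors):
    G = mode_product(G, U.T, m)

# Reconstruction V = G x_1 U1 x_2 U2 x_3 U3; exact here because the ranks are full.
R = G
for m, U in enumerate(factors):
    R = mode_product(R, U, m)

print(np.allclose(R, V))  # True
```

Truncating the factor matrices to fewer columns gives the compressed low-rank form, which is what makes the decomposition useful in practice.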

We extend Universal Value Function Approximators (UVFAs) to hierarchical RL, enabling zero-shot generalization across new goals in multi-task settings while retaining the benefits of temporal abstraction.

13.01.2025 16:04 — 👍 0    🔁 0    💬 1    📌 0
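The classic UVFA factorization that this extends (two streams, with V(s, g) approximated by a dot product of state and goal embeddings) can be shown on a toy value table. The sizes and the exact-rank setup below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy value table V[s, g], built rank-2 on purpose so the factorization is exact.
phi_true = rng.normal(size=(6, 2))  # hidden state features
psi_true = rng.normal(size=(5, 2))  # hidden goal features
V = phi_true @ psi_true.T

# UVFA-style factorization: recover state and goal embeddings from the table.
U, S, Vt = np.linalg.svd(V, full_matrices=False)
k = 2
phi = U[:, :k] * S[:k]  # state embeddings, shape (6, k)
psi = Vt[:k].T          # goal embeddings, shape (5, k)

# V(s, g) is approximated by phi(s) . psi(g); exact here because V is rank 2.
approx = phi @ psi.T
print(np.allclose(approx, V))  # True
```

H-UVFAs extend this two-stream picture with additional axes for the hierarchy, which is where higher-order decompositions such as Tucker come in.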
Hierarchical Universal Value Function Approximators There have been key advancements to building universal approximators for multi-goal collections of reinforcement learning value functions -- key elements in estimating long-term returns of states in a...

First post on here and it's an exciting one! I am happy to share my new paper "Hierarchical Universal Value Function Approximators" (H-UVFAs): arxiv.org/abs/2410.08997

13.01.2025 16:04 — 👍 5    🔁 1    💬 1    📌 1
