Eduardo Pignatelli

@epignatelli.com.bsky.social

Assistant professor (UK Lecturer) at @UCL. PhD at @UCL. Past architect. Previously ML Lead at @burohappold. RL, credit assignment, reward-genesis.

17 Followers  |  12 Following  |  1 Posts  |  Joined: 22.11.2024  |  1.664

Latest posts by epignatelli.com on Bluesky

Great to see BALROG on @bsky.app as well!

25.11.2024 15:00 | 👍 1    🔁 0    💬 0    📌 0

Tired of saturated benchmarks? Want scope for a significant leap in capabilities?

🔥 Introducing BALROG: a Benchmark for Agentic LLM and VLM Reasoning On Games!

BALROG is a challenging benchmark for LLM agentic capabilities, designed to stay relevant for years to come.

1/🧵

21.11.2024 16:24 | 👍 95    🔁 20    💬 4    📌 7

@epignatelli.com is following 11 prominent accounts