📢 Deadline extended!
Submit your work to EWRL, now accepting papers until June 3rd AoE.
This year, we're also offering a fast track for papers accepted at other conferences ⚡
Check the website for all the details: euro-workshop-on-reinforcement-learning.github.io/ewrl18/
26.05.2025 14:47
R1: Reinforcement Learning Meetup
Today we concluded our first R1 Reinforcement Learning meetup where I presented and we discussed the paper on AssistanceZero (by @cassidylaidlaw.bsky.social et al.)
If you're interested in joining and talking about RL, check out the meetup: max-we.github.io/R1/
24.05.2025 14:22
Super interesting, I didn't know Minecraft could be used in such a way as a GCRL training environment! I'm wondering if one could also extend the agent to use the in-game chat and how that would change the architecture.
12.04.2025 13:59
Although 5000+ participants (and counting!) after just 30 minutes would be wonderful, I swapped the form for one on fillout.com, as Google Forms seems to have a serious bot problem.
04.04.2025 15:52
R1: Reinforcement Learning Meetup
Interested in RL? I'm planning to assemble a new online meetup, focused on reinforcement learning paper discussions. You can sign up, and as soon as enough people are interested, you'll get an invitation.
More information and registration: max-we.github.io/R1/
04.04.2025 15:47
GitHub - Max-We/alphazero-tetris: An implementation of AlphaZero and MCTS with neural networks for Tetris
Open-sourced my implementation of AlphaZero and various other MCTS policies to play Tetris. In contrast to other Tetris-agents, this implementation does *not* rely on observation- or action-space simplification. It trains an agent with the same information a human has.
github.com/Max-We/alpha...
21.03.2025 15:37
If such a format ultimately ends up testing the competencies of an LLM instead of those of the students, one really should give it some thought.
21.03.2025 14:29
It's always been part of their marketing and certainly doesn't help anyone in the long run...
15.03.2025 11:02
Here's a more honest picture with 25 evaluation games (training lasted 3 days, but can be scaled up a lot more!)
06.03.2025 11:06
Tetris Rollout Viewer
🥳 50k score achieved, TetrisZero is working! Here's the viewer site with a replay (actually, the replay became so long that the site is lagging a bit, lol). Full details on the algorithm will follow once I evaluate it against AlphaZero...
max-we.github.io/tetris-zero/
06.03.2025 10:41
It's getting there! Target is a score of 50k, currently about 10k.
28.02.2025 10:13
Sweeps
Hyperparameter search and model optimization with W&B Sweeps
W&B Sweeps is a really nice way to do hyperparameter search. I haven't seen many people talk about it, but it makes the process nicely streamlined and visualized. Essentially, you just need a config file with the parameters to try, and it's ready to go.
docs.wandb.ai/guides/sweeps/
27.02.2025 11:22
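For anyone curious what such a sweep setup might look like, here is a minimal sketch. It is not the configuration used in any project mentioned here: the parameter names (`learning_rate`, `num_simulations`), the metric name `eval_score`, the project name, and the helper `grid_combinations` are all made up for illustration, and the config is written as a Python dict rather than a config file. The `launch_sweep` part needs the `wandb` package and a W&B login, so it is only defined, not called.

```python
import itertools

# Hypothetical sweep configuration; parameter names and values are illustrative.
sweep_config = {
    "method": "grid",  # try every combination of the values below
    "metric": {"name": "eval_score", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"values": [1e-4, 3e-4, 1e-3]},
        "num_simulations": {"values": [50, 100]},
    },
}


def grid_combinations(config):
    """List every parameter combination a grid search over `config` would try."""
    names = list(config["parameters"])
    value_lists = [config["parameters"][name]["values"] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


def train():
    """Placeholder training function that a sweep agent calls once per run."""
    import wandb  # imported lazily so the config can be inspected without wandb

    with wandb.init() as run:
        lr = run.config["learning_rate"]
        sims = run.config["num_simulations"]
        # Stand-in for a real training loop: log a dummy metric.
        run.log({"eval_score": sims * lr})


def launch_sweep(project="my-sweep-demo", count=6):
    """Register the sweep with W&B and run `count` trials (needs a W&B login)."""
    import wandb

    sweep_id = wandb.sweep(sweep=sweep_config, project=project)
    wandb.agent(sweep_id, function=train, count=count)
```

Calling `grid_combinations(sweep_config)` locally shows the six runs a grid search would schedule; calling `launch_sweep()` would hand the same dict to W&B.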
I found this via Scholar Inbox today. These are detailed, clear and understandable explanations + exercises to learn with. Thank you, great work!
11.02.2025 14:52
Inspired by the interface that DeepMind used for AlphaGo (sadly closed source, at least I couldn't find anything on the web about it)
29.01.2025 19:41
Working on debugging RL algorithms such as AlphaZero is hard, especially when the codebase uses just-in-time-compiled JAX. So I created a replay-viewer which visualizes an episode with all the policy statistics for a personal project. Will be open-sourced once I finish my new algorithm!
29.01.2025 19:37
finally!
26.01.2025 19:37
Is it normal to thank ChatGPT in the Acknowledgements of your paper nowadays? lol
arxiv.org/pdf/2301.01379
23.01.2025 17:44
This is a really nice alternative to (or even a better service than) _akhaliq on X for following research papers, nice!
23.01.2025 12:48
Excited to share that today our paper recommender platform www.scholar-inbox.com has reached 20k users! We hope to reach 100k by the end of the year. Lots of new features are currently being worked on and will be rolled out soon.
15.01.2025 22:03
AZB -> U.Tokyo -> Neuro-Symbolic AI, RSM @MITIBMLab, @IBMResearch. Cycling/Gymkhana/Autox. My tweets don't represent the view of my organization. https://scholar.google.com/citations?user=b4UzH5 English tweets only. JP Tweets -> twitter.com/guicho271828
https://unireps.org
Discover why, when and how distinct learning processes yield similar representations, and the degree to which these can be unified.
PhD student at UC Berkeley studying RL and AI safety.
https://cassidylaidlaw.com
This is the official account of EWRL18 - European Workshop on Reinforcement Learning
Official website: https://euro-workshop-on-reinforcement-learning.github.io/ewrl18/
Cluster of Excellence "Machine Learning: New Perspectives for Science" at University of Tübingen, Germany. Blog: https://www.machinelearningforscience.de/
Konrad-Zuse-School of Excellence in reliable Artificial Intelligence
#ZuseSchoolsAI sponsored by @daadworldwide.bsky.social, BMBF
PhD student at ETH Zurich
jonhue.github.io
https://ellis-jena.eu is developing+applying #AI #ML in #earth system, #climate & #environmental research.
Partner: @uni-jena.de, https://bgc-jena.mpg.de/en, @dlr-spaceagency.bsky.social, @carlzeissstiftung.bsky.social, https://aiforgood.itu.int
Professor, Santa Fe Institute. Research on AI, cognitive science, and complex systems.
Website: https://melaniemitchell.me
Substack: https://aiguide.substack.com/
Strengthening Europe's Leadership in AI through Research Excellence | ellis.eu
We build probabilistic #MachineLearning and #AI Tools for scientific discovery, especially in Neuroscience. Probably not posted by @jakhmack.bsky.social.
@ml4science.bsky.social, Tübingen, Germany
Professor, University of Tübingen @unituebingen.bsky.social.
Head of Department of Computer Science.
Faculty, Tübingen AI Center 🇩🇪 @tuebingen-ai.bsky.social.
ELLIS Fellow, Founding Board Member 🇪🇺 @ellis.eu.
CV, ML, Self-Driving, NLP
Official account of the University of Tübingen.
Legal notice: https://uni-tuebingen.de/impressum/
Privacy policy: https://uni-tuebingen.de/impressum/bluesky-hinweise/
This is the Tübingen research campus of the Max Planck Society in Germany. We do basic research in the fields of biology, neuroscience, and AI.
For Institute specific updates follow:
@mpicybernetics.bsky.social
@mpi-bio-fml.bsky.social
Full-Stack SE and Digital Gardener 🇩🇪🇯🇵
Designer of !Boring Software
notboring.software
Professor at NYU; Chief AI Scientist at Meta.
Researcher in AI, Machine Learning, Robotics, etc.
ACM Turing Award Laureate.
http://yann.lecun.com
We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant Claude at Claude.ai.