Maximilian Weichart

@mweichart.bsky.social

Optimism in the face of uncertainty https://maximilian-weichart.de/

6 Followers  |  35 Following  |  18 Posts  |  Joined: 23.01.2025  |  1.5517

Latest posts by mweichart.bsky.social on Bluesky

📢 Deadline extended!
Submit your work to EWRL, now accepting papers until June 3rd AoE.
This year, we're also offering a fast track for papers accepted at other conferences ⚡

Check the website for all the details: euro-workshop-on-reinforcement-learning.github.io/ewrl18/

26.05.2025 14:47 | 👍 8    🔁 6    💬 0    📌 0
R1: Reinforcement Learning Meetup

Today we concluded our first R1 Reinforcement Learning meetup where I presented and we discussed the paper on AssistanceZero (by @cassidylaidlaw.bsky.social et al.)

If you're interested in joining and talking about RL, check out the meetup 💡 max-we.github.io/R1/

24.05.2025 14:22 | 👍 1    🔁 0    💬 0    📌 0

Super interesting, I didn't know Minecraft could be used in such a way as a GCRL training environment! I'm wondering if one could also extend the agent to use the in-game chat and how that would change the architecture.

12.04.2025 13:59 | 👍 0    🔁 0    💬 0    📌 0

Although 5000+ participants (and counting!) after just 30 minutes would be wonderful, I switched out the form in favor of fillout.com, as it seems like Google Forms has a serious bot problem.

04.04.2025 15:52 | 👍 0    🔁 0    💬 0    📌 0
R1: Reinforcement Learning Meetup

Interested in RL? I'm planning to assemble a new online meetup, focused on reinforcement learning paper discussions. You can sign up, and as soon as enough people are interested, you'll get an invitation.

More information and registration: max-we.github.io/R1/

04.04.2025 15:47 | 👍 0    🔁 0    💬 1    📌 0
GitHub - Max-We/alphazero-tetris: An implementation of AlphaZero and MCTS with neural networks for Tetris

Open-sourced my implementation of AlphaZero and various other MCTS policies to play Tetris. In contrast to other Tetris agents, this implementation does *not* rely on observation- or action-space simplification. It trains an agent with the same information a human has.

github.com/Max-We/alpha...

21.03.2025 15:37 | 👍 2    🔁 1    💬 0    📌 0

If such a format ultimately ends up testing the competencies of an LLM instead of those of the students, one really should give it some thought.

21.03.2025 14:29 | 👍 1    🔁 0    💬 0    📌 0

It's always been part of their marketing and certainly doesn't help anyone in the long run...

15.03.2025 11:02 | 👍 3    🔁 0    💬 0    📌 0

Here's a more honest picture with 25 evaluation games (training lasted 3 days, but can be scaled up a lot more!)

06.03.2025 11:06 | 👍 1    🔁 0    💬 0    📌 0
Tetris Rollout Viewer

🥳 50k score achieved, TetrisZero is working! Here's the viewer site with a replay (actually, the replay became so long that the site is lagging a bit, lol). Full details on the algorithm will follow, once I evaluate it against AlphaZero...

max-we.github.io/tetris-zero/

06.03.2025 10:41 | 👍 1    🔁 1    💬 2    📌 0

It's getting there! Target is a score of 50k, currently about 10k.

28.02.2025 10:13 | 👍 3    🔁 0    💬 0    📌 0
Sweeps: Hyperparameter search and model optimization with W&B Sweeps

W&B Sweeps is a really nice way to do hyperparameter search. I haven't seen many people talk about it, but it makes the process nicely streamlined and visualized. Essentially, you just need a config file with the parameters to try, and it's ready to go.

docs.wandb.ai/guides/sweeps/

27.02.2025 11:22 | 👍 0    🔁 0    💬 0    📌 0
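
As a rough illustration of the "config file with the parameters to try" mentioned in the W&B Sweeps post above, here is a minimal sketch in Python. The parameter names, ranges, metric name, and project name are all hypothetical and do not come from the post; only `wandb.sweep`, `wandb.agent`, `wandb.init`, and the general shape of the sweep configuration are taken from the W&B library. The training function is a placeholder, not the author's code.

```python
# Hypothetical sweep configuration: every parameter name and range below is
# an illustrative assumption, not something taken from the original post.
sweep_config = {
    "method": "random",  # W&B also supports "grid" and "bayes"
    "metric": {"name": "eval_score", "goal": "maximize"},
    "parameters": {
        "learning_rate": {
            "distribution": "log_uniform_values",
            "min": 1e-5,
            "max": 1e-2,
        },
        "batch_size": {"values": [64, 128, 256]},
        "num_simulations": {"values": [16, 32, 64]},
    },
}


def train():
    """Placeholder training function called once per sweep trial."""
    import wandb  # imported lazily so the config can be inspected without wandb

    run = wandb.init()
    lr = run.config["learning_rate"]
    batch_size = run.config["batch_size"]
    # A real training loop would go here; this sketch logs a dummy score.
    run.log({"eval_score": 0.0, "learning_rate": lr, "batch_size": batch_size})
    run.finish()


def launch(project="my-sweep-demo", count=10):
    """Register the sweep and run `count` trials in this process."""
    import wandb

    sweep_id = wandb.sweep(sweep=sweep_config, project=project)
    wandb.agent(sweep_id, function=train, count=count)
```

Calling `launch()` requires a W&B account and login; the same configuration could also be kept in a YAML file and started from the command line instead.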

I found this via Scholar Inbox today. These are detailed, clear and understandable explanations + exercises to learn with. Thank you, great work!

11.02.2025 14:52 | 👍 1    🔁 0    💬 1    📌 0

Inspired by the interface that DeepMind used for AlphaGo (sadly closed source, at least I couldn't find anything on the web about it)

29.01.2025 19:41 | 👍 0    🔁 0    💬 0    📌 0

Debugging RL algorithms such as AlphaZero is hard, especially when the codebase uses just-in-time-compiled JAX. So, for a personal project, I created a replay viewer that visualizes an episode with all the policy statistics. It will be open-sourced once I finish my new algorithm!

29.01.2025 19:37 | 👍 1    🔁 0    💬 1    📌 0

finally!

26.01.2025 19:37 | 👍 1    🔁 0    💬 1    📌 0

Is it normal to thank ChatGPT in the Acknowledgements of your paper nowadays? lol

arxiv.org/pdf/2301.01379

23.01.2025 17:44 | 👍 0    🔁 0    💬 0    📌 0
Principles | Perceiving Systems - Max Planck Institute for Intelligent Systems: Using computer vision, computer graphics, and machine learning, we teach computers to see people and understand their behavior in complex 3D scenes. We are located in Tübingen, Germany.

ps.is.mpg.de/pages/princi...

23.01.2025 13:18 | 👍 0    🔁 0    💬 0    📌 0

This is a really nice alternative to (or even a better service than) _akhaliq on X for following research papers, nice!

23.01.2025 12:48 | 👍 1    🔁 0    💬 0    📌 0

Excited to share that today our paper recommender platform www.scholar-inbox.com has reached 20k users! We hope to reach 100k by the end of the year. Lots of new features are currently in the works and will be rolled out soon.

15.01.2025 22:03 | 👍 190    🔁 26    💬 12    📌 8

@mweichart is following 20 prominent accounts