@johannesack.bsky.social
Reinforcement Learning PhD Student at the University of Tokyo. Previously: intern at Sakana AI and PFN; M.Sc./B.Sc. from TU Munich. johannesack.github.io

Plus, reviewers might look up your submission on arXiv and become biased against you based on affiliation.
04.10.2025 09:26

If you're from a famous lab it's clearly useful to put it on arXiv, but for less famous labs I'm not sure it's helpful.
You usually don't get that much visibility, and you risk your ideas getting stolen/"reinvented" afterwards.
Bravo!
18.09.2025 14:59

In our paper we provide more details, a theoretical analysis, and numerous ablations!
This was a very fun joint work with Takashi Ishida and Masashi Sugiyama!
Find our paper at arxiv.org/abs/2507.15507, our code at github.com/JohannesAck/... and swing by our poster at COLM in October!
Of course we also tested our approach for alignment of language models, both on the TL;DR summarization task and a variant of the Alpaca-Farm benchmark.
It results in a notable increase in performance across base models and tasks! (5/6)
By correcting the RM a few times during training, we can obtain a better final policy.
As illustrated in this 2D toy example, we can successively retrain the RM on the distribution of the current policy, allowing us to keep training for longer! (4/6)
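As a rough sketch of what that schedule could look like in code (my reading of this post, not the authors' implementation): `policy_step` and `refit_reward_model` are hypothetical callables standing in for one policy-optimization update and the off-policy corrected RM refit described in the (3/6) post below; only the control flow of "correct the RM a few times during training" comes from the thread.

```python
from typing import Callable, Sequence

def rlhf_with_periodic_rm_correction(
    policy_step: Callable[[], None],         # one PPO / policy-gradient update (user-supplied)
    refit_reward_model: Callable[[], None],  # off-policy corrected RM refit (user-supplied)
    num_steps: int = 10_000,
    correction_steps: Sequence[int] = (2_500, 5_000, 7_500),
) -> None:
    """Interleave a few RM corrections with ordinary policy training."""
    for step in range(1, num_steps + 1):
        policy_step()                 # keep optimizing the LM against the current RM
        if step in correction_steps:
            # Re-fit the RM on the original preference data, reweighted toward
            # the current policy's sample distribution, then resume training.
            refit_reward_model()
```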
We could simply sample new actions from the current policy and obtain human preference labels, but this is costly and slow.
Instead, we use importance weighting to train an off-policy corrected RM, without needing any additional samples or preference labels! (3/6)
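To make "importance weighting" concrete, here is a minimal PyTorch sketch of an off-policy corrected Bradley-Terry loss. This is my reading of the post, not the paper's exact scheme; in particular, how the per-pair weight is formed from the responses' log-probabilities, and the clipping used for stability, are assumptions of the sketch.

```python
import torch
import torch.nn.functional as F

def off_policy_corrected_bt_loss(
    r_chosen: torch.Tensor,    # RM scores of the preferred responses, shape (B,)
    r_rejected: torch.Tensor,  # RM scores of the dispreferred responses, shape (B,)
    logp_cur: torch.Tensor,    # log pi_current(y|x) per preference pair, shape (B,)
    logp_sft: torch.Tensor,    # log pi_SFT(y|x) per preference pair, shape (B,)
    clip_max: float = 10.0,    # cap on the importance weights (a common stabilizer)
) -> torch.Tensor:
    # Importance weights w = pi_current / pi_SFT, clipped and detached so they
    # only reweight the loss and receive no gradient themselves.
    weights = torch.exp(logp_cur - logp_sft).clamp(max=clip_max).detach()

    # Standard Bradley-Terry / logistic preference loss per pair.
    per_pair = -F.logsigmoid(r_chosen - r_rejected)

    # The reweighted average approximates the preference loss under the current
    # policy's sample distribution, using only the existing SFT-sampled pairs.
    return (weights * per_pair).mean()
```

Whether to clip, self-normalize, or otherwise stabilize the weights is the kind of detail worth taking from the paper itself rather than from this sketch.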
The reward model (RM) is trained on actions sampled from the SFT model.
As we keep training our LM, it deviates from the SFT policy and thus the RM becomes inaccurate, causing stagnation or overoptimization.
We can prevent this by off-policy correcting the RM! (2/6)
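The reason no new samples or labels are needed is the standard importance-sampling identity: an expectation under the current policy can be rewritten as a reweighted expectation under the SFT policy that generated the preference data. In my notation (not necessarily the paper's), with π_θ the current policy, π_SFT the SFT policy, and ℓ(x, y) the per-sample preference loss:

```latex
\mathbb{E}_{y \sim \pi_\theta(\cdot \mid x)}\bigl[\ell(x, y)\bigr]
  = \mathbb{E}_{y \sim \pi_{\mathrm{SFT}}(\cdot \mid x)}\!\left[
      \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{SFT}}(y \mid x)}\,\ell(x, y)
    \right]
```

This holds whenever π_SFT assigns positive probability wherever π_θ does, so the same SFT-sampled preference pairs can stand in for samples from the current policy.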
Reward models do not have the capacity to fully capture human preferences.
If they can't represent human preferences, how can we hope to use them to align a language model?
In our #COLM2025 paper "Off-Policy Corrected Reward Modeling for RLHF", we investigate this issue 🧵
The photo looks pretty good, I wish they had them in Tokyo!
01.05.2025 08:22

An element of feedback to the devs will go missing.
If the interface is really unergonomic but LLMs can figure it out, there won't be enough user complaints to lead to improvement.
Likewise for bad docs, if the LLM can just ingest the library's source code.