
Yuchen Zhu

@zhuyuchen.bsky.social

Machine Learning PhD student @UCL. Interested in Causality and AI Safety. yuchen-zhu.github.io

40 Followers  |  67 Following  |  13 Posts  |  Joined: 21.11.2024

Latest posts by zhuyuchen.bsky.social on Bluesky

The proxy reward arising from this satisfies our conditions; the upcoming camera-ready version includes empirical results showing improvement when learning with this proxy reward.

n/n

01.05.2025 15:33 — 👍 1    🔁 0    💬 0    📌 0

Beyond the example given, many natural frameworks satisfy our conditions. For example, an increased temperature in a tempered softmax biases the learned reward function (see the sketch below).

9/n

01.05.2025 15:33 — 👍 1    🔁 0    💬 1    📌 0
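A minimal numerical sketch of that bias (my illustration under assumed specifics, not the paper's experiment): preference data is generated by a Bradley-Terry model whose logits are tempered by a temperature of 2, and a reward is then fit under the standard temperature-1 model. The fitted reward converges to true_reward / temperature, so every reward gap is systematically shrunk.

import numpy as np

rng = np.random.default_rng(0)
true_reward = np.array([0.0, 1.0, 2.0])  # rewards for three candidate responses
temperature = 2.0                        # tempering used when preferences were generated

def pref_prob(r_a, r_b, temp):
    # P(a preferred over b) under a tempered Bradley-Terry model
    return 1.0 / (1.0 + np.exp(-(r_a - r_b) / temp))

# Draw pairwise comparisons at temperature 2.
n = 200_000
a, b = rng.integers(0, 3, size=(2, n))
prefs = rng.random(n) < pref_prob(true_reward[a], true_reward[b], temperature)

# Fit rewards by gradient ascent on the temperature-1 Bradley-Terry log-likelihood.
r_hat = np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(r_hat[a] - r_hat[b])))
    grad = np.zeros(3)
    np.add.at(grad, a, prefs - p)   # d/dr_a of the log-likelihood
    np.add.at(grad, b, p - prefs)   # d/dr_b of the log-likelihood
    r_hat += 3.0 * grad / n

r_hat -= r_hat[0]  # rewards are identified only up to an additive constant
print(r_hat)       # approx. true_reward / temperature = [0, 0.5, 1.0], not [0, 1, 2]

Running this prints roughly [0, 0.5, 1.0] instead of [0, 1, 2]: the mis-specified temperature rescales the learned reward, a systematic bias (it changes the effective strength of any KL-regularized fine-tuning consuming the reward), not mere noise.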

4. If the expert judges two symptoms as similar, the trainee must also judge them as similar, up to a relaxation constant L; similarity is measured by a metric between the distributions the policies map to (a sketch follows below). Our sample-complexity improvement depends on L.

8/n

01.05.2025 15:33 — 👍 1    🔁 0    💬 1    📌 0
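One plausible reading of the relaxation condition above as a Lipschitz-type bound, for some metric d on output distributions (total variation, Wasserstein, or whatever the paper actually uses); here \tilde{\pi} denotes the trainee/proxy policy and \pi^* the expert/true policy, notation I am assuming rather than quoting:

    d\big(\tilde{\pi}(\cdot \mid s_1),\, \tilde{\pi}(\cdot \mid s_2)\big) \;\le\; L \cdot d\big(\pi^*(\cdot \mid s_1),\, \pi^*(\cdot \mid s_2)\big) \qquad \text{for all } s_1, s_2

Smaller L means the trainee distorts the expert's similarity structure less, consistent with the sample-complexity improvement depending on L.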

3. There exists a low-dimensional encoding of the image of the proxy policy satisfying some smoothness conditions (sketched below). Note that such an assumption is standard in most machine learning tasks.

7/n

01.05.2025 15:33 — 👍 2    🔁 0    💬 1    📌 0
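A hedged sketch of the encoding condition above; the encoder \phi and dimension k are illustrative names, and "smoothness" is read here as Lipschitz continuity, which may be stronger or weaker than the paper's precise assumption:

    \exists\, \phi : \mathrm{Im}(\tilde{\pi}) \to \mathbb{R}^k \text{ injective, with } k \text{ small and } \phi,\, \phi^{-1} \text{ Lipschitz on their domains}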

Crucially, it's not necessary that D1 = D2.
2. The proxy's image must contain that of the true policy (formalized below). This essentially means that every disease D diagnosable by the expert can also be diagnosed by the trainee, though the trainee may map the wrong symptom to a given D.

6/n

01.05.2025 15:33 — 👍 1    🔁 0    💬 1    📌 0
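The image-containment condition above admits a compact statement, using assumed notation for a policy's image:

    \mathrm{Im}(\pi^*) \subseteq \mathrm{Im}(\tilde{\pi}), \qquad \mathrm{Im}(\pi) := \{\, \pi(\cdot \mid s) : s \in \mathcal{S} \,\}

In the doctor analogy: every diagnosis distribution the expert can produce, the trainee can produce too, possibly in response to the wrong symptom.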

1. The proxy and true policies must share level sets: if we use trainee doctors as proxies for expert doctors, then whenever the expert judges two distinct symptoms S1, S2 to indicate the same disease D1, the trainee must also judge S1, S2 to indicate the same disease D2 (see the sketch below).

5/n

01.05.2025 15:33 — 👍 1    🔁 0    💬 1    📌 0
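A possible formalization of the shared-level-set condition above, in assumed notation (\pi^* for the expert/true policy, \tilde{\pi} for the trainee/proxy, \mathcal{S} the symptom space); the paper's exact statement may differ:

    \forall s_1, s_2 \in \mathcal{S}: \quad \pi^*(\cdot \mid s_1) = \pi^*(\cdot \mid s_2) \;\Longleftrightarrow\; \tilde{\pi}(\cdot \mid s_1) = \tilde{\pi}(\cdot \mid s_2)

The equivalence captures "sharing" level sets: both policies induce the same partition of symptoms, even though the diseases they assign to a given cell (D1 vs. D2) may differ.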

Our work is the first to provide a theoretical foundation for using cheap but noisy rewards in preference learning for large generative models.
What do our technical conditions essentially say?

4/n

01.05.2025 15:33 — 👍 1    🔁 0    💬 1    📌 0

Crucially, we prove that under our conditions the true policy is given by a low-dimensional adaptation of the proxy policy (sketched below). This yields a significant sample-complexity improvement, which we prove formally using PAC theory.

3/n

01.05.2025 15:33 — 👍 1    🔁 0    💬 1    📌 0
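A sketch of what "low-dimensional adaptation" could mean, read together with the conditions listed in the posts above; the factorization and the names g, \phi are illustrative, not the paper's exact construction:

    \pi^*(\cdot \mid s) = g\big(\phi(\tilde{\pi}(\cdot \mid s))\big), \qquad g : \mathbb{R}^k \to \mathrm{Im}(\pi^*)

Since only g has to be learned from preference data, and it lives on a k-dimensional space rather than the full policy space, a PAC-style argument can quantify the resulting sample-complexity gap.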

Our work discusses sufficient conditions under which proxy rewards can be used to improve the learning of the underlying true policy in preference learning algorithms.

2/n

01.05.2025 15:33 — 👍 1    🔁 0    💬 1    📌 0
When Can Proxies Improve the Sample Complexity of Preference Learning? We address the problem of reward hacking, where maximising a proxy reward does not necessarily increase the true reward. This is a key concern for Large Language Models (LLMs), as they are often fine-...

Arxiv version is already online:
arxiv.org/abs/2412.16475

01.05.2025 15:33 — 👍 1    🔁 0    💬 1    📌 0

New work! 💪🏻💥🤯 When Can Proxies Improve the Sample Complexity of Preference Learning? Our paper is accepted at
@icmlconf.bsky.social 2025. Fantastic joint work with @spectral.space, Zhengyan Shi, @meng-yue-yang.bsky.social, @neuralnoise.com, Matt Kusner, @alexdamour.bsky.social.
1/n

01.05.2025 15:33 — 👍 7    🔁 4    💬 1    📌 1

1. neurips.cc/virtual/2024...
2. neurips.cc/virtual/2024...

11.12.2024 02:28 — 👍 0    🔁 0    💬 0    📌 0

Come talk to me about Causal Abstraction and LLM Theory/Alignment! I'm at #NeurIPS2024 presenting
1. Structured Learning of Compositional Sequential Interventions (Thu 11am-2pm, West Ballroom A-D #5002)
2. Unsupervised Causal Abstraction
(Sunday, CRL workshop)

11.12.2024 02:26 — 👍 4    🔁 0    💬 1    📌 0
