One more day! One more day!
05.08.2025 01:45

@scottniekum.bsky.social
Associate prof @ UMass Amherst CICS. Alignment, reinforcement learning, imitation learning, and robotics.
If you haven't seen this DJ set they did a few years ago, it is worth a watch: www.youtube.com/watch?v=8NWH...
07.07.2025 23:34

Propose some socials for RLC! Research topics, affinity groups, niche interests, whatever comes to mind!
rl-conference.cc/call_for_soc...
Reminder that early registration for RLC closes on the 30th! Please register early to save yourself some money and help us get the word out.
27.05.2025 14:56

It's about time! www.nytimes.com/2025/03/05/t...
05.03.2025 13:05

Science will take a huge hit; academic stars will leave; university reputations will crumble; and homegrown talent will be even harder to find.
Asia and Europe will profit.
Epic unforced errors, all in the overly narrow pursuit of cutting costs.
Check out some of the exciting changes to the RLC reviewing process! We're always trying new things to perfect it.
26.01.2025 02:14

Fellow THOU fan here. Converge's Jane Doe will forever be my go-to paper-writing album though. Something about that record puts my fingers on automatic. Although this morning I'm ICMLing to this jazz-punk classic and highly recommend: youtu.be/rl4DnYwjjuU?...
21.01.2025 12:29

Yeah, but on the other hand I might be wrong and underestimating how many people even Yarvin might convince. I've previously been annoyed at the NYT platforming certain people for op-eds, and I'm having trouble reconciling that with my feelings on this. Maybe the interview at least pushes back slightly?
19.01.2025 21:23

Concretely, when I imagine the population that would fall for the Vance version, I assume that some subset would reject his ideas if they had heard Yarvin's version first, come up with a refutation, and learned how to recognize sneakier versions through their prior experience with the rougher one.
19.01.2025 20:15

I think it can be a mistake to platform bad actors for the reasons you mentioned. But once you have someone like a VP citing this stuff, it can be good to let a representative show off their own weakest ideas (and it may be even more effective when you are semi-neutral and let the listener do the thinking).
19.01.2025 20:00

The arguments were so wildly broken (governance shares no meaningful properties with a laptop!) that I hope it would be self-evident to many. But with versions of these ideas hitting the mainstream via Thiel, Vance, etc., there's no hiding from it, and I'd rather people see this unvarnished version.
19.01.2025 19:53

I still think it is good on balance that the interview happened. These are ideas that are now being pushed by more sophisticated, savvy, and powerful people than Yarvin. Hearing them in raw form helps to inoculate people against them before they get repackaged in sneakier, more palatable forms, IMO.
19.01.2025 18:26

First page of the paper "Influencing Humans to Conform to Preference Models for RLHF," by Hatgis-Kessell et al.
Our proposed method of influencing human preferences.
RLHF algorithms assume humans generate preferences according to normative models. We propose a new method for model alignment: influence humans to conform to these assumptions through interface design. Good news: it works!
#AI #MachineLearning #RLHF #Alignment (1/n)
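For context on the assumption the paper refers to: RLHF pipelines commonly adopt the Bradley-Terry model as the normative preference model, where the probability of a human preferring one trajectory segment over another is a logistic function of the difference in their returns. A minimal sketch (function name and the rationality parameter beta are illustrative, not from the paper):

```python
import math

def bradley_terry_pref_prob(return_a: float, return_b: float,
                            beta: float = 1.0) -> float:
    """P(segment A preferred over segment B) under the Bradley-Terry
    model: a logistic function of the return difference. beta controls
    how deterministic (rational) the simulated labeler is."""
    return 1.0 / (1.0 + math.exp(-beta * (return_a - return_b)))

# Equal returns give indifference; a higher-return segment is
# preferred with probability > 0.5.
print(bradley_terry_pref_prob(1.0, 1.0))  # 0.5
print(bradley_terry_pref_prob(2.0, 1.0))  # ≈ 0.731
```

The paper's point is that real humans often deviate from this model (e.g., weighting regret rather than return), so the interface is designed to nudge labelers toward the model the learning algorithm assumes.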
I'm quite excited about this and still a bit shocked that it works as well as it does. Imitation via distribution matching has always felt like a clunky, brittle way to command agents. Language + zero-shot RL is natural, instantaneous, and scales well, due to the unsupervised nature of RL Zero.
11.12.2024 11:42

If you're at NeurIPS, RLC is hosting an RL event from 8 till late at The Pearl on Dec. 11th. Join us, meet all the RL researchers, and spread the word!
10.12.2024 21:55

The call for papers for RLC is now up! Abstract deadline of 2/14, submission deadline of 2/21!
Please help us spread the word.
rl-conference.cc/callforpaper...
RLC will be held at the Univ. of Alberta, Edmonton, in 2025. I'm happy to say that we now have the conference's website out: rl-conference.cc/index.html
Looking forward to seeing you all there!
@rl-conference.bsky.social
#reinforcementlearning
Further details and a call for workshops will be posted soon. We hope to see you all in Amherst this August!
15.11.2023 15:16

RLC is organized by @yayitsAmyZhang (UT Austin), @GlenBerseth (MILA), @EugeneVinitsky (NYU), @ScottNiekum (UMass Amherst), Philip Thomas (UMass Amherst), and @BrunoSilvaUMass (UMass Amherst).
15.11.2023 15:16

We have a fantastic advisory board helping to guide us, including @PeterStone_TX, Satinder Singh, @EmmaBrunskill, @mlittmancs, @MannorShie, Michael Bowling, @svlevine, @ravi_iitm, @ShamKakade6, @BenjaminRosman, Marc Deisenroth, and Andrew Barto.
15.11.2023 15:16

How will @RL_Conference be different from other ML conferences? Besides focusing on RL, peer review will primarily evaluate the correctness and support of claims, rather than subjective perceptions of importance.
15.11.2023 15:15

Reinforcement learning as a field has grown significantly over the past 10 years but lacks a central archival venue. Other communities (CV, NLP, robotics) have benefited from having their own top-tier venues, and RL is past due for the same.
15.11.2023 15:15

Thrilled to announce the first annual Reinforcement Learning Conference @RL_Conference, which will be held at UMass Amherst August 9-12! RLC is the first strongly peer-reviewed RL venue with proceedings, and our call for papers is now available: rl-conference.cc.
15.11.2023 15:15