@kawinethayarajh.bsky.social
Postdoc at Princeton PLI. Formerly PhD at Stanford CS. Working on behavioral machine learning. https://kawine.github.io/

Interesting. Is this because they have govt contracts and those would be jeopardized? Or uncertainty around whether those models will be banned for use in America, which adds a huge risk premium?
05.05.2025 16:51

relevant paper: arxiv.org/abs/2410.08847
19.12.2024 20:28

for all methods, it is better for the data to be on-policy and labelled as good/bad relative to the current state of the policy.
but ultimately this is a learning dynamics problem that transcends how the data is sampled
unpaired methods work the way we hope paired methods would, simultaneously increasing the relative prob of good outputs and decreasing the relative prob of bad outputs. this allows you to skip SFT entirely
19.12.2024 20:25

it's not really on vs. off-policy. in theory, paired methods should increase the prob of good outputs and decrease the prob of bad outputs. in practice, they decrease *both*. you need to do SFT beforehand so that you can pay this price and hope that relative to the base model, p(good|x) is still higher
19.12.2024 20:23

all paired preference methods suffer from this problem while also being more inflexible. unpaired preference methods are always the way to go IME
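For concreteness, here is a minimal sketch of the paired/unpaired contrast the thread describes. The names, shapes, and the β value are illustrative assumptions, and the unpaired loss simplifies KTO by dropping its KL reference point; this is not the paper's code.

```python
# Paired (DPO-style) vs. unpaired (KTO-style) preference losses.
# All names/shapes are illustrative; the KTO-style loss below omits
# the KL reference point used by the actual method.
import torch
import torch.nn.functional as F

beta = 0.1  # inverse-temperature hyperparameter (assumed value)

def paired_dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected):
    # Only the *margin* between chosen and rejected is constrained, so
    # log p(chosen) and log p(rejected) can both fall as long as the gap grows.
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -F.logsigmoid(margin).mean()

def unpaired_kto_style_loss(logp, ref_logp, is_good):
    # Each example is pushed up (good) or down (bad) relative to the reference
    # model on its own; no matched pair for the same prompt is needed.
    reward = beta * (logp - ref_logp)
    sign = is_good.float() * 2 - 1  # +1 for good outputs, -1 for bad ones
    return (1 - torch.sigmoid(sign * reward)).mean()
```

Note that the paired loss never constrains either log-prob individually, which is consistent with the observation above that both can drop in practice.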
19.12.2024 18:02

Source: old.reddit.com/r/LocalLLaMA...
26.11.2024 23:35

😀
26.11.2024 23:35

RLHF is not the only method for AI alignment. This post introduces modern algorithms like DPO, KTO, and DiscoPOP that offer simpler and more stable alternatives.
Evolution of Preference Optimization Techniques | Hippocampus's Garden
hippocampus-garden.com/preference_o...
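Of the methods the post names, DPO is the easiest to try off the shelf. Below is a hypothetical usage sketch with Hugging Face TRL; the trainer and config classes exist, but argument names shift between TRL versions, and the model and dataset here are placeholders rather than anything from the linked post.

```python
# Hypothetical DPO fine-tuning sketch with Hugging Face TRL.
# Argument names vary across TRL versions; model and dataset are placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# DPO expects paired data: a prompt plus a chosen and a rejected response.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

config = DPOConfig(output_dir="dpo-out", beta=0.1)  # beta = strength of the KL penalty
trainer = DPOTrainer(model=model, args=config, train_dataset=dataset,
                     processing_class=tokenizer)
trainer.train()
```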
I'm excited to kick off my Bluesky presence with wonderful news: Our paper "Reference-Based Metrics Are Biased Against Blind and Low-Vision Users' Image Description Preferences" won a Best Paper Award at the NLP for Positive Impact Workshop at EMNLP! Read it here: aclanthology.org/2024.nlp4pi-...
24.11.2024 18:39

nominating myself @kawinethayarajh.bsky.social
21.11.2024 19:02

These differences with the DPO version don't seem statistically significant?
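One quick way to check a gap like this is a paired bootstrap over per-example scores for two systems evaluated on the same prompts. The scores below are entirely hypothetical stand-ins, not the eval in question.

```python
# Paired-bootstrap sketch for testing whether a gap between two systems
# evaluated on the same prompts is significant. Scores are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
scores_a = rng.random(500)                          # per-prompt scores, system A
scores_b = scores_a + rng.normal(0.01, 0.2, 500)    # per-prompt scores, system B

diffs = scores_b - scores_a
boot = rng.choice(diffs, size=(10_000, diffs.size), replace=True).mean(axis=1)
low, high = np.percentile(boot, [2.5, 97.5])
print(f"mean diff = {diffs.mean():.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
# If the interval straddles 0, the difference isn't significant at the 5% level.
```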
21.11.2024 18:39

(almost) all good poetry has high perplexity. it's by design something an out-of-the-box llm would be bad at. alignment on one poet would actually help imo.
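For reference, here is a minimal sketch of measuring that perplexity; the model and text are placeholders, not choices from the thread.

```python
# Minimal perplexity measurement for a causal LM; model and text are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

text = "Do not go gentle into that good night"  # any line of verse
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # Passing labels=input_ids makes the model return the mean token NLL as .loss
    loss = model(**inputs, labels=inputs["input_ids"]).loss
print(f"perplexity = {torch.exp(loss).item():.1f}")
```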
21.11.2024 05:24

Moderately grumpy UToronto alumnus nominating myself
18.11.2024 23:19

Could I be added (recent alumnus)? Thank you!
18.11.2024 23:16

Everyone is fixated on replicating o1 when there would be way more utility in figuring out what makes Claude so special.
18.11.2024 18:44

Would love to be added, thanks!
17.11.2024 16:41