Papers #2-3: arxiv.org/abs/2402.10210 and arxiv.org/abs/2405.00675 from the incredible
@quanquangu.bsky.social. I really like how they explore new techniques for RLHF
@quanquangu.bsky.social
Professor @UCLA, Research Scientist @ByteDance | Recent work: SPIN, SPPO, DPLM 1/2, GPM, MARS | Opinions are my own
Pretraining will only end once we find the optimal scaling law.
14.12.2024 08:07
To better interpret the plot, draw a horizontal line representing a specific target validation loss. Find the points where this line intersects the curves for AdamW and MARS, which will allow you to determine how much speedup, in terms of training tokens, MARS achieves compared to AdamW.
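A minimal sketch of that horizontal-line reading, assuming each curve is available as paired lists of training tokens and validation loss. The function names and the toy numbers are hypothetical and are not taken from the MARS paper or its plots; linear interpolation between logged points is also an assumption.

```python
from typing import Optional, Sequence


def tokens_to_reach(
    tokens: Sequence[float], losses: Sequence[float], target: float
) -> Optional[float]:
    """Token count at which a loss curve first crosses `target`.

    Interpolates linearly between the two logged points that bracket the
    target. Returns None if the curve never reaches the target.
    """
    if losses[0] <= target:
        return float(tokens[0])
    for i in range(1, len(losses)):
        if losses[i] <= target:
            t0, t1 = tokens[i - 1], tokens[i]
            l0, l1 = losses[i - 1], losses[i]
            # l0 > target >= l1 here, so the denominator is positive.
            frac = (l0 - target) / (l0 - l1)
            return t0 + frac * (t1 - t0)
    return None


def token_speedup(
    tokens: Sequence[float],
    baseline_losses: Sequence[float],
    candidate_losses: Sequence[float],
    target: float,
) -> Optional[float]:
    """Ratio of tokens the baseline needs to tokens the candidate needs."""
    base = tokens_to_reach(tokens, baseline_losses, target)
    cand = tokens_to_reach(tokens, candidate_losses, target)
    if base is None or cand is None:
        return None
    return base / cand


# Toy curves (made-up values, in billions of tokens), for illustration only.
toy_tokens = [10, 20, 30, 40, 50]
toy_adamw = [4.0, 3.5, 3.25, 3.0, 2.75]
toy_mars = [4.0, 3.25, 3.0, 2.75, 2.5]

speedup = token_speedup(toy_tokens, toy_adamw, toy_mars, target=3.0)
print(f"token speedup at loss 3.0: {speedup:.2f}x")
```

A ratio above 1 means the second curve reaches the target loss with fewer training tokens. The result depends on the target chosen, so it is worth checking a few target losses rather than one.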
05.12.2024 02:54
Just added you.
03.12.2024 23:49
With the delivery of MARS complete, the focus now shifts to delivering new architectures.
Just added you! Welcome!
03.12.2024 01:17
Just added you.
02.12.2024 21:53
Just added you.
01.12.2024 00:35
Just added you!
30.11.2024 04:38
Just added you!
29.11.2024 22:47
Just added you.
29.11.2024 22:32
This Thanksgiving, I want to express my heartfelt gratitude to all the students, colleagues, and collaborators who have contributed to the success of SPIN, SPPO, DPLM, GPM, MARS, and many other projects. Your hard work and dedication continue to be truly inspiring.
29.11.2024 03:22
Just added you!
28.11.2024 19:29
Just added you!
28.11.2024 19:16
Just added you.
28.11.2024 19:14
Anyone using their real name and interested is welcome!
28.11.2024 02:44
Just added you. Welcome!
28.11.2024 01:48
MARS is a unified framework that can be integrated with various preconditioning techniques, so it can be applied to PSGD. I believe @hessianfree.bsky.social has implemented MARS-PSGD.
28.11.2024 01:48
Just added you!
28.11.2024 01:44
Just added you.
28.11.2024 01:43
Done!
28.11.2024 01:42
Just added you.
28.11.2024 01:42
Just added you!
28.11.2024 01:42
Just added you!
28.11.2024 01:42
Please reply to this message or DM me if you'd like to be added!
27.11.2024 20:48
Just added you!
27.11.2024 20:46
Have added both of you. Feel free to recommend other people.
27.11.2024 09:40
Tulu 3 SFT mix trending on HuggingFace :D, next step: make preference and RL datasets more accessible.
26.11.2024 16:57
OLMo 2 is out 🥳 7B and 13B trained on 5T tokens, and meticulously instruction tuned using the Tulu 3 recipe.
Simply the best fully open models yet.
Really proud of the work & the amazing team at
@ai2.bsky.social
Just added you there.
26.11.2024 21:08