Christina

@christinabaek.bsky.social

PhD at CMU / robust ML

107 Followers  |  35 Following  |  1 Posts  |  Joined: 19.11.2024

Posts by Christina (@christinabaek.bsky.social)

I'm imagining a simpler setup where words are each a single token long and examples are each a random list of 15 words. If pretrained models already encode the notion of offensive, I bet one iteration of DPO with the right hyperparameter can solve this task.

22.11.2024 19:41  |  👍 1  |  🔁 0  |  💬 1  |  📌 0