I'm imagining a simpler setup where each word is a single token and each example is a random list of 15 words. If pretrained models already encode the notion of offensiveness, I bet one iteration of DPO with the right hyperparameter can solve this task.
22.11.2024 19:41
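
For reference, here is a rough sketch of the standard per-pair DPO loss that such an experiment would optimize. The function name `dpo_loss` and the default `beta=0.1` are illustrative assumptions, not something from the post, and the inputs are assumed to be summed log-probabilities of a whole 15-word example under the policy and under a frozen reference model. The post does not say which hyperparameter it means; `beta` is only one candidate.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio))).

    All arguments are sequence-level log-probabilities (floats).
    `beta` (assumed default 0.1) scales how strongly the policy is
    pushed away from the reference model.
    """
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow for large |m|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy equals the reference the margin is zero and the loss is log 2; it drops below that as the policy shifts probability toward the chosen example relative to the rejected one.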