Gaurav Pandey

@gpandey1.bsky.social

Research Scientist @ IBM Research | Reinforcement Learning for LLMs

345 Followers  |  291 Following  |  1 Posts  |  Joined: 11.11.2024  |  1.438

Latest posts by gpandey1.bsky.social on Bluesky

Hear me out: What if the Chinese translations of mathematical problems present in English test sets (e.g. MATH) were not filtered from the pre-training corpora of Qwen and DeepSeek? This means the knowledge is there, just translated. This would also explain the language switching when RL-ing CoT 👇

10.02.2025 15:51 — 👍 7    🔁 2    💬 1    📌 0

Me

19.11.2024 03:18 — 👍 1    🔁 0    💬 0    📌 0

@gpandey1 is following 19 prominent accounts