Sneha Kudugunta @NeurIPS2024

@snehaark.bsky.social

tpu go brr @deep-mind.bsky.social @uwcse.bsky.social | varying proportions of AI and mediocre jokes (not mutually exclusive) | she/her/hers

362 Followers  |  301 Following  |  2 Posts  |  Joined: 12.11.2024  |  1.3343

Latest posts by snehaark.bsky.social on Bluesky

📢 Thrilled to introduce ATLAS 🗺️: the largest multilingual scaling study to date. We ran 774 exps (10M-8B params, 400+ languages) to answer:

🌍 Is scaling diff by lang?

🧙‍♂️ Can we model the curse of multilinguality?

⚖️ Pretrain vs finetune from checkpoint?

🔀 X-lingual transfer scores across langs?

1/🧵

28.10.2025 14:01 | 👍 17    🔁 1    💬 1    📌 1
MatFormer introduces nested structure into the Transformer's FFN block & jointly trains all the submodels, enabling free extraction of hundreds of accurate submodels for elastic inference

I will be at poster #2507 w/ my co-authors in East Exhibit Hall A-C at #NeurIPS2024 chatting about MatFormer and elastic models today at 4.30pm!

Come by, or reach out if you want to chat about pretraining, scaling laws or conditional computation!

arxiv.org/abs/2310.07707

11.12.2024 21:42 | 👍 8    🔁 0    💬 0    📌 0
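
For readers who want a concrete picture of the nested FFN idea mentioned in the MatFormer post, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the ReLU activation, the toy sizes, and the `widths` list are all hypothetical, and joint training of the submodels is not shown. The only point it makes is that a submodel using the first `m` hidden units shares its weights with every larger one, so extracting it amounts to slicing.

```python
import random


def relu(v):
    return v if v > 0.0 else 0.0


def nested_ffn(x, w_in, w_out, m):
    """Apply an FFN block using only the first m hidden units.

    w_in:  list of d_ff rows, each of length d_model (input projection).
    w_out: list of d_ff rows, each of length d_model (output projection,
           stored per hidden unit so that slicing rows selects a submodel).
    """
    d_model = len(x)
    y = [0.0] * d_model
    for j in range(m):
        # Activation of hidden unit j.
        h = relu(sum(w_in[j][i] * x[i] for i in range(d_model)))
        # Add this unit's contribution to the output.
        for i in range(d_model):
            y[i] += h * w_out[j][i]
    return y


def make_weights(d_model, d_ff, seed=0):
    # Random toy weights; a real model would learn these.
    rng = random.Random(seed)
    w_in = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(d_ff)]
    w_out = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(d_ff)]
    return w_in, w_out


d_model, d_ff = 4, 8
w_in, w_out = make_weights(d_model, d_ff)
x = [0.5, -1.0, 0.25, 2.0]

# Hypothetical nested widths: each submodel is a prefix of the next one.
widths = [2, 4, 8]
outputs = {m: nested_ffn(x, w_in, w_out, m) for m in widths}

# "Extraction": slicing the first 4 hidden units gives a standalone,
# smaller FFN that reproduces the width-4 submodel.
small_in, small_out = w_in[:4], w_out[:4]
extracted = nested_ffn(x, small_in, small_out, len(small_in))
```

In this sketch `extracted` matches `outputs[4]`, because the sliced block performs the same arithmetic over the same shared weights.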

Would love to be added!

11.12.2024 17:23 | 👍 3    🔁 0    💬 1    📌 0

@snehaark is following 19 prominent accounts