Is this #1 in your Spotify Wrapped?
26.11.2024 01:14
thanks for featuring this work!
19.11.2024 02:04
Aioli: A Unified Optimization Framework for Language Model Data Mixing
Language model performance depends on identifying the optimal mixture of data groups to train on (e.g., law, code, math). Prior work has proposed a diverse set of methods to efficiently learn mixture ...
In joint work with @MayeeChen @NickLourie @kchonyc @HazyResearch, we use our optimization framework to analyze failures of existing methods. We then turn these insights into:
Aioli 🧄, a fully-online data mixing algorithm!
paper: arxiv.org/abs/2411.05735
code: github.com/HazyResearch...
12.11.2024 17:04
So you want a good pretraining data mix 🧑‍🍳, but which data mixing algorithm do you pick? DoGE, DoReMi, Skill-it, grid searching proportions… 😵‍💫
It turns out that these algorithms are all special cases of Linear Mixing Optimization, our new data mixing framework! 🧵
12.11.2024 17:04
metropolis-hastings:
1️⃣ sample from your proposal function
2️⃣ run the sample through your filter, proportional to the desired pdf
3️⃣ use the kept samples to initialize the next round
i wonder if we can connect iterative approaches to synthetic data as making specific choices in an MCMC framework...
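The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a symmetric Gaussian random-walk proposal and a one-dimensional target; the function names (`metropolis_hastings`, `log_standard_normal`) and parameters are hypothetical and not taken from the post. The "filter" here is the standard Metropolis acceptance rule for symmetric proposals: keep the candidate with probability min(1, p(candidate) / p(current)).

```python
import math
import random


def metropolis_hastings(log_target, x0, n_steps, step_size=1.0, seed=0):
    """Random-walk Metropolis-Hastings for a 1-D unnormalized log-density."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        # Step 1: sample a candidate from the proposal (symmetric Gaussian, an assumption).
        candidate = x + rng.gauss(0.0, step_size)
        # Step 2: filter the candidate using the ratio of target densities.
        log_ratio = log_target(candidate) - log_target(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = candidate
        # Step 3: the current state (kept or carried over) seeds the next round.
        samples.append(x)
    return samples


def log_standard_normal(x):
    # Unnormalized log-density of N(0, 1); the normalizing constant cancels in the ratio.
    return -0.5 * x * x


samples = metropolis_hastings(log_standard_normal, x0=0.0, n_steps=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

With enough steps, `mean` and `var` should land near 0 and 1 for this target. Note that a rejected candidate still contributes a sample: the chain repeats its current state, which is what keeps the stationary distribution correct.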
10.11.2024 02:24
PhD student at NYU, working on NLP.
https://timchen0618.github.io
Neuro + AI Research Scientist at DeepMind; Affiliate Professor at Columbia Center for Theoretical Neuroscience.
Likes studying learning+memory, hippocampi, and other things brains have and do, too.
she/her.
prev: @BrownUniversity, @uwcse/@uw_wail phd, ex-@cruise, RS @waymo. 0.1x engineer, 10x friend.
spondyloarthritis, cars ruin cities, open source
Official account of the NYU Center for Data Science, the home of the Undergraduate, Master's, and Ph.D. programs in data science. cds.nyu.edu
https://mega002.github.io
CS PhD student at Princeton. https://www.cs.princeton.edu/~smalladi/index.html
Researcher in NLP, ML, computer music. Prof @uwcse @uwnlp & helper @allen_ai @ai2_allennlp & familiar to two cats. Single reeds, tango, swim, run, cocktails, מאַמע־לשון, GenX. Opinions not your business.
ESM3 || Princeton PhD || MIT BS/MEng || former ai resident @google, intern @nvidia || Bay Area native
Rice University, Associate Professor of Computer Science. Computer Vision, Multimodal AI, Deep Learning. Houston, Texas. Check our work at https://vislang.ai/
training olmos at Ai2, prev at Apple, Penn… dabbler of things, enjoyer of cats and mountains. he/him
Research Scientist at FAIR (Meta), PhD from MIT
NLP, Linguistics, Cognitive Science, AI, ML, etc.
Job currently: Research Scientist (NYC)
Job formerly: NYU Linguistics, MSU Linguistics
NLP ❤️ | PhD @ CMU, LTI | Prev. Google Research, Microsoft Research | https://simran-khanuja.github.io/
Intern @Google, Ph.D. Student @Cornell_CS.
Interested in machine learning, LLM, brain, and healthcare.
abehrouz.github.io
a mediocre combination of a mediocre AI scientist, a mediocre physicist, a mediocre chemist, a mediocre manager and a mediocre professor.
see more at https://kyunghyuncho.me/
Ph.D. student at @jhuclsp, human LM that hallucinates. Formerly @MetaAI, @uwnlp, and @AWS they/them 🏳️‍🌈 #NLProc #NLP Crossposting on X.