I’ll be at @neuripsconf.bsky.social all next week! Find me mostly at the @cohere.com booth / DM me to talk code / post-training / life at Cohere 🇨🇦
03.12.2024 15:17 — 👍 2 🔁 0 💬 0 📌 0
My PhD thesis "Modelling Cross-lingual Transfer For Semantic Parsing" is finally submitted! 🎉🎉🎉
31.01.2024 21:14 — 👍 2 🔁 1 💬 0 📌 0
TRAM is accepted to #ICLR2024 as a Spotlight! See you in Vienna 🇦🇹! Thanks to @nsaphra.bsky.social, Pradeep Dasigi, Hao Peng and @ai2.bsky.social
Vision experiments, more discussion, and visuals coming soon in the camera-ready!
16.01.2024 15:36 — 👍 1 🔁 0 💬 0 📌 1
Really excited about this one and had such a blast working with @siree.sh @abertsch.bsky.social @davidthewid.bsky.social @strubell.bsky.social! Please read our paper and reach out with any questions, we'd love to chat! See y'all in Singapore :)
12.10.2023 15:38 — 👍 8 🔁 3 💬 1 📌 0
TRAM is part of my internship project with Hao Peng and Pradeep Dasigi at Allen AI, with invaluable contributions from @nsaphra.bsky.social
11.10.2023 09:33 — 👍 0 🔁 0 💬 0 📌 0
TRAM also improves OOD epsilon-sharpness (where SAM has little effect), with a stronger correlation between ID and OOD sharpness. This suggests that SAM is only sharpness-aware within the training distribution.
11.10.2023 09:32 — 👍 0 🔁 0 💬 1 📌 0
TRAM is a SAM-style optimizer that replaces the fixed rho hyperparameter: instead of a constant perturbation radius, it adapts to the trust region in function space. TRAM strengthens the connection between task-specific performance and pre-trained structure for better zero-shot domain transfer and cross-lingual transfer.
11.10.2023 09:32 — 👍 1 🔁 0 💬 1 📌 0
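For readers unfamiliar with SAM-style optimizers, here is a minimal Python/PyTorch sketch of the generic two-step SAM update that the post refers to, with the perturbation radius exposed as a per-step value rho_t. This is only an illustration of the mechanism: the function name, the default radius, and the idea of passing rho_t in from some trust-region estimate are my assumptions, not the actual TRAM implementation (see the paper for that).

import torch

def sam_style_step(model, loss_fn, batch, base_opt, rho_t=0.05):
    # 1) Gradients at the current parameters.
    loss = loss_fn(model, batch)
    loss.backward()

    # 2) Move to the (approximate) worst-case point within radius rho_t.
    #    In plain SAM rho_t is a fixed hyperparameter; TRAM adapts it (per the post),
    #    but how it is computed is not shown here.
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho_t * p.grad / (grad_norm + 1e-12)
            p.add_(e)          # perturb parameters
            eps.append(e)

    # 3) Gradients at the perturbed point, then undo the perturbation.
    model.zero_grad()
    loss_fn(model, batch).backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)      # restore original parameters

    # 4) Update the original parameters using the perturbed-point gradients.
    base_opt.step()
    base_opt.zero_grad()
    return loss.item()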
🚨 new paper 🚨
Can we train for flat minima with less catastrophic OOD forgetting?
We propose Trust Region Aware Minimization for smoothness in parameters+representations.
TL;DR representations matter just as much!
arxiv.org/abs/2310.03646 w/
@nsaphra.bsky.social Pradeep Dasigi + Hao Peng
11.10.2023 09:31 — 👍 10 🔁 1 💬 1 📌 2
Senior Research Scientist at Google DeepMind. Equitable AI, language, gender, society. She/her.
🌐 jasmijn.bastings.me
Writer http://jalammar.github.io. O'Reilly Author http://LLM-book.com. LLM Builder Cohere.com.
Researcher at Cohere | Multilingual LLM evaluation
Postdoc in ML/NLP at the University of Edinburgh.
Interested in Bottlenecks in Neural Networks; Unargmaxable Outputs.
https://grv.unargmaxable.ai/
I hate slop and yet I work on generative models
PhD from UT Austin, applied scientist @ AWS
He/him • https://bostromk.net
Shaping the future of programming @tessl.io 🚀 | ex-@TwitterCortex @Birdwatch 💙 | PhD in probabilistic machine learning, loyal servant to a cat, collector of random variables, and lover of well-placed puns.
https://mgorinova.github.io/
Senior Research Scientist at Google DeepMind, working on Gemini.
PhD from University of Edinburgh.
ibalazevic.github.io
PhD @ King’s College London • prev CambridgeNLP, TU Wien, intern GoogleDeepmind • NLP, Data-centric ML, Multimodality
http://mubasharaakhtar.com
PhD student in NLP at the University of Edinburgh, working on online abuse detection 👩🏻💻 | ex Intern @MetaAI @Snap | Intersectional feminist 🌻 | (she/her)
PhD Student at Mila and McGill | Research in ML and NLP | Past: AI2, MSFTResearch
arkilpatel.github.io
Postdoctoral researcher at the Institute for Logic, Language and Computation at the University of Amsterdam.
Previously PhD Student at NLPNorth at the IT University of Copenhagen, with internships at AWS, Parameter Lab, Pacmed.
dennisulmer.eu
I lead Cohere For AI. Formerly Research at Google Brain. ML Efficiency, LLMs, @trustworthy_ml.
We build secure, scalable, and private enterprise-grade AI technology to solve real-world business problems. Join us: http://cohere.com/careers
San Diego, Dec 2-7, 2025, and Mexico City, Nov 30-Dec 5, 2025. Comments to this account are not monitored. Please send feedback to townhall@neurips.cc.
Getting paid to complain about LLM Evaluation at Cohere. #NLP #NLProc
https://dennis-aumiller.de
Senior Research Engineer with the Common Crawl Foundation.
(languages ∪ tech) in Dùn Èideann
Postdoc at Mila & McGill University 🇨🇦 with a PhD in NLP from the University of Edinburgh 🏴 memorization vs generalization x (non-)compositionality. she/her 👩💻 🇳🇱
NLP @ Cohere. Prev University of Edinburgh