Paper 🧵 (cross-posted at X): When does composition of diffusion models "work"? Intuitively, the reason dog+hat works and dog+horse doesn't has something to do with independence between the concepts being composed. The tricky part is to formalize exactly what this means. 1/
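For context, the usual recipe for composing two concepts is to add their scores relative to a shared base model (a product of experts). A minimal sketch of that recipe, with made-up Gaussian scores standing in for real models (s_base, s_a, s_b are toy stand-ins, not the paper's models):

```python
import numpy as np

# Toy product-of-experts composition: sample from p_base * (p_a/p_base) * (p_b/p_base)
# by summing the conditional scores' deviations from the base score.

def s_base(x, t):  # score of a toy unconditional model
    return -x / (1.0 + t)

def s_a(x, t):     # score of a toy model conditioned on concept A ("dog")
    return -(x - 1.0) / (1.0 + t)

def s_b(x, t):     # score of a toy model conditioned on concept B ("hat")
    return -(x + 1.0) / (1.0 + t)

def s_comp(x, t):
    # Composed score: base + sum of per-concept corrections.
    return s_base(x, t) + (s_a(x, t) - s_base(x, t)) + (s_b(x, t) - s_base(x, t))

# One step of (unadjusted) Langevin sampling with the composed score:
rng = np.random.default_rng(0)
x = rng.normal(size=3)
eps = 1e-2
x = x + eps * s_comp(x, t=0.1) + np.sqrt(2 * eps) * rng.normal(size=3)
print(x)
```

The paper's question is exactly when this kind of composition samples from the intended distribution, and when it fails.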
11.02.2025 05:59
Excited to share Soup-of-Experts, a new neural network architecture that, for any given task, can instantiate in a flash a small model that performs well on it.
Made with ❤️ at Apple
Thanks to my co-authors David Grangier, Angelos Katharopoulos, and Skyler Seto!
arxiv.org/abs/2502.01804
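A hedged sketch of the core idea as I read the abstract: keep a bank of expert parameter sets and collapse them into one small model with task-dependent mixture weights. All names, shapes, and the softmax routing below are illustrative, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
K, d_in, d_out = 4, 8, 2
expert_weights = rng.normal(size=(K, d_in, d_out))  # bank of expert parameters

def instantiate(task_embedding):
    # Softmax routing from a task descriptor to mixture coefficients
    # (hypothetical: one logit per expert for simplicity).
    alpha = np.exp(task_embedding - task_embedding.max())
    alpha /= alpha.sum()
    # The instantiated model is a single weight matrix: a "soup" of experts.
    return np.einsum("k,kio->io", alpha, expert_weights)

W = instantiate(rng.normal(size=K))  # one cheap einsum, no retraining
x = rng.normal(size=d_in)
print(x @ W)
```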
05.02.2025 09:32
Really proud of these two companion papers by our team at GDM:
1) Joint Learning of Energy-based Models and their Partition Function
arxiv.org/abs/2501.18528
2) Loss Functions and Operators Generated by f-Divergences
arxiv.org/abs/2501.18537
A thread.
31.01.2025 12:06
How do tokens evolve as they are processed by a deep Transformer?
With José A. Carrillo, @gabrielpeyre.bsky.social and @pierreablin.bsky.social, we tackle this in our new preprint: A Unified Perspective on the Dynamics of Deep Transformers arxiv.org/abs/2501.18322
ML and PDE lovers, check it out!
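For a flavor of the object being studied, here is a toy Euler simulation of the "tokens as interacting particles" picture: each residual-stream token moves under an attention-driven vector field. Random matrices, an assumed discretization, and no claim to match the paper's exact model:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, steps, dt = 16, 4, 50, 0.1
X = rng.normal(size=(n, d))        # token "positions" in the residual stream
Q, K, V = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

for _ in range(steps):             # Euler step: roughly one "layer" per step
    logits = (X @ Q) @ (X @ K).T   # pairwise attention scores
    A = np.exp(logits - logits.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)
    X = X + dt * A @ (X @ V)       # x_i <- x_i + dt * sum_j A_ij (V x_j)

# Tokens often cluster as depth grows; measure the mean pairwise distance.
print(np.linalg.norm(X[:, None] - X[None, :], axis=-1).mean())
```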
31.01.2025 16:56
Byte Pair Encoding is a tokenization method that starts with all characters as the initial tokens. It iteratively merges the most frequent adjacent pair of tokens in the text, adding each merged pair to the vocabulary as a new token until the vocabulary reaches a predefined size. The output is a sequence of tokens. https://buff.ly/42oG80f
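A minimal training loop matching this description (illustrative only; production tokenizers work on raw bytes and add pre-tokenization, word boundaries, etc.):

```python
from collections import Counter

def bpe_train(text, vocab_size):
    tokens = list(text)                      # initial tokens: characters
    vocab = set(tokens)
    merges = []
    while len(vocab) < vocab_size:
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]  # most frequent adjacent pair
        new_token = a + b
        merges.append((a, b))
        vocab.add(new_token)
        merged, i = [], 0
        while i < len(tokens):               # apply the merge left to right
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(new_token)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens, merges

tokens, merges = bpe_train("low lower lowest", vocab_size=15)
print(tokens)
```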
30.01.2025 06:00
We are opening post-doc positions at the intersection of AI, data science, and medicine:
• Large Language Models for French medical texts
• Evaluating digital medical devices: statistics and causal inference
29.01.2025 08:19
Mixtures of experts are all the rage when it comes to shipping low-latency LLMs.
Check out this awesome work by Samira et al. about scaling laws for mixtures of experts!
28.01.2025 10:15
One question that has always intrigued me is the role of different ways to increase a model's capacity: parameters, parallelizable compute, or sequential compute?
We explored this through the lens of MoEs:
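To make the knobs concrete, here is a minimal top-1 mixture-of-experts layer: total parameters grow with the number of experts, while per-token compute stays that of a single expert. Shapes and the routing scheme are illustrative, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d = 8, 16
experts = rng.normal(size=(n_experts, d, d)) / np.sqrt(d)  # expert weights
router = rng.normal(size=(d, n_experts)) / np.sqrt(d)      # routing matrix

def moe_layer(x):                      # x: (batch, d)
    logits = x @ router
    choice = logits.argmax(axis=1)     # top-1 routing: one expert per token
    gate = np.exp(logits - logits.max(axis=1, keepdims=True))
    gate /= gate.sum(axis=1, keepdims=True)
    out = np.empty_like(x)
    for i, e in enumerate(choice):     # each token runs through ONE expert
        out[i] = gate[i, e] * (x[i] @ experts[e])
    return out

x = rng.normal(size=(4, d))
print(moe_layer(x).shape)  # (4, 16): 8x the parameters, 1x per-token FLOPs
```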
28.01.2025 06:25
Thrilled to share the latest work from our team at @Apple, where we achieve interpretable and fine-grained control of LLMs and diffusion models via Activation Transport 🔥
📄 arxiv.org/abs/2410.23054
🛠️ github.com/apple/ml-act
0/9 🧵
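A much-simplified stand-in for the steering mechanic (not the paper's method; see the repo for that): estimate a coordinate-wise affine map between activations collected under two conditions, which is the closed-form optimal transport map between Gaussians, and apply it at inference time:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical activations gathered under two prompt conditions A and B.
src = rng.normal(loc=0.0, scale=1.0, size=(1000, 64))
tgt = rng.normal(loc=0.5, scale=2.0, size=(1000, 64))

mu_s, sd_s = src.mean(axis=0), src.std(axis=0)
mu_t, sd_t = tgt.mean(axis=0), tgt.std(axis=0)

def transport(h, strength=1.0):
    """Map activation h from the A-distribution toward the B-distribution."""
    mapped = mu_t + (sd_t / sd_s) * (h - mu_s)   # Gaussian OT map, per coordinate
    return (1 - strength) * h + strength * mapped  # strength in [0, 1]

h = src[0]
print(transport(h, strength=0.5)[:4])
```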
10.12.2024 13:09
The Apple Machine Learning Research (MLR) team in Paris has openings for both FTE roles and a short-term post-doc position to contribute to our team's research agenda. Researchers at Apple's MLR (led by Samy Bengio) target impactful publications in top-tier ML venues and OSS.
18.12.2024 17:05
Congratulations on these new models!!
22.11.2024 10:33
Does autoregressive pre-training work for vision? 🤔
Delighted to share AIMv2, a family of strong, scalable, and open vision encoders that excel at multimodal understanding, recognition, and grounding 🧵
paper: arxiv.org/abs/2411.14402
code: github.com/apple/ml-aim
HF: huggingface.co/collections/...
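A toy rendering of what "autoregressive pre-training for vision" means: flatten an image into a sequence of patches and learn to predict each patch from the preceding ones. Here a single linear map, a crude mean-pooled context, and an L2 loss stand in for the real architecture; the actual AIMv2 setup is in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n_patches, d = 16, 48                      # e.g., a 4x4 grid of flattened patches
patches = rng.normal(size=(n_patches, d))

W = np.zeros((d, d))                       # toy "model": one linear map
lr = 1e-2
for step in range(100):                    # tiny full-batch gradient descent
    grad = np.zeros_like(W)
    loss = 0.0
    for t in range(1, n_patches):
        ctx = patches[:t].mean(axis=0)     # crude causal context: mean of past
        err = ctx @ W - patches[t]         # next-patch prediction error
        loss += (err ** 2).mean()
        grad += np.outer(ctx, err)
    W -= lr * grad / (n_patches - 1)

print(loss / (n_patches - 1))              # average next-patch L2 loss
```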
22.11.2024 08:32
YouTube video by probabl: "Why the MinHashEncoder is great for boosted trees"
Great video explaining a clever vectorization for learning on strings and dirty categories:
the MinHashEncoder is fast, stateless, and excellent with tree-based learners.
It's in @skrub-data.bsky.social
youtu.be/ZMQrNFef8fg
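A minimal usage sketch, assuming a recent skrub installation (the exact input form and signature may differ across releases): each string's character n-grams are min-hashed into a fixed-size numeric vector, with no fitted vocabulary.

```python
import pandas as pd
from skrub import MinHashEncoder  # assumes skrub is installed

jobs = pd.DataFrame({"job": ["senior data scientist",
                             "Senior DataScientist ",
                             "accountant"]})

# Stateless: fit learns nothing, so unseen/"dirty" categories are handled,
# and the resulting numeric columns make good split candidates for trees.
enc = MinHashEncoder(n_components=8)
X = enc.fit_transform(jobs["job"])  # older versions may expect jobs[["job"]]
print(X.shape)                      # (3, 8)
```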
21.11.2024 10:12
ML Research @ Apple.
Understanding deep learning (generalization, calibration, diffusion, etc.).
preetum.nakkiran.org
Machine Learning Research @ Apple (opinions are my own)
Research Scientist at Meta | AI and neural interfaces | Interested in data augmentation, generative models, geometric DL, brain decoding, human pose, …
📍 Paris, France 🌐 cedricrommel.github.io
Machine learning
Google DeepMind
Paris
Apple  ML Research in Barcelona, prev OxCSML InfAtEd, part of MLinPL & polonium_org 🇵🇱, sometimes funny
Prof at EPFL
AI • Climbing
PhD student in ML at EPFL 🇨🇭 working with Martin Jaggi & François Fleuret. Previously Apple MLR (intern). https://mpagli.github.io/
Researcher at Criteo. Interested in Bandits, Privacy, Competitive Analysis, Reinforcement Learning.
https://hugorichard.github.io/
Software engineer at probabl, scikit-learn contributor.
Also at:
https://sigmoid.social/@ogrisel
https://github.com/ogrisel
Co-founder and CEO, Mistral AI
The AI-powered developer platform to build, scale, and deliver secure software.
Researcher trying to shape AI towards positive outcomes. ML & Ethics +birds. Generally trying to do the right thing. TIME 100 | TED speaker | Senate testimony provider | Navigating public life as a recluse.
Former: Google, Microsoft; Current: Hugging Face
We are a research team on artificial intelligence for automotive applications working toward assisted and autonomous driving.
--> https://valeoai.github.io/ <--
Research Scientist at Apple Machine Learning Research. Previously ServiceNow and Element AI in Montréal.
Research scientist @valeoai.bsky.social
Alumnus @sorbonne-universite.fr | ENS Paris Saclay (MVA) | Ecole Polytechnique (X2012)
eloiz.github.io
PhD Student in Machine Learning @ICepfl MLO, MSc/BSc from @ETH_en.
haeggee.github.io