AISTATS 2026 will be in Morocco!
30.07.2025 08:07
@soufianehayou.bsky.social
Asst Professor at Johns Hopkins (AMS and DSAI). Previously: Simons Institute, Oxford stats, Polytechnique. I like to scale up things! https://www.soufianehayou.com/
Shoutout to my collaborators Nikhil Ghosh and Bin Yu for their help with this project.
30.06.2025 21:26
For more theoretical and empirical results, check out our paper:
Paper: arxiv.org/abs/2506.20629
Code: github.com/soufiane001/...
✅ PLoP consistently outperforms other placement strategies (Attn, MLP)
✅ Works across different post-training scenarios: supervised fine-tuning, reinforcement learning
✅ Minimal computational overhead
In the worst case, it ties with the best manual approach. Usually, it's better.
NFN measures the alignment between each module (in the pretrained model) and the finetuning task. It is a cheap metric that can be computed in a single forward pass. It is based on a large-width analysis of module-data alignment and is well suited for LoRA finetuning.
30.06.2025 21:26
Our solution: PLoP (Precise LoRA Placement) 🎯
Instead of guessing, it automatically identifies the optimal modules for LoRA placement based on a notion of module-data alignment that we call NFN (Normalised Feature Norms).
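For intuition, here is a minimal sketch (not the paper's exact formula) of how a per-module feature-norm statistic could be collected in a single forward pass with PyTorch hooks. The normalisation below, an output-to-input norm ratio averaged over a batch of finetuning data, is an illustrative assumption; the real NFN definition and the PLoP selection rule are in the paper.

```python
# Illustrative sketch only: the exact NFN definition and the PLoP selection
# rule are in the paper (arxiv.org/abs/2506.20629). Here we assume a simple
# output-to-input feature-norm ratio per linear module, gathered in one
# forward pass over a batch of finetuning data.
import torch
from collections import defaultdict

def feature_norm_stats(model, batch):
    stats = defaultdict(list)
    hooks = []

    def make_hook(name):
        def hook(module, inputs, output):
            x = inputs[0].detach().float()
            y = output.detach().float()
            # Per-token norm ratio, averaged over the batch: a stand-in
            # proxy for module-data alignment, not the paper's formula.
            ratio = y.norm(dim=-1) / (x.norm(dim=-1) + 1e-8)
            stats[name].append(ratio.mean().item())
        return hook

    for name, module in model.named_modules():
        if isinstance(module, torch.nn.Linear):
            hooks.append(module.register_forward_hook(make_hook(name)))

    with torch.no_grad():
        model(**batch)  # single forward pass, no gradients needed

    for h in hooks:
        h.remove()
    return {name: sum(v) / len(v) for name, v in stats.items()}
```

The resulting per-module scores would then be ranked to decide which modules receive adapters; the precise criterion PLoP uses is detailed in the paper.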
❌ Original LoRA paper: "Prioritize attention"
❌ Other papers: "Actually, put them in MLP"
❌ Everyone: just guessing and trying common target modules
LoRA is amazing for finetuning large models cheaply, but WHERE you place the adapters makes a huge difference. Most people are just guessing where to put them (Attention, MLP, etc.).
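For context, this is roughly what that manual guesswork looks like with the Hugging Face peft library; the model and module names below assume a LLaMA-style architecture and are illustrative, not prescribed by the thread.

```python
# The usual hand-picked placements (module names assume a LLaMA-style
# architecture; adjust for your model). Choosing between these by trial
# and error is the guesswork the thread is about.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

attn_only = LoraConfig(r=16, lora_alpha=32,
                       target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])
mlp_only = LoraConfig(r=16, lora_alpha=32,
                      target_modules=["gate_proj", "up_proj", "down_proj"])

# Pick one placement and wrap the model with LoRA adapters.
peft_model = get_peft_model(model, attn_only)
peft_model.print_trainable_parameters()
```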
Meet "PLoP" (Precise LoRA Placement) π―, our new method for automatic LoRA placement π§΅
The recent surge in available Research Scientist positions is correlated with the growing need for innovative approaches to scale and improve Large Language Models (LLMs). This trend is also driven by factors such as researchers leaving established companies for startups!
19.01.2025 01:51
By far, the best intro song in the history of humankind
www.youtube.com/watch?v=zNPX...
Are we hitting a wall with AI scaling? 🤔
That "plateau" you're seeing in scaling law charts might not be a fundamental limit, but a sign of suboptimal scaling strategies! I wrote a blogpost about this:
www.soufianehayou.com/blog/plateau...
Speculative sampling accelerates inference in LLMs by drafting future tokens which are verified in parallel. With @vdebortoli.bsky.social , A. Galashov & @arthurgretton.bsky.social , we extend this approach to (continuous-space) diffusion models: arxiv.org/abs/2501.05370
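For readers unfamiliar with the LLM version, here is a minimal sketch of the standard draft-and-verify step; the `target` and `draft` callables returning next-token logits are assumptions for illustration, and the paper's continuous-space diffusion extension is not reproduced here.

```python
# Minimal sketch of token-level speculative sampling (the LLM setting the
# post refers to). `draft` and `target` are assumed to be callables mapping
# a token tensor of shape (1, seq_len) to logits of shape (1, seq_len, vocab).
import torch

def speculative_step(target, draft, prefix, k=4):
    drafted = []
    x = prefix
    for _ in range(k):  # cheap sequential drafting with the small model
        q = torch.softmax(draft(x)[:, -1], dim=-1)
        t = torch.multinomial(q, 1)
        drafted.append((t, q))
        x = torch.cat([x, t], dim=1)

    # One parallel pass of the big model scores all drafted positions at once.
    p_all = torch.softmax(target(x)[:, -k - 1:-1], dim=-1)

    out = prefix
    for i, (t, q) in enumerate(drafted):
        p = p_all[:, i]
        # Standard accept/reject rule: accept with probability min(1, p/q).
        if torch.rand(()) < (p[0, t] / q[0, t]).clamp(max=1.0):
            out = torch.cat([out, t], dim=1)
        else:
            # On rejection, resample from the residual (p - q)+ and stop.
            resid = (p - q).clamp(min=0.0)
            resid = resid / resid.sum()
            out = torch.cat([out, torch.multinomial(resid, 1)], dim=1)
            break
    # (A full implementation also samples one extra token from `target`
    # when all k drafts are accepted; omitted for brevity.)
    return out
```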
10.01.2025 16:30
People compare AI to past historic breakthroughs (industrial revolution, internet, etc.), but there's a crucial difference: in previous advancements, humans remained the most intelligent beings. This time, we're creating something that could surpass us 🤖. It's a singularity! ⚡️
29.12.2024 22:27