🚨 Our paper "Mixture of Expert Graph Transformer for Particle Collision Detection" is now published in Scientific Reports! link.springer.com/article/10.103…
@sscardapane.bsky.social , @alessiodevoto.bsky.social , @sgiagu.bsky.social
@alessiodevoto.bsky.social
PhD in ML/AI | Researching Efficient ML/AI (vision & language) & Interpretability | @SapienzaRoma @EdinburghNLP | https://alessiodevoto.github.io/ | Deep Learning intern @NVIDIA
💡 We compare prompting (zero- and multi-shot + explanations) and inference-time interventions (ActAdd, ReFT and SAEs).
Following SpARE (@yuzhaouoe.bsky.social @alessiodevoto.bsky.social), we propose ✨ contrastive SAE steering ✨ with mutual information to personalize literary MT by tuning latent features 4/
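For readers unfamiliar with these interventions, here is a minimal numpy sketch of ActAdd-style activation addition (the `steer` name, the `alpha` scale and the toy tensors are invented for illustration); SAE-based steering would instead derive the direction from a chosen sparse-autoencoder latent:

```python
import numpy as np

def steer(hidden, direction, alpha=4.0):
    # Shift every hidden state along a unit-normalized steering direction.
    d = direction / np.linalg.norm(direction)
    return hidden + alpha * d

h = np.zeros((2, 3))                 # stand-in for two tokens' hidden states
v = np.array([0.0, 2.0, 0.0])        # stand-in for a contrast-derived direction
print(steer(h, v, alpha=4.0))        # each row becomes [0. 4. 0.]
```

In contrastive steering, `v` would come from contrasting activations on positive vs. negative prompts rather than being hand-set.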
MMLU-Redux Poster at NAACL 2025
MMLU-Redux just touched down at #NAACL2025!
Wish I could be there for our "Are We Done with MMLU?" poster today (9:00-10:30am in Hall 3, Poster Session 7), but visa drama said nope.
If anyone's swinging by, give our research some love! Hit me up if you check it out!
My amazing collaborators will present several works at ICLR and NAACL later this month -- please catch up with them if you're attending! I tried to summarise our recent work in a blog post: neuralnoise.com/2025/march-r...
19.04.2025 08:15
Please share it within your circles! edin.ac/3DDQK1o
13.03.2025 11:59
New Paper Alert!
We introduce Q-Filters, a training-free method for efficient KV Cache compression!
It is compatible with FlashAttention and can compress the cache during generation, which is particularly useful for reasoning models ⚡
TLDR: we make Streaming-LLM smarter using the geometry of attention
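As a rough intuition for geometry-based KV-cache compression, here is a simplified toy, not the actual Q-Filters algorithm (`compress_kv` and its scoring rule are invented for illustration): score each cached key against an average query direction and drop the lowest-scoring entries.

```python
import numpy as np

def compress_kv(keys, values, queries, keep=0.5):
    # Score each cached key by its projection on the mean query direction;
    # keep only the top fraction, preserving the original token order.
    q_dir = queries.mean(axis=0)
    q_dir /= np.linalg.norm(q_dir)
    scores = keys @ q_dir
    n_keep = max(1, int(keep * len(keys)))
    idx = np.sort(np.argsort(scores)[-n_keep:])
    return keys[idx], values[idx]

rng = np.random.default_rng(0)
K = rng.normal(size=(10, 4))         # 10 cached keys, head dim 4
V = rng.normal(size=(10, 4))
Q = rng.normal(size=(3, 4))          # recent queries
K_small, V_small = compress_kv(K, V, Q, keep=0.5)
print(K_small.shape, V_small.shape)  # (5, 4) (5, 4)
```

A training-free method like this never touches model weights, which is why it can be stacked on top of FlashAttention-style kernels.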
Live from the CoLoRAI workshop at AAAI
(april-tools.github.io/colorai/)
Nadav Cohen is now giving his talk on "What Makes Data Suitable for Deep Learning?"
Tools from quantum physics are shown to be useful in building more expressive deep learning models by changing the data distribution.
Sanity Checks for Saliency Maps
Julius Adebayo, Justin Gilmer, Michael Muelly, Ian Goodfellow, Moritz Hardt, Been Kim
Saliency methods have emerged as a popular tool to highlight features in an input deemed relevant for the prediction of a learned model. Several saliency methods have been proposed, often guided by visual appeal on image data. In this work, we propose an actionable methodology to evaluate what kinds of explanations a given method can and cannot provide. We find that reliance, solely, on visual assessment can be misleading. Through extensive experiments we show that some existing saliency methods are independent both of the model and of the data generating process. Consequently, methods that fail the proposed tests are inadequate for tasks that are sensitive to either data or model, such as finding outliers in the data, explaining the relationship between inputs and outputs that the model learned, and debugging the model. We interpret our findings through an analogy with edge detection in images, a technique that requires neither training data nor model. Theory in the case of a linear model and a single-layer convolutional neural network supports our experimental findings.
Sparse Autoencoders Can Interpret Randomly Initialized Transformers
Thomas Heap, Tim Lawson, Lucy Farnik, Laurence Aitchison
Sparse autoencoders (SAEs) are an increasingly popular technique for interpreting the internal representations of transformers. In this paper, we apply SAEs to 'interpret' random transformers, i.e., transformers where the parameters are sampled IID from a Gaussian rather than trained on text data. We find that random and trained transformers produce similarly interpretable SAE latents, and we confirm this finding quantitatively using an open-source auto-interpretability pipeline. Further, we find that SAE quality metrics are broadly similar for random and trained transformers. We find that these results hold across model sizes and layers. We discuss a number of interesting questions that this work raises for the use of SAEs and auto-interpretability in the context of mechanistic interpretability.
2018: Saliency maps give plausible interpretations of random weights, triggering skepticism and catalyzing the mechinterp cultural movement, which now advocates for SAEs.
2025: SAEs give plausible interpretations of random weights, triggering skepticism and ...
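For context, the SAE recipe under debate is simple: an overcomplete ReLU encoder producing sparse latents plus a linear decoder that reconstructs the activation. A minimal untrained sketch (all names and sizes here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_sae = 8, 32                 # SAEs are overcomplete: d_sae >> d_model
W_enc = rng.normal(scale=0.1, size=(d_model, d_sae))
W_dec = rng.normal(scale=0.1, size=(d_sae, d_model))
b_enc = np.zeros(d_sae)

def sae_forward(x):
    # ReLU encoder -> sparse latents z; linear decoder -> reconstruction.
    # Training would minimize ||x - x_hat||^2 plus an L1 penalty on z.
    z = np.maximum(0.0, x @ W_enc + b_enc)
    return z, z @ W_dec

x = rng.normal(size=(4, d_model))      # stand-in for 4 residual-stream vectors
z, x_hat = sae_forward(x)
print(z.shape, x_hat.shape)            # (4, 32) (4, 8)
```

Note that nothing in this recipe depends on the underlying transformer being trained, which is exactly why the random-weights result above is worth taking seriously.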
Graphical tensor notation for interpretability
www.lesswrong.com/posts/BQKKQi...
Introducing The AI CUDA Engineer: An agentic AI system that automates the production of highly optimized CUDA kernels.
sakana.ai/ai-cuda-engi...
The AI CUDA Engineer can produce highly optimized CUDA kernels, reaching 10-100x speedup over common machine learning operations in PyTorch.
It's 2025, and I've finally updated my Python setup guide to use uv + venv instead of conda + pip!
Here's my go-to recommendation for uv + venv in Python projects, for faster installs and better dependency management: github.com/rasbt/LLMs-f...
(Any additional suggestions?)
A Geometric Framework for Understanding Memorization in Generative Models : arxiv.org/abs/2411.00113
05.02.2025 16:11
Cool research on how models memorize data: The 'Manifold Memorization Hypothesis' by Brendan Ross, Hamidreza Kamkari et al. suggests memorization occurs when the model's learned manifold matches the true data manifold but with too small a 'local intrinsic dimensionality'.
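'Local intrinsic dimensionality' can be estimated from nearest-neighbour distances, e.g. with the classic Levina-Bickel MLE; a toy numpy sketch (illustrative only, not the estimator used in the paper):

```python
import numpy as np

def lid_mle(data, x, k=10):
    # Levina-Bickel MLE of local intrinsic dimensionality at point x:
    # LID ≈ (k-1) / sum_j log(T_k / T_j), with T_j the j-th NN distance.
    d = np.linalg.norm(data - x, axis=1)
    d = np.sort(d)[1:k + 1]            # k nearest neighbours (skip x itself)
    return (k - 1) / np.sum(np.log(d[-1] / d[:-1]))

rng = np.random.default_rng(0)
t = rng.uniform(size=(500, 1))
line = np.hstack([t, 2 * t, -t])       # a 1-D curve embedded in 3-D space
print(f"estimated LID ~ {lid_mle(line, line[0], k=20):.2f}")  # close to 1
```

Under the hypothesis, memorized points are ones where the model's manifold has a much smaller local dimensionality than the data manifold at that point.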
05.02.2025 16:11
The super weight in LLMs: arxiv.org/abs/2411.07191
Massive Activations in LLMs: arxiv.org/abs/2402.17762
Massive activations & weights in LLMs, two cool works:
- The Super Weight: finds performance can be totally degraded when pruning a *single* weight - Mengxia Yu et al.
- Massive Activations in LLMs: finds some (crucial) activations have very high norm irrespective of context - Mingjie Sun et al.
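A quick way to spot such outliers yourself is to threshold activation magnitudes against the median; a toy sketch with made-up data and an arbitrary threshold (the papers use their own criteria):

```python
import numpy as np

def find_massive_activations(hidden, k=100.0):
    # Flag entries whose magnitude exceeds k times the median magnitude.
    mags = np.abs(hidden)
    idx = np.where(mags > k * np.median(mags))
    return [(int(i), int(j)) for i, j in zip(*idx)]

rng = np.random.default_rng(0)
h = rng.normal(size=(8, 16))            # stand-in for one layer's hidden states
h[3, 5] = 500.0                         # plant a "massive" activation
print(find_massive_activations(h))      # [(3, 5)]
```

On a real LLM you would run this on hidden states captured with a forward hook, layer by layer.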
On the last day before the Spring Festival holiday in China, DeepSeek released a NEW work on @hf.co 🤯
Janus-Pro 🔥 an autoregressive framework that unifies multimodal understanding and generation
huggingface.co/deepseek-ai/...
✨ 1B / 7B
✨ MIT License
Not only that, but much of the science community here is already stronger and larger than it was on X.
On Twitter, my feed of scientists who study climate-related topics topped out at 3300. Here, we're at 4500 already and it's still growing.
Pin here: bsky.app/profile/did:...
*MoE Graph Transformers for Interpretable Particle Collision Detection*
by @alessiodevoto.bsky.social @sgiagu.bsky.social et al.
We propose a MoE graph transformer for particle collision analysis, with many nice interpretability insights (e.g., expert specialization).
arxiv.org/abs/2501.03432
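The routing idea behind such MoE layers fits in a few lines; a toy numpy sketch with hard top-1 gating (the expert functions and gate weights below are invented for illustration):

```python
import numpy as np

def moe_layer(x, gate_w, experts):
    # Hard top-1 routing: the gate picks one expert per input.
    choice = (x @ gate_w).argmax(axis=1)
    out = np.stack([experts[c](t) for t, c in zip(x, choice)])
    return out, choice

experts = [lambda t: t + 1.0, lambda t: t * 2.0]   # two toy experts
gate_w = np.array([[1.0, -1.0],                    # routes on the sign of
                   [0.0,  0.0]])                   # the first feature
x = np.array([[ 1.0, 3.0],
              [-1.0, 3.0]])
out, choice = moe_layer(x, gate_w, experts)
print(choice)  # [0 1]
```

Logging `choice` per input is what enables the expert-specialization analyses mentioned above: you can inspect which physics signatures consistently activate which expert.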
deepseek GGUF just dropped, if you have 207GB disk/40GB RAM for the smallest version huggingface.co/collections/...
08.01.2025 00:20
Semantic Hub Hypothesis: arxiv.org/abs/2411.04986
Do Llamas Work in English: arxiv.org/abs/2402.10588
LLMs' inner representations 🔬
Llamas Work in English: LLMs default to English-based concept representations, regardless of input language @wendlerc.bsky.social et al.
Semantic Hub: Multimodal models create a single shared semantic space, structured by their primary language @zhaofengwu.bsky.social et al.
I'll get straight to the point.
We trained 2 new models. Like BERT, but modern. ModernBERT.
Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.
It's much faster, more accurate, longer context, and more useful. 🧵
Paper link: arxiv.org/abs/2312.101...
19.12.2024 16:34
In Vision & Audio transformers, not all tokens need the same compute resources! We propose "modular learners" to control compute at token-level granularity (MHA & MLP): hard tokens get more, easy ones get less!
w/ @sscardapane.bsky.social @neuralnoise.com @bartoszWojcik
Soon #AAAI25
Link 👇
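The token-level idea can be sketched as follows (a toy stand-in, not the paper's modules; a real implementation would gather tokens per branch instead of computing both paths for every token):

```python
import numpy as np

def adaptive_mlp(x, router_w, big, small, tau=0.0):
    # A router scores each token; "hard" tokens (score > tau) take the
    # expensive module, easy ones take the cheap module.
    hard = (x @ router_w) > tau
    return np.where(hard[:, None], big(x), small(x)), hard

big   = lambda t: t * 2.0     # stand-in for an expensive MLP
small = lambda t: t           # stand-in for a cheap (or skip) path
x = np.array([[ 1.0, 1.0],
              [-1.0, 1.0]])
out, hard = adaptive_mlp(x, np.array([1.0, 0.0]), big, small)
print(hard)  # [ True False]
```

The same gating pattern applies per-token to both the attention (MHA) and MLP sub-blocks, which is what gives the granularity mentioned above.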
Josh Tenenbaum on scaling up vs growing up and the path to human-like reasoning #NeurIPS2024
15.12.2024 18:14
*Adaptive Computation Modules: Granular Conditional Computation For Efficient Inference*
with @alessiodevoto.bsky.social @neuralnoise.com
Happy to share that our work on distilling efficient transformers with dynamic activation of modules was accepted at #AAAI2025. 🔥
arxiv.org/abs/2312.10193
*Sparse Crosscoders for Cross-Layer Features and Model Diffing*
by @colah.bsky.social @anthropic.com
Investigates stability & dynamics of "interpretable features" with cross-layer SAEs. Can also be used to investigate differences in fine-tuned models.
transformer-circuits.pub/2024/crossco...
Very cool work! Unfortunately, errors in the original dataset will propagate to all new languages.
We investigated the issue of existing errors in the original MMLU in
arxiv.org/abs/2406.04127
@aryopg.bsky.social @neuralnoise.com
Super cool work from Cohere for AI! However, this highlights a concern raised by our MMLU-Redux team (arxiv.org/abs/2406.04127): **error propagation to many languages**. Issues in MMLU (e.g., "rapid intervention to solve ebola") seem to persist in many languages. Let's solve the root cause first?
06.12.2024 09:38