
Daniel Marczak

@dmarczak.bsky.social

mostly trying to merge models | phd student @ warsaw university of technology & ideas

54 Followers  |  86 Following  |  7 Posts  |  Joined: 18.11.2024

Latest posts by dmarczak.bsky.social on Bluesky

Check out the paper & code for all the details!
πŸ“ Paper: arxiv.org/abs/2502.04959
πŸ’» Code: github.com/danielm1405/...

Huge thanks to my amazing collaborators:
Simone Magistri, Sebastian Cygert, BartΕ‚omiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

10.02.2025 14:47 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

In summary: By using a uniform singular value spectrum πŸ“Š and task-specific subspaces 🎯, Iso-CTS achieves state-of-the-art performance across all settings! πŸ”₯

10.02.2025 14:47 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

πŸ” That’s why we propose replacing the least important components with task-specific vectors that are orthogonal to the common subspace.

This further enhances alignment 🎯, and the performance naturally improves! πŸ“ˆ
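The "orthogonal to the common subspace" step is a plain orthogonal projection. A minimal numpy sketch (the helper name and the toy setup are assumptions, not the paper's code):

```python
import numpy as np

def orthogonal_to_common(task_delta, U_common):
    # Remove the component of a task matrix that lies in the common
    # subspace spanned by the columns of U_common (assumed orthonormal).
    return task_delta - U_common @ (U_common.T @ task_delta)

rng = np.random.default_rng(0)
task_delta = rng.normal(size=(6, 4))
# common subspace: top-2 left singular vectors of some merged matrix
U_common = np.linalg.svd(rng.normal(size=(6, 4)), full_matrices=False)[0][:, :2]

resid = orthogonal_to_common(task_delta, U_common)
# the residual has no component left in the common subspace
print(np.allclose(U_common.T @ resid, 0))  # True
```

Directions built from these residuals can then fill the slots vacated by the least important common components.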

10.02.2025 14:47 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

This simple modification boosts task arithmetic by πŸ“ˆ 10-15% across all model merging scenarios, achieving state-of-the-art results in most cases! πŸ”₯

However, we found that the bottom components contribute very little to the final performance… πŸ“‰βš οΈ

10.02.2025 14:47 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Based on this, we propose an isotropic merging framework that:
πŸ“Š Flattens the singular value spectrum of task matrices
🎯 Enhances alignment between tasks
βš–οΈ Reduces the performance gap
Surprisingly, the best performance is achieved when the singular value spectrum is uniform! πŸš€
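Flattening the spectrum amounts to an SVD of the summed task vector with the singular values replaced by their mean. A minimal numpy sketch, assuming per-layer weight deltas as inputs (function name is ours; see the paper's repo for the real implementation):

```python
import numpy as np

def isotropic_merge(task_deltas):
    # Sum the per-task weight deltas (plain task arithmetic) ...
    merged = np.sum(task_deltas, axis=0)
    # ... then make the singular value spectrum uniform:
    # U diag(mean(S)) V^T = mean(S) * U V^T
    U, S, Vt = np.linalg.svd(merged, full_matrices=False)
    return S.mean() * (U @ Vt)

rng = np.random.default_rng(0)
deltas = [rng.normal(size=(4, 4)) for _ in range(3)]
iso = isotropic_merge(deltas)
s = np.linalg.svd(iso, compute_uv=False)
print(np.allclose(s, s[0]))  # uniform spectrum -> True
```

Because only the spectrum changes, the singular directions of the merged task vector are preserved while no direction dominates.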

10.02.2025 14:47 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

We show that alignment between singular components of task-specific & merged matrices strongly correlates with performance gains over the pre-trained model! πŸ“ˆ

πŸ” Tasks that are well-aligned get amplified πŸ”Š, while less aligned ones become underrepresented and struggle. πŸ˜¬πŸ“‰

10.02.2025 14:47 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

πŸš€ What happens when you modify the spectrum of singular values of the merged task vector? πŸ€”

Apparently, you achieve 🚨state-of-the-art🚨 model merging results! πŸ”₯

✨ Introducing β€œNo Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces”

10.02.2025 14:47 β€” πŸ‘ 6    πŸ” 4    πŸ’¬ 1    πŸ“Œ 0

Self-supervised learning with Masked Autoencoders (MAE) is known to produce worse image representations than joint-embedding approaches (e.g. DINO). In our new paper, we identify new reasons why that is and point towards solutions: arxiv.org/abs/2412.03215 🧡

05.12.2024 19:56 β€” πŸ‘ 14    πŸ” 4    πŸ’¬ 1    πŸ“Œ 0
