
Ilyass Moummad

@ilyassmoummad.bsky.social

Postdoctoral Researcher @ Inria Montpellier (IROKO, Pl@ntNet). SSL for plant images. Interested in Computer Vision, Natural Language Processing, Machine Listening, and Biodiversity Monitoring. Website: ilyassmoummad.github.io

374 Followers  |  338 Following  |  65 Posts  |  Joined: 18.11.2024

Latest posts by ilyassmoummad.bsky.social on Bluesky

[10/10] Wrap-up 🎯
πŸ”Ή Unified supervised + unsupervised hashing
πŸ”Ή Flexible: works via probing or LoRA
πŸ”Ή SOTA hashing in minutes on a single GPU

πŸ“„ Paper: arxiv.org/abs/2510.27584
πŸ’» Code: github.com/ilyassmoumma...

Shoutout to my wonderful co-authors Kawtar, HervΓ©, and Alexis.

03.11.2025 14:31 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

[9/10] Strong generalization 🌍
CroVCA produces compact codes that transfer efficiently:
βœ… A single HashCoder trained on ImageNet-1k works on downstream datasets without retraining (more experiments and ablations in the paper)

03.11.2025 14:31 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

[8/10] Semantically consistent retrieval πŸ”
CroVCA retrieves correct classes even for fine-grained or ambiguous queries (e.g., indigo bird, grey langur).
βœ… Outperforms Hashing-Baseline
βœ… Works with only 16 bits and without supervision

03.11.2025 14:31 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

[7/10] Compact yet meaningful codes πŸ’Ύ
Even with just 16 bits, CroVCA preserves class structure.
t-SNE on CIFAR-10 shows clear, separable clusters β€” almost identical to the original 768-dim embeddings.
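
For the curious, a quick sketch of this kind of check (with random stand-in data here, not the real CIFAR-10 features or codes): run t-SNE on the 16-bit codes and on the original 768-dim embeddings and compare the cluster structure.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
features = rng.normal(size=(500, 768))                     # stand-in for encoder embeddings
codes = (rng.random((500, 16)) > 0.5).astype(np.float32)   # stand-in for 16-bit codes

xy_feat = TSNE(n_components=2, init="pca").fit_transform(features)
xy_code = TSNE(n_components=2, init="pca").fit_transform(codes)
# scatter-plot both, colored by class label, and compare the cluster structure
```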

03.11.2025 14:31 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

[6/10] Strong performance across encoders πŸ’ͺ
Tested with multiple vision encoders (SimDINOv2, DINOv2, DFN…), CroVCA achieves SOTA unsupervised hashing.

03.11.2025 14:30 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

[5/10] Fast convergence πŸš€
CroVCA trains in just ~5 epochs:
βœ… COCO (unsupervised) <2 min
βœ… ImageNet100 (supervised) ~3 min
βœ… Single GPU
Despite simplicity, it achieves state-of-the-art retrieval performance.

03.11.2025 14:30 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

[4/10] HashCoder πŸ› οΈ
A lightweight MLP with final BatchNorm for balanced bits (inspired by OrthoHash). Can be used as:
πŸ”Ή Probe on frozen features
πŸ”Ή LoRA-based fine-tuning for efficient encoder adaptation
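
For intuition, a minimal sketch of what such a head could look like (my own assumptions: the hidden width, depth, and sign-based binarization here are illustrative, not the exact released config):

```python
import torch
import torch.nn as nn

class HashCoder(nn.Module):
    # Tiny MLP hashing head: embeddings -> num_bits logits.
    # The final affine-free BatchNorm centers each bit so codes stay balanced.
    def __init__(self, embed_dim=768, hidden_dim=512, num_bits=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, num_bits),
            nn.BatchNorm1d(num_bits, affine=False),
        )

    def forward(self, x):
        return self.mlp(x)        # real-valued logits used during training

    @torch.no_grad()
    def encode(self, x):
        return self.mlp(x) > 0    # binary codes for retrieval

# probing usage: codes = HashCoder().eval().encode(frozen_features)
```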

03.11.2025 14:30 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

[3/10] Unifying hashing πŸ”„
Can supervised + unsupervised hashing be done in one framework?

CroVCA aligns binary codes across semantically consistent views:
Augmentations β†’ unsupervised
Class-consistent samples β†’ supervised

🧩 One BCE loss + coding-rate regularizer
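
To make that concrete, here is a rough sketch of what a cross-view BCE alignment plus a coding-rate-style regularizer could look like (my own guess at the shape of it; the symmetric targets, the tanh relaxation, and every hyperparameter here are assumptions, the exact loss is in the paper):

```python
import torch
import torch.nn.functional as F

def cross_view_bce(logits_a, logits_b):
    # Push each view's logits toward the other view's binarized code (symmetric BCE).
    targets_a = (logits_a.detach() > 0).float()
    targets_b = (logits_b.detach() > 0).float()
    return 0.5 * (F.binary_cross_entropy_with_logits(logits_a, targets_b)
                  + F.binary_cross_entropy_with_logits(logits_b, targets_a))

def coding_rate(logits, eps=0.5):
    # MCR^2-style log-det volume of the soft codes; maximizing it spreads
    # codes apart and discourages collapsed or constant bits.
    z = torch.tanh(logits)
    n, d = z.shape
    cov = z.T @ z * (d / (n * eps ** 2))
    return 0.5 * torch.logdet(torch.eye(d, device=z.device) + cov)

# loss = cross_view_bce(la, lb) - lam * (coding_rate(la) + coding_rate(lb))
```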

03.11.2025 14:29 β€” πŸ‘ 1    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0

[2/10] The challenge ⚑
Foundation models (DINOv3, DFN, SWAG…) produce rich embeddings, but similarity search in high-dimensional spaces is expensive.
Hashing provides fast Hamming-distance search, yet most deep hashing methods are complex, slow, and tied to a single paradigm.
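
As a toy illustration of why binary codes make search cheap (random made-up codes, nothing from the paper): pack the bits into bytes and rank by Hamming distance with XOR + popcount.

```python
import numpy as np

def hamming_search(query_bits, db_bits, topk=5):
    # query_bits: (K,) bool array; db_bits: (N, K) bool array of binary codes
    q = np.packbits(query_bits)                  # (K/8,) packed query
    db = np.packbits(db_bits, axis=1)            # (N, K/8) packed database
    dists = np.unpackbits(np.bitwise_xor(db, q), axis=1).sum(axis=1)
    return np.argsort(dists)[:topk]              # indices of nearest codes

rng = np.random.default_rng(0)
codes = rng.random((10_000, 16)) > 0.5           # 10k random 16-bit codes
print(hamming_search(codes[42], codes))          # index 42 is among the top hits (distance 0)
```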

03.11.2025 14:29 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

[1/10] Introducing CroVCA ✨
A simple, unified framework for supervised and unsupervised hashing that converts foundation model embeddings into compact binary codes.
βœ… Preserves semantic structure
βœ… Trains in just a few iterations

03.11.2025 14:29 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
BioDCASE Workshop: Join us for the BioDCASE Workshop held in Barcelona, Spain on the 29th of October! The workshop will be held at the Campus del Poblenou of Universitat Pompeu Fabra. The BioDCASE workshop will be hosted the day before the DCASE workshop on the 30-31st of October at the same venue …

BioDCASE workshop - registration closes next week Oct 10th https://biodcase.github.io/workshop2025/ - Hope to see you there! #bioacoustics

03.10.2025 10:17 β€” πŸ‘ 8    πŸ” 10    πŸ’¬ 0    πŸ“Œ 0

I heard that the Linux client is buggy; I use it in the browser and it works OK.

09.09.2025 07:03 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

For the curious, the code, slides, and the article are on GitHub: github.com/BastienPasde...

29.08.2025 11:44 β€” πŸ‘ 4    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

love it haha wish I were there to hear Prostitute Disfigurement in an amphitheater

29.08.2025 11:40 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 2    πŸ“Œ 0
NAVIGU: a powerful image collection explorer. NAVIGU lets you dive into the ocean of images. Drag the image sphere or double-click on an image you like to browse large collections.

A website to visually browse and explore the ImageNet-1k dataset (there are other supported datasets: IN-12M, WikiMedia, ETH Images, Pixabay, Fashion) navigu.net#imagenet
(Maybe this is already known, but I was happy to discover it this morning)

27.08.2025 07:39 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I'm interested in the quantum and footnotesize ones, how many params should they have πŸ˜‚

23.08.2025 06:31 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Learning Deep Representations of Data Distributions: landing page for the book.

Learning Deep Representations of Data Distributions
Sam Buchanan Β· Druv Pai Β· Peng Wang Β· Yi Ma

ma-lab-berkeley.github.io/deep-represe...

The best Deep Learning book is out, I've been waiting for its release for more than a year. Let's learn how to build intelligent systems via compression.

23.08.2025 06:27 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

It feels like we can now fit more noise with more model capacity πŸ€” (Figure 6); maybe we need newer architectures and/or newer training losses.

19.08.2025 21:36 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

1/ Can open-data models beat DINOv2? Today we release Franca, a fully open-sourced vision foundation model. Franca with a ViT-G backbone matches (and often beats) proprietary models like SigLIPv2, CLIP, and DINOv2 on various benchmarks, setting a new standard for open-source research.

21.07.2025 14:47 β€” πŸ‘ 85    πŸ” 21    πŸ’¬ 2    πŸ“Œ 3

πŸ‘‹ I worked on bioacoustics during my PhD, but I post mostly about AI

18.07.2025 20:56 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Research Scientist: Build tabular foundation models and shape how the world works with its most valuable data. Opportunity to work on fundamental breakthroughs such as multimodal, causality, and specialized architectures.

🏹 Job alert: Research Scientist at Prior Labs

πŸ“Freiburg or Berlin πŸ‡©πŸ‡ͺ
πŸ“… Apply by Dec 31 - preferably earlier
πŸ”— More info: https://bit.ly/4kqn5rY

04.07.2025 06:45 β€” πŸ‘ 6    πŸ” 3    πŸ’¬ 0    πŸ“Œ 0

Congratz! πŸ‘

03.07.2025 10:19 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Of Petrichor Weaves Black Noise (YouTube video by Ne Obliviscaris - Topic)

my new addiction today: youtu.be/dSyJqwN36ow
I can't wait to see them this summer at Motocultor Festival

19.06.2025 09:54 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

the best discovery I've had in recent years, I'm addicted to it now as well 😁

19.06.2025 07:12 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Thank you for making this accessible to everyone! I've read some sections; it is very instructive.

16.06.2025 10:08 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Foundations of Computer Vision: The print version was published by …

Our computer vision textbook is now available for free online here:
visionbook.mit.edu

We are working on adding some interactive components like search and (beta) integration with LLMs.

Hope this is useful and feel free to submit Github issues to help us improve the text!

15.06.2025 15:45 β€” πŸ‘ 115    πŸ” 32    πŸ’¬ 3    πŸ“Œ 1

βš οΈβ—Open PhD and Postdoc positions in Prague with Lukas Neumann! β—βš οΈ

We rank #5 in computer vision in Europe and Lukas is a great supervisor, so this is a great opportunity!

If you are interested, contact him, he will also be at CVPR with his group :)

09.06.2025 12:17 β€” πŸ‘ 14    πŸ” 5    πŸ’¬ 1    πŸ“Œ 0

We will be presenting the πŸ„ FungiTastic πŸ„, a multimodal, highly challenging dataset and benchmark covering many ML problems at @fgvcworkshop.bsky.social CVPR-W on Wednesday!

⏱️ 16:15
πŸ“104 E, Level 1
πŸ“Έ www.kaggle.com/datasets/pic...
πŸ“ƒ arxiv.org/abs/2408.13632

@cvprconference.bsky.social

06.06.2025 16:44 β€” πŸ‘ 18    πŸ” 6    πŸ’¬ 3    πŸ“Œ 0

One of the best conferences I have been to; happy to have met old friends and made new ones, and hopefully future collaborations will follow as well. Many thanks for organizing this πŸ™

07.06.2025 00:37 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Want stronger Vision Transformers? Use octic-equivariant layers (arxiv.org/abs/2505.15441).

TL;DR: We extend @bokmangeorg.bsky.social's reflection-equivariant ViTs to the (octic) group of 90-degree rotations and reflections and... it just works... (DINOv2+DeiT)

Code: github.com/davnords/octic-vits
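
For anyone unfamiliar with the group in question, a quick illustrative snippet (just the group action on an image tensor, not the equivariant layers from the paper): the octic group is the four 90-degree rotations, each with and without a reflection.

```python
import torch

def octic_orbit(img):
    # img: (C, H, W) square image tensor; returns its 8 octic-group transforms
    views = []
    for k in range(4):
        rot = torch.rot90(img, k, dims=(1, 2))    # rotate by k * 90 degrees
        views.append(rot)
        views.append(torch.flip(rot, dims=(2,)))  # ... and its mirror image
    return torch.stack(views)                     # (8, C, H, W)

# octic_orbit(torch.randn(3, 224, 224)).shape -> torch.Size([8, 3, 224, 224])
```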

23.05.2025 07:38 β€” πŸ‘ 29    πŸ” 4    πŸ’¬ 2    πŸ“Œ 3
