Matteo Robino

@mrobino.bsky.social

AI & computer vision

32 Followers  |  324 Following  |  3 Posts  |  Joined: 17.11.2024

Latest posts by mrobino.bsky.social on Bluesky

Video thumbnail

Choudhury and Kim et al., "Accelerating Vision Transformers With Adaptive Patch Sizes"

Transformer patches don't need to be of uniform size -- choose sizes based on entropy --> faster training/inference. Are scale-spaces gonna make a comeback?

22.10.2025 20:08 — 👍 16    🔁 4    💬 3    📌 1
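The entropy-driven patch sizing mentioned above can be sketched in a few lines. This is a toy illustration under my own assumptions (the block size, intensity histogram, and entropy threshold are mine, not the paper's actual procedure):

```python
import numpy as np

def region_entropy(region, bins=32):
    """Shannon entropy (bits) of a region's intensity histogram."""
    hist, _ = np.histogram(region, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def choose_patch_sizes(img, coarse=32, threshold=3.0):
    """Tile the image into coarse blocks; keep the coarse patch size where
    content is flat (low entropy), split into finer patches where it is
    textured (high entropy)."""
    sizes = {}
    H, W = img.shape
    for y in range(0, H, coarse):
        for x in range(0, W, coarse):
            block = img[y:y + coarse, x:x + coarse]
            sizes[(y, x)] = coarse if region_entropy(block) < threshold else coarse // 2
    return sizes

# Toy image: flat left half (sky-like), noisy right half (texture-like).
rng = np.random.default_rng(0)
img = np.zeros((64, 64))
img[:, 32:] = rng.random((64, 32))
sizes = choose_patch_sizes(img)  # big patches on the left, small on the right
```

Fewer, larger patches over flat regions means fewer tokens through the transformer, which is where the training/inference speedup would come from.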
A scatter plot titled “AIME25 — Total Memory vs. Accuracy (Qwen3)” compares model accuracy (%) against total memory usage (weights + KV cache, in GB) for various Qwen3 model sizes and quantization levels.

Axes:
- X-axis: Total Memory (Weight + KV Cache) [GB], log scale, ranging roughly from 1 to 100
- Y-axis: Accuracy (%), ranging from 0 to 75

Legend:
- Colors (model size): 0.6B (yellow), 1.7B (orange), 4B (salmon), 8B (pink), 14B (purple), 32B (blue)
- Shapes (precision): circle = 16-bit, triangle = 8-bit, square = 4-bit
- Marker size (context length): small = 2k tokens, large = 30k tokens

Main trend:
Larger models (rightward, darker colors) achieve higher accuracy but require significantly more memory. Smaller models (left, yellow/orange) stay below 30% accuracy. Compression (8-bit or 4-bit) lowers memory usage but can reduce accuracy slightly.

Inset zoom (upper center):
A close-up box highlights the 8B (8-bit) and 14B (4-bit) models, showing their proximity in accuracy despite differing memory footprints.

Overall, the chart demonstrates scaling behavior for Qwen3 models: accuracy grows with total memory and model size, with diminishing returns beyond the 14B range.

Is 32B-4bit equal to 16B-8bit? Depends on the task:

* math: precision matters
* knowledge: effective parameter count matters more
* 4B-8bit is the threshold: above it prefer quantization, below it prefer more params
* parallel TTC (test-time compute) only works above 4B-8bit

arxiv.org/abs/2510.10964

15.10.2025 11:10 — 👍 31    🔁 8    💬 3    📌 0
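For intuition about the plot's x-axis, here is a back-of-envelope sketch of the weights-plus-KV-cache quantity; the layer/head/dim figures below are illustrative assumptions, not official Qwen3 specs or numbers from the paper:

```python
def total_memory_gb(params_b, weight_bits, n_layers, n_kv_heads, head_dim,
                    ctx_len, kv_bits=16):
    """Rough total memory: quantized weights plus the KV cache
    (2 tensors per layer, keys and values, one entry per token)."""
    weight_bytes = params_b * 1e9 * weight_bits / 8
    kv_bytes = 2 * n_layers * n_kv_heads * head_dim * ctx_len * kv_bits / 8
    return (weight_bytes + kv_bytes) / 1e9

# Hypothetical configs, roughly in the 8B/14B ballpark:
m_8b_8bit = total_memory_gb(8, 8, n_layers=36, n_kv_heads=8, head_dim=128, ctx_len=30_000)
m_14b_4bit = total_memory_gb(14, 4, n_layers=40, n_kv_heads=8, head_dim=128, ctx_len=30_000)
# Both land near ~12 GB: different size/precision mixes can end up with
# similar footprints, which is what the inset zoom highlights.
```

At long contexts the KV cache rivals the weights, which is why the marker size (2k vs 30k tokens) shifts points so far along the x-axis.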
Luc Julia au Sénat : autopsie d'un grand N'IMPORTE QUOI
YouTube video by Monsieur Phi

NEW VIDEO! I dissect the case of Luc Julia, the renowned co-creator of Siri and world-class AI expert, celebrated in the media and recently heard before the Sénat. The result is harsh, but it was necessary.

youtu.be/e5kDHL-nnh4

11.08.2025 14:38 — 👍 395    🔁 163    💬 44    📌 54
Preview
Eleven-minute race for food: how aid points in Gaza became ‘death traps’ – a visual story
Hundreds of people have died while seeking food since delivery was taken over by the Gaza Humanitarian Foundation in May. But Palestinians facing extreme hunger have no choice but to take the risk.

I implore you to read this.

23.07.2025 14:16 — 👍 89    🔁 58    💬 8    📌 6

Why would you ride in a car driven by a human? Do you have some sort of death wish?

28.06.2025 20:16 — 👍 44    🔁 5    💬 6    📌 1

Last month I did a little experiment.

I wanted to see how the exact same post would perform on both X (Twitter) and Bluesky.

The results were...interesting...

[Thread]

16.03.2025 18:08 — 👍 2509    🔁 1414    💬 128    📌 352

New LinkedIn wall background, thanks

15.03.2025 09:41 — 👍 0    🔁 0    💬 0    📌 0
Post image

Want strong SSL, but not the complexity of DINOv2?

CAPI: Cluster and Predict Latent Patches for Improved Masked Image Modeling.

14.02.2025 18:04 — 👍 49    🔁 10    💬 1    📌 1
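The "cluster" half of the CAPI idea can be illustrated with a toy sketch: cluster teacher patch latents into discrete ids, which then serve as classification targets for predicting masked patches. Everything below (the tiny k-means, the farthest-point init, the dimensions) is a simplified stand-in, not the paper's actual method:

```python
import numpy as np

def cluster_targets(latents, n_clusters=2, n_iters=10):
    """Assign each patch latent a discrete cluster id via a tiny k-means
    (farthest-point init, then Lloyd iterations). The ids act as
    classification targets for the masked-patch predictor."""
    centroids = [latents[0]]
    for _ in range(n_clusters - 1):
        # next centroid: point farthest from all chosen centroids
        d = np.min([np.linalg.norm(latents - c, axis=1) for c in centroids], axis=0)
        centroids.append(latents[d.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iters):
        d = np.linalg.norm(latents[:, None] - centroids[None], axis=-1)
        ids = d.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(ids == k):
                centroids[k] = latents[ids == k].mean(axis=0)
    return ids

# Two well-separated groups of fake patch latents get two consistent ids,
# which a student network would then predict at masked positions.
latents = np.vstack([np.zeros((8, 16)), np.full((8, 16), 5.0)])
targets = cluster_targets(latents, n_clusters=2)
```

Predicting a discrete cluster id with cross-entropy sidesteps regressing raw latents, which is part of what makes this style of masked image modeling simpler than the full DINOv2 pipeline.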
Preview
DINOv2: Learning Robust Visual Features without Supervision
The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision. These models could...

Outstanding Finalist 2: “DINOv2: Learning Robust Visual Features without Supervision," by Maxime Oquab, Timothée Darcet, Théo Moutakanni et al. 5/n openreview.net/forum?id=a68...

08.01.2025 17:41 — 👍 8    🔁 3    💬 2    📌 0

What for?

09.01.2025 05:36 — 👍 0    🔁 0    💬 1    📌 0

I get the impression the graph tells us the opposite, though, no?

10.12.2024 14:11 — 👍 0    🔁 0    💬 0    📌 0
Preview
GitHub - verlab/accelerated_features: Implementation of XFeat (CVPR 2024). Do you need robust and fast local feature extraction? You are in the right place!

XFeat: Accelerated Features for Lightweight Image Matching

code: github.com/verlab/accel...
paper: arxiv.org/abs/2404.19174
project: www.verlab.dcc.ufmg.br/descriptors/...

06.12.2024 13:23 — 👍 2    🔁 1    💬 0    📌 0
Originally the default wallpaper of Microsoft's Windows XP, this photo shows green rolling hills with a vibrant blue sky and white clouds in the background. Charles O'Rear took the photo in California, USA.

We've always been a fan of blueskies.

04.04.1975 12:00 — 👍 11865    🔁 2119    💬 652    📌 656

Free speech on twitter:

01.12.2024 02:30 — 👍 109    🔁 5    💬 5    📌 0
Post image Post image Post image Post image

My deep learning course at the University of Geneva is available online. 1000+ slides, ~20h of screencasts. Full of examples in PyTorch.

fleuret.org/dlc/

And my "Little Book of Deep Learning" is available as a phone-formatted pdf (nearing 700k downloads!)

fleuret.org/lbdl/

26.11.2024 06:15 — 👍 1259    🔁 249    💬 46    📌 17
Preview
GitHub - davidgasquez/docs-to-llmstxt: 🤖 Compile docs into text files for LLMs

Inspired by @simonwillison.net's llm-docs repo, I made a similar one that compiles project docs into single TXT files that can be fed to LLMs.

Right now it only has the atproto docs, but it's already been useful to me for answering random questions about the project.

github.com/davidgasquez...

19.11.2024 09:48 — 👍 64    🔁 6    💬 4    📌 0
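The docs-compilation idea is simple enough to sketch. This is a minimal stand-in, not the actual docs-to-llmstxt implementation (the separator format and function name are my invention):

```python
from pathlib import Path

def compile_docs(docs_dir, out_file):
    """Concatenate every markdown file under docs_dir into one text file,
    with a path header before each file so an LLM can tell sources apart."""
    docs_dir = Path(docs_dir)
    parts = []
    for md in sorted(docs_dir.rglob("*.md")):
        parts.append(f"----- {md.relative_to(docs_dir)} -----\n{md.read_text()}")
    Path(out_file).write_text("\n\n".join(parts))

# Usage (hypothetical layout):
# compile_docs("atproto-docs/", "atproto.txt")
```

Sorting the files keeps the output deterministic, and the path headers let you ask the model "where did you read that?" against the original docs.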
