Choudhury and Kim et al., "Accelerating Vision Transformers With Adaptive Patch Sizes"
Transformer patches don't need to be of uniform size -- choose sizes based on entropy --> faster training/inference. Are scale-spaces gonna make a comeback?
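The entropy idea can be sketched with a toy rule. This is a minimal, hypothetical illustration (not the paper's actual criterion): bin pixel intensities, compute Shannon entropy, and give detailed (high-entropy) regions small patches and flat regions large ones.

```python
import math
import random
from collections import Counter

def patch_entropy(patch, bins=32):
    """Shannon entropy (bits) of the intensity histogram of a flat list
    of pixel values in [0, 1)."""
    counts = Counter(min(int(v * bins), bins - 1) for v in patch)
    n = len(patch)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def choose_patch_size(patch, threshold=3.0, small=8, large=32):
    """Toy selection rule (an assumption, not the paper's method):
    high-entropy regions -> small patches, flat regions -> large patches."""
    return small if patch_entropy(patch) > threshold else large

random.seed(0)
flat = [0.5] * 1024                              # uniform region -> entropy 0
noisy = [random.random() for _ in range(1024)]   # textured -> entropy near log2(32) = 5
print(choose_patch_size(flat), choose_patch_size(noisy))  # -> 32 8
```

Coarse patches on flat sky, fine patches on texture: fewer tokens overall, which is where the training/inference speedup would come from.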
@mrobino.bsky.social
AI & computer vision
A scatter plot titled “AIME25 — Total Memory vs. Accuracy (Qwen3)” compares model accuracy (%) against total memory usage (weights + KV cache, in GB) across Qwen3 model sizes and quantization levels.
Axes:
* X-axis: Total Memory (Weights + KV Cache) [GB], log scale, roughly 1 to 100
* Y-axis: Accuracy (%), 0 to 75
Legend:
* Color encodes model size: 0.6B (yellow), 1.7B (orange), 4B (salmon), 8B (pink), 14B (purple), 32B (blue)
* Shape encodes precision: circle = 16-bit, triangle = 8-bit, square = 4-bit
* Marker size encodes context length: small = 2k tokens, large = 30k tokens
Main trend: larger models (rightward, darker colors) achieve higher accuracy but require significantly more memory; the smallest models (yellow/orange, left) stay below 30% accuracy. 8-bit or 4-bit compression lowers memory usage but can reduce accuracy slightly.
Inset zoom (upper center): a close-up highlights the 8B (8-bit) and 14B (4-bit) models, which are close in accuracy despite differing memory footprints.
Overall, the chart shows Qwen3 scaling behavior: accuracy grows with total memory and model size, with diminishing returns beyond the 14B range.
Is 32B-4bit equal to 16B-8bit? It depends on the task:
* math: precision matters
* knowledge: effective parameter count matters more
* 4B-8bit is the threshold: above it, prefer quantization; below it, prefer more parameters
* parallel test-time compute (TTC) only works above 4B-8bit
arxiv.org/abs/2510.10964
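A back-of-envelope sketch of the plot's x-axis (total memory = weights + KV cache). The layer count, head count, and head dimension below are illustrative placeholders, not official Qwen3 configs:

```python
def weight_gb(n_params_b, bits):
    """Weight memory in GB for n_params_b billion parameters at a given precision."""
    return n_params_b * 1e9 * bits / 8 / 1e9

def kv_cache_gb(layers, kv_heads, head_dim, seq_len, bits):
    """KV cache for batch size 1: two tensors (K and V) per layer, per token."""
    return 2 * layers * kv_heads * head_dim * seq_len * bits / 8 / 1e9

# Illustrative numbers (assumed, not the paper's exact configs):
w = weight_gb(8, 8)                        # an 8B model at 8-bit -> 8.0 GB
kv = kv_cache_gb(36, 8, 128, 30_000, 8)   # 30k-token context at 8-bit
print(round(w + kv, 2))                    # -> 10.21
```

This makes the plot's marker-size encoding concrete: at long contexts the KV cache is a non-trivial slice of the total, so quantizing it matters as much as quantizing weights.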
NEW VIDEO! I take apart the case of Luc Julia, the renowned co-creator of Siri and world-class AI expert, lauded in the media and recently heard by the French Senate. The verdict is harsh, but it was necessary.
youtu.be/e5kDHL-nnh4
Why would you ride in a car driven by a human? Do you have some sort of death wish?
28.06.2025 20:16 — 👍 44 🔁 5 💬 6 📌 1
Last month I did a little experiment.
I wanted to see how the exact same post would perform on both X (Twitter) and Bluesky.
The results were...interesting...
[Thread]
New LinkedIn wall background, thanks
15.03.2025 09:41 — 👍 0 🔁 0 💬 0 📌 0
Want strong self-supervised learning (SSL), but not the complexity of DINOv2?
CAPI: Cluster and Predict Latent Patches for Improved Masked Image Modeling.
Outstanding Finalist 2: “DINOv2: Learning Robust Visual Features without Supervision," by Maxime Oquab, Timothée Darcet, Théo Moutakanni et al. 5/n openreview.net/forum?id=a68...
08.01.2025 17:41 — 👍 8 🔁 3 💬 2 📌 0
What for?
09.01.2025 05:36 — 👍 0 🔁 0 💬 1 📌 0
I get the impression the graph actually tells us the opposite, no?
10.12.2024 14:11 — 👍 0 🔁 0 💬 0 📌 0
XFeat: Accelerated Features for Lightweight Image Matching
code: github.com/verlab/accel...
paper: arxiv.org/abs/2404.19174
project: www.verlab.dcc.ufmg.br/descriptors/...
Originally the default wallpaper of Microsoft's Windows XP, this photo shows green rolling hills with a vibrant blue sky and white clouds in the background. Charles O'Rear took the photo in California, USA.
We've always been a fan of blueskies.
04.04.1975 12:00 — 👍 11865 🔁 2119 💬 652 📌 656
Free speech on twitter:
01.12.2024 02:30 — 👍 109 🔁 5 💬 5 📌 0
My deep learning course at the University of Geneva is available online: 1000+ slides, ~20h of screencasts, full of examples in PyTorch.
fleuret.org/dlc/
And my "Little Book of Deep Learning" is available as a phone-formatted pdf (nearing 700k downloads!)
fleuret.org/lbdl/
Inspired by @simonwillison.net's llm-docs repo, I made a similar one that compiles project docs into single TXT files that can be fed to LLMs.
Right now it only has the atproto docs, but it has already been useful to me for answering random questions about the project.
github.com/davidgasquez...
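The compile-docs-into-one-TXT idea can be sketched in a few lines. The `compile_docs` helper and the file layout are my own assumptions for illustration, not the repo's actual code:

```python
from pathlib import Path

def compile_docs(src_dir, out_file):
    """Concatenate every Markdown file under src_dir into one TXT file,
    each section preceded by a header naming its source file."""
    parts = []
    for path in sorted(Path(src_dir).rglob("*.md")):
        parts.append(f"----- {path} -----\n{path.read_text()}")
    Path(out_file).write_text("\n\n".join(parts))

# Usage sketch (hypothetical paths):
# compile_docs("atproto-docs/", "atproto.txt")
```

The per-file header line is a deliberate choice: it lets the LLM attribute a passage back to the doc page it came from when answering questions.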