We introduce MIRO: a new paradigm for T2I model alignment integrating reward conditioning into pretraining, eliminating the need for separate fine-tuning/RL stages. This single-stage approach offers unprecedented efficiency and control.
- 19x faster convergence ⚡
- 370x fewer FLOPs than FLUX-dev
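Mechanically, "reward conditioning in pretraining" means the denoiser sees a reward signal (e.g. a bucketed aesthetic or preference score) as an extra input while it learns. A minimal PyTorch sketch of that idea, not MIRO's actual architecture or training recipe (the tiny denoiser, the bucketing, and all sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Stand-in for a T2I diffusion backbone (illustrative only, not MIRO)."""
    def __init__(self, dim=64, n_reward_bins=10):
        super().__init__()
        # Reward scores are bucketed and embedded like an extra condition token.
        self.reward_embed = nn.Embedding(n_reward_bins, dim)
        self.net = nn.Sequential(nn.Linear(dim * 2, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, noisy_latent, reward_bin):
        cond = self.reward_embed(reward_bin)          # (B, dim)
        return self.net(torch.cat([noisy_latent, cond], dim=-1))

model = TinyDenoiser()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One illustrative training step: the model learns to denoise *conditioned on*
# the reward of each training image, so no separate RL / fine-tuning stage is needed.
latent = torch.randn(8, 64)
noise = torch.randn_like(latent)
reward_bin = torch.randint(0, 10, (8,))               # e.g. a bucketed aesthetic score
pred = model(latent + noise, reward_bin)
loss = nn.functional.mse_loss(pred, noise)
loss.backward(); opt.step(); opt.zero_grad()
```

At sampling time you would condition on the highest reward bin, which is where the claimed control comes from without a post-hoc alignment stage.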
31.10.2025 11:24 · 60 likes · 14 reposts · 3 replies · 5 quotes
Super interesting to see pure SSL outperform text alignment on a highly competitive task that seems tailor-made for text-aligned methods 🤯
18.08.2025 15:44 · 2 likes · 0 reposts · 0 replies · 0 quotes
🛰️ At #CVPR2025 presenting "AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities" - Saturday afternoon, Poster 355!
If you're here and want to discuss geolocation or geospatial foundation models, let's connect!
11.06.2025 21:08 · 13 likes · 3 reposts · 0 replies · 0 quotes
I will be presenting our work on the detection of archaeological looting with satellite image time series at the CVPR 2025 EarthVision workshop tomorrow!
Honored and grateful that this paper received the best student paper award!
11.06.2025 04:03 · 15 likes · 6 reposts · 1 reply · 0 quotes
When majority rules, minority loses: bias amplification of gradient descent
Despite growing empirical evidence of bias amplification in machine learning, its theoretical foundations remain poorly understood. We develop a formal framework for majority-minority learning tasks, ...
📢 New preprint!
"When majority rules, minority loses: bias amplification of gradient descent"
We often blame biased data, but training itself also amplifies biases. Our paper explores how ML algorithms favor stereotypes at the expense of minority groups.
➡️ arxiv.org/abs/2505.13122
(1/3)
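Not from the paper, but a quick toy way to see the effect the abstract describes: fit a linear classifier with gradient descent on data where one group dominates and the two groups prefer conflicting rules, then compare per-group accuracy. Group sizes and the data-generating process below are invented for illustration.

```python
import torch

torch.manual_seed(0)
# Majority group: 950 samples; minority group: 50 samples with a conflicting labeling rule.
n_maj, n_min, d = 950, 50, 5
x_maj = torch.randn(n_maj, d); y_maj = (x_maj @ torch.ones(d) > 0).float()
x_min = torch.randn(n_min, d); y_min = (x_min @ -torch.ones(d) > 0).float()
x, y = torch.cat([x_maj, x_min]), torch.cat([y_maj, y_min])

w = torch.zeros(d, requires_grad=True)                # a single linear predictor
opt = torch.optim.SGD([w], lr=0.5)
for _ in range(500):
    loss = torch.nn.functional.binary_cross_entropy_with_logits(x @ w, y)
    opt.zero_grad(); loss.backward(); opt.step()

def group_acc(xs, ys):
    return ((xs @ w > 0).float() == ys).float().mean().item()

print(f"majority acc: {group_acc(x_maj, y_maj):.2f}, minority acc: {group_acc(x_min, y_min):.2f}")
```

In this toy setup the learned rule tracks the majority group; the paper's contribution is a formal framework for when and how strongly gradient descent amplifies that imbalance.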
23.05.2025 16:48 · 3 likes · 2 reposts · 1 reply · 0 quotes
We've added new experiments demonstrating robust generalization capabilities! Notably, AnySat shows strong performance on HLS Burn Scars - a sensor never seen during pretraining! 🔥🛰️
Check it out:
📄 Paper: arxiv.org/abs/2412.14123
🌍 Project: gastruc.github.io/anysat
30.04.2025 14:00 · 9 likes · 3 reposts · 0 replies · 0 quotes
Looking forward to #CVPR2025! We will present the following papers:
30.04.2025 13:04 · 28 likes · 7 reposts · 1 reply · 1 quote
🔥🔥🔥 CV Folks, I have some news! We're organizing a 1-day meeting in central Paris on June 6th, right before CVPR, called CVPR@Paris (similar to NeurIPS@Paris) 🥐🍾🥂🍷
Registration is open (it's free) with priority given to authors of accepted papers: cvprinparis.github.io/CVPR2025InPa...
Big 🧵👇 with details!
21.03.2025 06:43 · 136 likes · 51 reposts · 7 replies · 10 quotes
Starter pack including some of the lab members: go.bsky.app/QK8j87w
14.03.2025 10:34 · 24 likes · 11 reposts · 0 replies · 1 quote
🧩 Excited to share our paper "RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges" (arxiv.org/abs/2502.19955) accepted to #CVPR2025! We created a benchmark that systematically evaluates image matching methods across well-defined geometric difficulty levels.
28.02.2025 15:23 · 19 likes · 7 reposts · 2 replies · 0 quotes
Weights for CAD are finally available. It's one of the smallest diffusion models on the market, achieving performance close to SD and Pixart, featuring a Perceiver-like architecture.
We leverage our coherence-aware training to improve textual understanding.
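In rough terms, coherence-aware training means the model is told how well each caption matches its image (e.g. a precomputed CLIP-style similarity) as an extra conditioning signal, instead of having noisy pairs filtered out. A schematic sketch of that conditioning, not the actual CAD code (the score source, the bucketing, and the sizes are assumptions):

```python
import torch
import torch.nn as nn

class CoherenceConditioner(nn.Module):
    """Maps an image-text coherence score in [0, 1] to a conditioning embedding."""
    def __init__(self, dim=256, n_bins=8):
        super().__init__()
        self.n_bins = n_bins
        self.embed = nn.Embedding(n_bins, dim)

    def forward(self, coherence):                     # coherence: (B,) floats in [0, 1]
        idx = (coherence.clamp(0, 1) * (self.n_bins - 1)).long()
        return self.embed(idx)                        # (B, dim)

cond = CoherenceConditioner()
scores = torch.tensor([0.2, 0.9])                     # e.g. precomputed caption-image similarities
extra_token = cond(scores)                            # appended to the caption conditioning
print(extra_token.shape)                              # torch.Size([2, 256])
```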
20.02.2025 12:14 · 11 likes · 3 reposts · 0 replies · 0 quotes
Check it out:
📄 Paper: arxiv.org/abs/2412.14123
🌍 Project: gastruc.github.io/anysat
🤗 HuggingFace: huggingface.co/g-astruc/Any...
💻 GitHub: github.com/gastruc/AnySat
19.12.2024 10:46 · 5 likes · 0 reposts · 0 replies · 0 quotes
Even better: AnySat supports linear probing for semantic segmentation!
That means you can fine-tune just a few thousand parameters and achieve SOTA results on challenging tasks, all with minimal effort.
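Linear probing here means the pretrained encoder stays frozen and only a single linear layer is trained on its features. A generic sketch of that setup (the stand-in encoder, feature size, and class count are placeholders, not AnySat's real API):

```python
import torch
import torch.nn as nn

feat_dim, n_classes = 768, 10                       # placeholder sizes, not AnySat's

# Stand-in for a frozen pretrained encoder producing per-patch features.
encoder = nn.Sequential(nn.Linear(feat_dim, feat_dim), nn.GELU())
for p in encoder.parameters():
    p.requires_grad = False

probe = nn.Linear(feat_dim, n_classes)              # the only trainable parameters
opt = torch.optim.AdamW(probe.parameters(), lr=1e-3)

# One probing step: per-patch features -> per-patch class logits.
patch_features = torch.randn(4, 256, feat_dim)      # (batch, patches, dim), dummy data
labels = torch.randint(0, n_classes, (4, 256))
logits = probe(encoder(patch_features))
loss = nn.functional.cross_entropy(logits.flatten(0, 1), labels.flatten())
loss.backward(); opt.step(); opt.zero_grad()
```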
19.12.2024 10:46 · 3 likes · 0 reposts · 1 reply · 0 quotes
AnySat achieves SOTA performance on 6 tasks across 10 datasets:
🌱 Land cover mapping
🌾 Crop type segmentation
🌳 Tree species classification
🌊 Flood detection
🔄 Change detection
19.12.2024 10:46 · 2 likes · 0 reposts · 1 reply · 0 quotes
We trained AnySat on 5 multimodal datasets simultaneously:
📡 11 distinct sensors
📏 Resolutions: 0.2m–500m
📅 Revisit: single date to weekly
🏞️ Scales: 0.3–150 hectares
The pretrained model can adapt to truly diverse data, and probably yours too!
19.12.2024 10:46 · 2 likes · 0 reposts · 1 reply · 0 quotes
Thanks to our modified JEPA training scheme and scale-adaptive spatial encoders, AnySat trains on datasets with diverse scales, resolutions, and modalities!
🧠 75% of its parameters are shared across all inputs, enabling unmatched flexibility.
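Concretely, the sharing works out to something like this: each sensor gets only a small projection of its own, while a single transformer backbone serves every input. A schematic sketch of that split (module names, band counts, and sizes are illustrative, not the actual AnySat code):

```python
import torch
import torch.nn as nn

dim = 256
# Small modality-specific projectors (the non-shared part).
projectors = nn.ModuleDict({
    "s2": nn.Linear(10, dim),   # e.g. an optical sensor with 10 bands (illustrative)
    "s1": nn.Linear(2, dim),    # e.g. a radar sensor with 2 polarizations (illustrative)
})
# One transformer backbone shared across all inputs (the bulk of the parameters).
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True), num_layers=4
)

def encode(batch):
    """batch: dict mapping modality name -> (B, n_tokens, n_channels) tensor."""
    tokens = torch.cat([projectors[name](x) for name, x in batch.items()], dim=1)
    return backbone(tokens)

out = encode({"s2": torch.randn(2, 16, 10), "s1": torch.randn(2, 16, 2)})
print(out.shape)                # torch.Size([2, 32, 256])
```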
19.12.2024 10:46 · 3 likes · 0 reposts · 1 reply · 0 quotes
🤔 What if embedding multimodal EO data was as easy as using a ResNet on images?
Introducing AnySat: one model for any resolution (0.2m–250m), scale (0.3–2600 hectares), and modalities (choose from 11 sensors & time series)!
Try it with just a few lines of code:
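The code snippet from the original post did not survive the export. As a hedged stand-in, loading a hub-distributed model usually looks roughly like the lines below; the entrypoint name, keyword arguments, and input format are guesses, so treat the README at github.com/gastruc/AnySat as the source of truth:

```python
import torch

# Assumed torch.hub entrypoint -- verify against the AnySat README before relying on it.
model = torch.hub.load("gastruc/AnySat", "anysat", pretrained=True)

# Hypothetical input: a dict of modalities; the key, shapes, and keyword arguments
# below are illustrative guesses, not the documented interface.
batch = {"s2": torch.randn(1, 12, 10, 60, 60)}          # (batch, dates, bands, H, W)
features = model(batch, patch_size=10, output="tile")   # one embedding per tile
```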
19.12.2024 10:46 · 35 likes · 10 reposts · 2 replies · 2 quotes
Guillaume Astruc, Nicolas Gonthier, Clement Mallet, Loic Landrieu
AnySat: An Earth Observation Model for Any Resolutions, Scales, and Modalities
https://arxiv.org/abs/2412.14123
19.12.2024 06:45 · 6 likes · 3 reposts · 0 replies · 0 quotes
⚠️ Reconstructing sharp 3D meshes from a few unposed images is a hard and ambiguous problem.
With MAtCha, we leverage a pretrained depth model to recover sharp meshes from sparse views, including both foreground and background, within minutes! 🧵
🌐 Webpage: anttwo.github.io/matcha/
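One way to build intuition for the depth-model ingredient: a monocular depth network gives per-pixel depth that can be unprojected into 3D before any cross-view alignment or refinement. The sketch below only does that first step, using the public MiDaS weights from torch.hub (requires the `timm` package) and made-up camera intrinsics; it is not the MAtCha pipeline.

```python
import torch

# Public monocular depth model from torch.hub (MAtCha itself may use a different one).
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()

image = torch.rand(1, 3, 256, 256)                 # dummy RGB image in [0, 1]
with torch.no_grad():
    depth = midas(image)                           # (1, 256, 256) relative depth

# Unproject to a 3D point cloud with assumed pinhole intrinsics (fx = fy = 200, centered).
h, w = depth.shape[-2:]
fx = fy = 200.0
u, v = torch.meshgrid(torch.arange(w), torch.arange(h), indexing="xy")
z = depth[0]
x = (u - w / 2) * z / fx
y = (v - h / 2) * z / fy
points = torch.stack([x, y, z], dim=-1).reshape(-1, 3)   # one rough point cloud of the scene
```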
11.12.2024 14:59 · 38 likes · 11 reposts · 4 replies · 1 quote
📍 Guessing where an image was taken is a hard and often ambiguous problem. Introducing diffusion-based geolocation: we predict global locations by refining random guesses into trajectories across the Earth's surface!
🗺️ Paper, code, and demo: nicolas-dufour.github.io/plonk
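The "refining random guesses into trajectories" idea, schematically: sample many random locations, then run a learned reverse (denoising) process conditioned on the image embedding so that each guess drifts toward plausible places. The loop below is a toy Euler-style sketch with an untrained stand-in network, not the released model linked above or its actual parameterization.

```python
import torch
import torch.nn as nn

class DummyDenoiser(nn.Module):
    """Predicts an update for noisy lat/lon guesses given image features and a timestep."""
    def __init__(self, img_dim=512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 + img_dim + 1, 128), nn.ReLU(), nn.Linear(128, 2))

    def forward(self, coords, img_feat, t):
        t_col = torch.full((coords.shape[0], 1), float(t))
        return self.net(torch.cat([coords, img_feat, t_col], dim=-1))

denoiser = DummyDenoiser()                            # untrained stand-in for the real model
img_feat = torch.randn(1, 512).expand(16, 512)        # one image embedding, 16 parallel guesses
coords = torch.rand(16, 2) * torch.tensor([180.0, 360.0]) - torch.tensor([90.0, 180.0])

with torch.no_grad():
    # Reverse process: every random guess traces a trajectory over the Earth's surface.
    for t in reversed(range(20)):
        coords = coords + 0.1 * denoiser(coords, img_feat, t)
        coords[:, 0] = coords[:, 0].clamp(-90, 90)        # keep latitude valid
        coords[:, 1] = (coords[:, 1] + 180) % 360 - 180   # wrap longitude

prediction = coords.mean(dim=0)    # crude aggregation of the refined guesses
```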
10.12.2024 15:56 · 97 likes · 32 reposts · 8 replies · 5 quotes
Hi, I am a PhD student from @imagineenpc.bsky.social. Could you also add us both please?
25.11.2024 15:55 · 4 likes · 0 reposts · 1 reply · 0 quotes
Assoc. Prof. in computer science at Univ. Bretagne Sud / IRISA
Research in time-series analysis with applications to Earth observations
👩‍💻🌍🛰️📈
Researcher in photogrammetry
Trying to understand scenes in 3D.
Postdoc at @ecoledesponts.bsky.social, PhD at @tugraz.bsky.social
Research Scientist @ Google DeepMind - working on video models for science. Worked on video generation; self-supervised learning; VLMs - 🦩; point tracking.
Trending papers in Vision and Graphics on www.scholar-inbox.com.
Scholar Inbox is a personal paper recommender which keeps you up-to-date with the most relevant progress in your field. Follow us and never miss a beat again!
Prof. @notredame.bsky.social. IEEE Computer Society PAMI TC Chair. Computer Vision Foundation CTO. Artificial Intelligence + Digital Humanities + History of Technology. wjscheirer.com
Artist working on and about A.I. https://evar.in
Research scientist at Google Deepmind
generative modeling, rl, birds, poetry, games, robots
📍 London 🔗 edouardleurent.com
Postdoc in Digital Narratives @ P1 in Copenhagen; PhD in Computer Vision; conceptual artist; tortured-philosopher; ex-poet
Niantic Spatial, Research.
Throws machine learning at traditional computer vision pipelines to see what sticks. Differentiates the non-differentiable.
📍 Europe 🔗 http://ebrach.github.io
Official Account for the European Conference on Computer Vision (ECCV) #ECCV2026, Malmö 🇸🇪 Hosted by @jbhaurum and @CSProfKGD
🔥🗺️ Pyrogeographer.
🛰️🛩️🔥 #RemoteSensing of #Wildfire.
❤️ Passionate about finding ways to help emergency responders.
📣 #SciComm
👩‍💻 https://www.kristaleewest.com/
📍 Colorado, USA
VP Geospatial @ Hexagon | geogeek | cyclist | outdoor enthusiast | curious thinker | continuous learner | Views my own. #geospatial
Assistant Professor Machine Learning & Remote Sensing at Wageningen University, NL
AI Researcher @ LGND AI // Core team @ Climate Change AI
https://konstantinklemmer.github.io
Professor for Data Science in Earth Observation @tumuenchen.bsky.social