Gabriele Trivigno

@gabtriv.bsky.social

PhD in Computer Vision

36 Followers  |  184 Following  |  11 Posts  |  Joined: 27.11.2024  |  1.6103

Latest posts by gabtriv.bsky.social on Bluesky

📄 Read the full paper:
SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation
Now on arXiv → arxiv.org/abs/2505.21795

02.06.2025 18:08 - 👍 0    🔁 0    💬 0    📌 0

🔹 4/4 – Promptable segmentation in action: SANSA reduces reliance on costly pixel-level masks by supporting point, box, and scribble prompts,
📈 enabling fast, scalable annotation with minimal supervision.
See the qualitative results 👇

02.06.2025 18:08 - 👍 0    🔁 0    💬 1    📌 0

🔹 3/4 – SANSA achieves state-of-the-art results in few-shot segmentation. We outperform specialist and foundation-based methods across various benchmarks:
📈 +9.3% mIoU on LVIS-92i
⚡ 3× faster than prior works
💡 Only 234M parameters (4-5× smaller than competitors)

02.06.2025 18:08 - 👍 0    🔁 0    💬 1    📌 0

🔹 2/4 – Unlocking semantic structure

SAM2 features are rich, but optimized for tracking.
🧠 Insert bottleneck adapters into frozen SAM2
📉 These restructure feature space to disentangle semantics
📈 Result: features cluster semantically, even for unseen classes (see PCA 👇)

02.06.2025 18:08 - 👍 0    🔁 0    💬 1    📌 0

🚀 As #CVPR2025 week kicks off, meet SANSA: Semantically AligNed Segment Anything 2
We turn SAM2 into a semantic few-shot segmenter:
🧠 Unlocks latent semantics in frozen SAM2
✏️ Supports any prompt: fast and scalable annotation
📦 No extra encoders

📎 github.com/ClaudiaCutta...

02.06.2025 18:08 - 👍 0    🔁 0    💬 1    📌 0

#ICCV2025

11.05.2025 15:05 - 👍 4    🔁 1    💬 1    📌 0

I guess merging the events could also work 😂 I wonder whether cricket players would be better at computer vision than CV researchers are at cricket, or vice versa.

11.05.2025 15:47 - 👍 1    🔁 0    💬 0    📌 0

To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition

Davide Sferrazza, @berton-gabri.bsky.social, Gabriele Trivigno, Carlo Masone
tl;dr: on simple datasets, global descriptors are now often better than local feature matching methods.
arxiv.org/abs/2504.06116

11.04.2025 11:38 - 👍 11    🔁 2    💬 0    📌 0

✨ SAMWISE achieves state-of-the-art performance across multiple #RVOS benchmarks while being the smallest model in RVOS! 🎯 It also sets a new #SOTA in image-level referring #segmentation. With only 4.9M trainable parameters, it runs #online and requires no fine-tuning of SAM2 🚀

10.04.2025 18:11 - 👍 0    🔁 0    💬 0    📌 0

🚀 Contributions:
🔹 Textual Prompts for SAM2: Early fusion of visual-text cues via a novel adapter
🔹 Temporal Modeling: Essential for video understanding, beyond frame-by-frame object tracking
🔹 Tracking Bias: Correcting tracking bias in SAM2 for text-aligned object discovery

10.04.2025 18:09 - 👍 0    🔁 0    💬 1    📌 0

🔥 Our paper SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation is accepted as a #Highlight at #CVPR2025! 🎉
We make #SegmentAnything wiser, enabling it to understand textual prompts while training only 4.9M parameters! 🧠
💻 Code, models & demo: github.com/ClaudiaCutta...

Why SAMWISE? 👇

10.04.2025 18:07 - 👍 2    🔁 0    💬 1    📌 0

To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition

Davide Sferrazza, @berton-gabri.bsky.social, @gabtriv.bsky.social, Carlo Masone

tl;dr: VPR datasets saturate; re-ranking not good; image matching -> uncertainty -> inlier counts -> confidence

arxiv.org/abs/2504.06116

09.04.2025 03:35 - 👍 5    🔁 2    💬 0    📌 0

🚀 Paper Release! 🚀
Curious about image retrieval and contrastive learning? We present:

📄 "All You Need to Know About Training Image Retrieval Models"
🔍 The most comprehensive retrieval benchmark: thousands of experiments across 4 datasets, dozens of losses, batch sizes, LRs, data labeling, and more!

18.03.2025 22:41 - 👍 40    🔁 10    💬 2    📌 0

Trying to convince my Bluesky feed to put me in the Computer Vision community. Right now I only see posts about the orange-haired president. @berton-gabri.bsky.social @gabrigole.bsky.social how did you do it?

07.04.2025 09:43 - 👍 0    🔁 0    💬 1    📌 0

Image segmentation doesn’t have to be rocket science. 🚀
Why build a rocket engine full of bolted-on subsystems when one elegant unit does the job? 💡
That’s what we did for segmentation.
✅ Meet the Encoder-only Mask Transformer (EoMT): tue-mps.github.io/eomt (CVPR 2025)
(1/6)

31.03.2025 20:35 - 👍 8    🔁 4    💬 1    📌 1

Went outside today and thought this would be perfect for my first #bluesky post

05.04.2025 08:39 - 👍 1    🔁 0    💬 0    📌 0

@gabtriv is following 20 prominent accounts