Gabriele Trivigno

@gabtriv.bsky.social

PhD in Computer Vision

35 Followers 184 Following 11 Posts Joined Nov 2024
9 months ago

📄 Read the full paper:
SANSA: Unleashing the Hidden Semantics in SAM2 for Few-Shot Segmentation
Now on arXiv → arxiv.org/abs/2505.21795

9 months ago

🔹 4/4 – Promptable segmentation in action
SANSA reduces reliance on costly pixel-level masks by supporting point, box, and scribble prompts,
📈 enabling fast, scalable annotation with minimal supervision.
See the qualitative results 👇

9 months ago

🔹 3/4 – SANSA achieves state-of-the-art in few-shot segmentation. We outperform specialist and foundation-based methods across various benchmarks:
📈 +9.3% mIoU on LVIS-92i
⚡ 3× faster than prior works
💡 Only 234M parameters (4-5× smaller than competitors)

9 months ago

🔹 2/4 – Unlocking semantic structure

SAM2 features are rich, but optimized for tracking.
🧠 Insert bottleneck adapters into frozen SAM2
📉 These restructure feature space to disentangle semantics
📈 Result: features cluster semantically, even for unseen classes (see PCA 👇)

9 months ago

🚀 As #CVPR2025 week kicks off, meet SANSA: Semantically AligNed Segment Anything 2
We turn SAM2 into a semantic few-shot segmenter:
🧠 Unlocks latent semantics in frozen SAM2
✏️ Supports any prompt: fast and scalable annotation
📦 No extra encoders

📎 github.com/ClaudiaCutta...

10 months ago

#ICCV2025

10 months ago

I guess merging the events could also work 😂 I wonder whether cricket players would be better at computer vision than CV researchers are at cricket, or vice versa

11 months ago

To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition

Davide Sferrazza, @berton-gabri.bsky.social, Gabriele Trivigno, Carlo Masone
tl;dr: global descriptors nowadays are often better than local feature matching methods for simple datasets.
arxiv.org/abs/2504.06116

11 months ago

✨ SAMWISE achieves state-of-the-art performance across multiple #RVOS benchmarks while being the smallest model in RVOS! 🎯 It also sets a new #SOTA in image-level referring #segmentation. With only 4.9M trainable parameters, it runs #online and requires no fine-tuning of SAM2 🚀

11 months ago

🚀 Contributions:
🔹 Textual Prompts for SAM2: Early fusion of visual-text cues via a novel adapter
🔹 Temporal Modeling: Essential for video understanding, beyond frame-by-frame object tracking
🔹 Tracking Bias: Correcting tracking bias in SAM2 for text-aligned object discovery

11 months ago

🔥 Our paper SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation is accepted as a #Highlight at #CVPR2025! 🎉
We make #SegmentAnything wiser, enabling it to understand textual prompts while training only 4.9M parameters! 🧠
💻 Code, models & demo: github.com/ClaudiaCutta...

Why SAMWISE? 👇

11 months ago

To Match or Not to Match: Revisiting Image Matching for Reliable Visual Place Recognition

Davide Sferrazza, @berton-gabri.bsky.social, @gabtriv.bsky.social, Carlo Masone

tl;dr: VPR datasets saturate; re-ranking not good; image matching -> uncertainty -> inlier counts -> confidence

arxiv.org/abs/2504.06116

11 months ago

🚀 Paper Release! 🚀
Curious about image retrieval and contrastive learning? We present:

📄 "All You Need to Know About Training Image Retrieval Models"
🔍 The most comprehensive retrieval benchmark: thousands of experiments across 4 datasets, dozens of losses, batch sizes, LRs, data labeling, and more!

11 months ago

Trying to convince my bluesky feed to put me in the Computer Vision community. Right now I only see posts about the orange-haired president. @berton-gabri.bsky.social @gabrigole.bsky.social how did you do it?

11 months ago

Image segmentation doesn’t have to be rocket science. 🚀
Why build a rocket engine full of bolted-on subsystems when one elegant unit does the job? 💡
That’s what we did for segmentation.
✅ Meet the Encoder-only Mask Transformer (EoMT): tue-mps.github.io/eomt (CVPR 2025)
(1/6)

11 months ago

Went outside today and thought this would be perfect for my first #bluesky post
