Gabriele Goletto

@gabrigole.bsky.social

Research Scientist @ Microsoft. 👨‍💻 https://gabrielegoletto.github.io

525 Followers  |  284 Following  |  4 Posts  |  Joined: 19.11.2024

Latest posts by gabrigole.bsky.social on Bluesky

Yes please! The animations look really clear to me, so it would be a great learning resource with voiceover 🙏

09.05.2025 09:12 | 👍 2    🔁 0    💬 1    📌 0

Now on arXiv: our @cvprconference.bsky.social #CVPR2025 paper
Learning from Streaming Video with Orthogonal Gradients
Instead of shuffling clips, can we learn from videos fed sequentially, where you see a clip once, in order?
How to deal with the correlation of gradients over training?
1/3

10.04.2025 15:04 | 👍 17    🔁 2    💬 1    📌 0

But I like the (almost) bot-free conversations and there are some really good active accounts!

08.04.2025 05:32 | 👍 1    🔁 0    💬 0    📌 0

Check out Kosta's starter packs (go.bsky.app/M7HGC3Y); that's the fastest route. That said, unfortunately, the CV community here has become less active compared to a few months ago.

08.04.2025 05:27 | 👍 1    🔁 0    💬 1    📌 0

Image segmentation doesn't have to be rocket science. 🚀
Why build a rocket engine full of bolted-on subsystems when one elegant unit does the job? 💡
That's what we did for segmentation.
✅ Meet the Encoder-only Mask Transformer (EoMT): tue-mps.github.io/eomt (CVPR 2025)
(1/6)

31.03.2025 20:35 | 👍 8    🔁 4    💬 1    📌 1

Excited to release the first worldwide aerial image localization method (and demo!)
Take an aerial or satellite image from anywhere in the world, and AstroLoc can (probably) find its location, and provide a precise footprint!
Links to paper, demo and full-length (5 min) video ⬇️

14.02.2025 10:32 | 👍 9    🔁 1    💬 1    📌 0

🛑📢
HD-EPIC: A Highly-Detailed Egocentric Video Dataset
hd-epic.github.io
arxiv.org/abs/2502.04144
Newly collected videos
263 annotations/min: recipe, nutrition, actions, sounds, 3D object movement & fixture associations, masks.
26K VQA benchmark to challenge current VLMs
1/N

07.02.2025 11:45 | 👍 33    🔁 6    💬 2    📌 4

Now on arXiv
ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions
arxiv.org/abs/2412.01987
soczech.github.io/showhowto/
Given one real image & a variable sequence of text instructions, ShowHowTo generates a multi-step sequence of images *conditioned on the scene in the REAL image*
🧵

05.12.2024 15:01 | 👍 19    🔁 3    💬 1    📌 1

Hi Kosta, would love to be on this list as well 😊 I am working on egocentric video understanding

21.11.2024 10:21 | 👍 1    🔁 0    💬 0    📌 0

@gabrigole is following 20 prominent accounts