
@tommiekerssies.bsky.social

8 Followers  |  4 Following  |  6 Posts  |  Joined: 31.03.2025  |  1.3683

Latest posts by tommiekerssies.bsky.social on Bluesky

Built by:
👨‍🔬 Tommie Kerssies, Niccolò Cavagnero, Alexander Hermans, Narges Norouzi, Giuseppe Averta, Bastian Leibe, Gijs Dubbelman, Daan de Geus
📍 TU Eindhoven, Polytechnic of Turin, RWTH Aachen University
#ComputerVision #DeepLearning #ViT #ImageSegmentation #EoMT #CVPR2025
(6/6)

31.03.2025 20:35
Your ViT is Secretly an Image Segmentation Model (CVPR 2025)
EoMT shows ViTs can segment efficiently and effectively without adapters or decoders.

Segmentation, simplified.
We're excited to see what you build on top of it. 🛠️
🌐 Project: tue-mps.github.io/eomt
📝 Paper: arxiv.org/abs/2503.19108
💻 Code: github.com/tue-mps/eomt
🤗 Models: huggingface.co/tue-mps
(5/6)

31.03.2025 20:35

Why does EoMT work?
Large ViTs pre-trained on rich visual data (like DINOv2 🦖) can learn the inductive biases needed for segmentation, with no extra components required.
✅ EoMT removes the clutter and lets the ViT do it all.
(4/6)

31.03.2025 20:35

How fast can segmentation get while still maintaining accuracy?
✅ EoMT achieves an optimal trade-off between accuracy (PQ) 📊 and speed (FPS) ⚡ on COCO, thanks to its simple encoder-only design.
❌ No complex additional components.
❌ No bottlenecks.
🚀 Just performance.
(3/6)

31.03.2025 20:35

How do modern segmentation models work?
🚫 They chain together complex components:
ViT → Adapter → Pixel Decoder → Transformer Decoder…
✅ EoMT removes them all.
It keeps only the ViT and adds a few query tokens that guide it to predict masks. No decoder needed.
(2/6)
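
To make the idea above concrete, here is a toy sketch in plain Python. It is not the EoMT code (that lives at github.com/tue-mps/eomt). It assumes a stand-in "encoder block" with no learned weights, random tokens, and mask logits computed as a dot product between each query token and each patch token. The names `self_attention` and `encoder_only_masks` are hypothetical and only illustrate how query tokens can be processed jointly with patch tokens inside the encoder.

```python
import math
import random


def self_attention(tokens):
    """Toy single-head self-attention with identity projections and a residual.

    Assumption: a real ViT block has learned projections, an MLP, and
    normalization; they are omitted here to keep the sketch short.
    """
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        mixed = [
            sum(wt * v[d] for wt, v in zip(weights, tokens)) / total
            for d in range(dim)
        ]
        out.append([x + m for x, m in zip(q, mixed)])
    return out


def encoder_only_masks(patch_tokens, query_tokens, grid_hw, num_joint_blocks=2):
    """Run queries and patches through the same blocks, then read out masks."""
    # Queries are simply concatenated with the patch tokens: no separate decoder.
    tokens = query_tokens + patch_tokens
    for _ in range(num_joint_blocks):
        tokens = self_attention(tokens)

    n_q = len(query_tokens)
    queries, patches = tokens[:n_q], tokens[n_q:]

    h, w = grid_hw
    masks = []
    for q in queries:
        # One logit per patch: similarity between this query and that patch.
        logits = [sum(a * b for a, b in zip(q, p)) for p in patches]
        masks.append([logits[r * w:(r + 1) * w] for r in range(h)])
    return masks


random.seed(0)
dim, h, w, n_q = 8, 4, 4, 3  # illustrative sizes, not values from the paper
patch_tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(h * w)]
query_tokens = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_q)]

masks = encoder_only_masks(patch_tokens, query_tokens, (h, w))
print(len(masks), len(masks[0]), len(masks[0][0]))  # one h x w logit map per query
```

Each query ends up with its own low-resolution mask logit map over the patch grid. A trained model would also predict a class per query and upscale the masks, which this sketch leaves out.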

31.03.2025 20:35

Image segmentation doesn't have to be rocket science. 🚀
Why build a rocket engine full of bolted-on subsystems when one elegant unit does the job? 💡
That's what we did for segmentation.
✅ Meet the Encoder-only Mask Transformer (EoMT): tue-mps.github.io/eomt (CVPR 2025)
(1/6)

31.03.2025 20:35
