
Christoffer Koo Øhrstrøm

@chrisohrstrom.bsky.social

PhD student at DTU πŸ‡©πŸ‡° Doing research at the intersection of deep learning, event cameras/neuromorphic vision, multi-modal models, and robotics. https://chrisohrstrom.github.io/

159 Followers  |  438 Following  |  19 Posts  |  Joined: 26.11.2024

Posts by Christoffer Koo Øhrstrøm (@chrisohrstrom.bsky.social)

Looking forward to hearing it 🀞 Happy to help if there's more you need.

04.02.2026 19:58 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Always happy to compare to good and interesting work :)

04.02.2026 19:12 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Our experiments use absolute coordinates. It would probably work about the same with normalized coordinates, though I reckon it would likely require some fiddling with the initialization range of the projection matrix (W_p) if you prefer normalized.

04.02.2026 19:11 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
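A minimal sketch of the point above, assuming a linear projection of token positions. The names (`W_p`, `init_range`) and the patch-grid setup are illustrative, not taken from the paper's code: normalizing positions shrinks their scale, so the projection's initialization range must widen by the same factor to keep the projected values comparable.

```python
import numpy as np

H, W = 224, 224
rng = np.random.default_rng(0)

# Absolute 2D positions of tokens on a 14x14 patch grid (stride 16).
ys, xs = np.meshgrid(np.arange(0, H, 16), np.arange(0, W, 16), indexing="ij")
pos_abs = np.stack([ys.ravel(), xs.ravel()], axis=-1).astype(np.float64)

# Normalized positions in [0, 1): same information, much smaller scale.
pos_norm = pos_abs / np.array([H, W])

# A projection initialized for absolute coordinates yields tiny values on
# normalized ones; scaling the init range by the image size compensates.
init_range = 0.02  # illustrative choice
W_p = rng.uniform(-init_range, init_range, size=(2, 8))
W_p_norm = W_p * np.array([[H], [W]])  # undo the division by [H, W] above

assert np.allclose(pos_abs @ W_p, pos_norm @ W_p_norm)
```

The assertion holds because the per-axis scaling of the positions and of the projection rows cancel exactly; with a learned `W_p` the two setups are equivalent up to initialization.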

Thanks! I'm not too familiar with those tasks, but no, I don't think it would be hard to test, and it would be quite interesting to do. Our code is available and the implementation is plug-and-play with standard attention. You only need to give it the nD position of each token.

04.02.2026 12:17 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
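The "plug-and-play with standard attention" interface described above can be sketched as follows. This is a hypothetical usage shape, not the released API: the position term is shown as a generic additive bias (`pos_bias_fn`), and the toy distance bias stands in for PaPE's actual parabola-based formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_positions(q, k, v, pos, pos_bias_fn):
    """Scaled dot-product attention plus a position-dependent term.

    q, k, v: (n, d) token features; pos: (n, p) nD position per token.
    pos_bias_fn maps the positions to an (n, n) additive score bias.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores = scores + pos_bias_fn(pos)  # the only position-aware part
    return softmax(scores, axis=-1) @ v

# Example: tokens with 3D positions (e.g. point-cloud patches) and a toy
# bias that makes nearby tokens attend to each other more.
rng = np.random.default_rng(0)
n, d, p = 16, 32, 3
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
pos = rng.uniform(0.0, 1.0, size=(n, p))  # per-token nD position

def toy_bias(pos):
    diff = pos[:, None, :] - pos[None, :, :]
    return -np.linalg.norm(diff, axis=-1)

out = attention_with_positions(q, k, v, pos, toy_bias)
assert out.shape == (n, d)
```

The point of the interface is that nothing besides `pos` and the bias function changes relative to vanilla attention, which is what makes swapping the encoding into an existing model straightforward.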
Parabolic Position Encoding Where to Attend: A Principled Vision-Centric Position Encoding with Parabolas

And here it is. Maybe something along the lines you were thinking of? Designed directly for vision, tested on 2D, 2D-T, 3D, and multi-modal, and it extrapolates very well.

Paper: arxiv.org/abs/2602.01418
Website: chrisohrstrom.github.io/parabolic-po...
Code: github.com/DTU-PAS/para...

04.02.2026 09:03 β€” πŸ‘ 5    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0
Preview
Where to Attend: A Principled Vision-Centric Position Encoding with Parabolas We propose Parabolic Position Encoding (PaPE), a parabola-based position encoding for vision modalities in attention-based architectures. Given a set of vision tokens, such as images, point clouds, vid...

Where to Attend: A Principled Vision-Centric Position Encoding with Parabolas

Paper: arxiv.org/abs/2602.01418
Website: chrisohrstrom.github.io/parabolic-po...
Code: github.com/DTU-PAS/para...

@rgring.bsky.social @lanalpa.bsky.social

04.02.2026 08:22 β€” πŸ‘ 7    πŸ” 1    πŸ’¬ 0    πŸ“Œ 0

What if position encodings were designed for vision from scratch? We introduce PaPEβ€”Parabolic Position Encoding. Outperforms RoPE on 7/8 datasets and extrapolates to higher resolutions without fine-tuning or position interpolation. Paper, code, and website in thread 🧡

04.02.2026 08:22 β€” πŸ‘ 36    πŸ” 7    πŸ’¬ 3    πŸ“Œ 0

Actually working on a principled encoding for 2D, 2D-T, and 3D. Coming soon in a couple of weeks ;)

12.01.2026 12:05 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Congratulations. You are now officially Danish.

08.11.2025 11:51 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Preview
GitHub - DTU-PAS/spiking-patches: Spiking Patches: Asynchronous, Sparse, and Efficient Tokens for Event Cameras

Thanks to my collaborators @rgring.bsky.social @lanalpa.bsky.social.

Try it out for yourself: github.com/DTU-PAS/spik...

03.11.2025 11:43 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Post image

We also get much smaller input sizes, with up to a 6.9x reduction over voxels and up to an 8.9x reduction over frames.

03.11.2025 11:43 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

Results are pretty good. Inference speedups are up to 3.4x over voxels for a point cloud network and up to 10.4x over frames for a Transformer.

This comes without sacrificing accuracy. We even outperform voxels and frames in most cases on gesture recognition and object detection.

03.11.2025 11:43 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

Spiking Patches works by creating a grid of patches and letting each patch act as a spiking neuron. A patch increases its potential whenever an event arrives within it, and a token is created every time a patch spikes (when its potential exceeds a threshold).

03.11.2025 11:43 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
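The mechanism described above can be sketched in a few lines. This is a minimal illustration, not the released implementation: the event format `(t, x, y)`, the patch size, and the fixed per-event potential increment are assumed for the example.

```python
from collections import defaultdict

def spiking_patch_tokens(events, patch_size=16, threshold=8):
    """Turn a time-ordered stream of (t, x, y) events into patch tokens.

    Each grid patch accumulates potential as events land inside it; when
    the potential reaches the threshold, the patch spikes: its buffered
    events are emitted as one token and the potential resets.
    """
    potential = defaultdict(int)
    buffer = defaultdict(list)
    tokens = []
    for t, x, y in events:
        patch = (x // patch_size, y // patch_size)
        potential[patch] += 1
        buffer[patch].append((t, x, y))
        if potential[patch] >= threshold:
            tokens.append((t, patch, buffer[patch]))  # emitted at spike time
            potential[patch] = 0
            buffer[patch] = []
    return tokens

# 20 events inside one patch with threshold 8 -> 2 tokens of 8 events each
# (the last 4 events stay buffered, awaiting the next spike).
events = [(t, t % 4, 3) for t in range(20)]
tokens = spiking_patch_tokens(events, patch_size=16, threshold=8)
assert len(tokens) == 2 and len(tokens[0][2]) == 8
```

Because tokens are emitted per patch and only when enough events arrive, the output stays asynchronous and spatially sparse, which is what makes it usable by GNNs, point-cloud networks, and Transformers alike.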

We achieve this through tokenization of events. Our tokenizer is called Spiking Patches.

Something cool is that tokens are compatible with GNNs, PCNs, and Transformers.

This is the first time that tokenization has been applied to events. We hope to encourage more work in this direction.

03.11.2025 11:43 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

What if we could represent events (from event cameras) in a way that preserves both asynchrony and spatial sparsity?

Excited to share our latest work, where we answer this question positively.

Spiking Patches: Asynchronous, Sparse, and Efficient Tokens for Event Cameras

Paper: arxiv.org/abs/2510.26614

03.11.2025 11:43 β€” πŸ‘ 6    πŸ” 2    πŸ’¬ 1    πŸ“Œ 2

How are external links to be understood? Is it okay, for example, to link to a video (not our own) showing examples of a concept that we describe as a preliminary?

13.10.2025 12:26 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Can Dynamic Neural Networks boost Computer Vision and Sensor Fusion?
We are very happy to share this awesome collection of papers on the topic!

08.01.2025 09:33 β€” πŸ‘ 6    πŸ” 2    πŸ’¬ 0    πŸ“Œ 0

True. Not much of an issue on small codebases. Mostly just feels better with a snappier formatter for those.

18.12.2024 19:59 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0
Ruff An extremely fast Python linter and code formatter, written in Rust.

black is great, but I prefer Ruff for its speed, and it is also a really nice linter. docs.astral.sh/ruff/

18.12.2024 07:17 β€” πŸ‘ 5    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
Post image

The inventors of flow matching have released a comprehensive guide covering the math & code of flow matching!

Also covers variants like non-Euclidean & discrete flow matching.

A PyTorch library is also released with this guide!

This looks like a very good read! πŸ”₯

arxiv: arxiv.org/abs/2412.06264

10.12.2024 08:35 β€” πŸ‘ 109    πŸ” 26    πŸ’¬ 1    πŸ“Œ 1
Preview
SteeredMarigold: Steering Diffusion Towards Depth Completion of Largely Incomplete Depth Maps Even if the depth maps captured by RGB-D sensors deployed in real environments are often characterized by large areas missing valid depth measurements, the vast majority of depth completion methods st...

Maybe this is it? arxiv.org/abs/2409.10202 @jakubgregorek.bsky.social

28.11.2024 09:29 β€” πŸ‘ 8    πŸ” 1    πŸ’¬ 1    πŸ“Œ 0