Karsten Kreis

@karstenkreis.bsky.social

Principal Research Scientist at NVIDIA | Former Physicist | Deep Generative Learning | https://karstenkreis.github.io/ Opinions are my own.

155 Followers  |  328 Following  |  17 Posts  |  Joined: 04.03.2025

Posts by Karsten Kreis (@karstenkreis.bsky.social)


Partially-latent flow matching enables sequence-structure codesign of large proteins and functional motif scaffolding.

@kdidi.bsky.social @machine.learning.bio @karstenkreis.bsky.social @arashv.bsky.social

arxiv.org/html/2507.09...

16.07.2025 19:05 — 👍 25    🔁 8    💬 1    📌 1

3⃣ Efficient Molecular Conformer Generation with SO(3) Averaged Flow-Matching and Reflow
openreview.net/forum?id=1B1...

⭐️ I'm also on a panel on synthetic data (synthetic-data-iclr.github.io)!

I'm excited to discuss research and to meet new and old friends and collaborators! 🎉

(5/n)

24.04.2025 08:21 — 👍 0    🔁 0    💬 0    📌 0

@gembioworkshop.bsky.social papers:

1⃣ EquiJump: Protein Dynamics Simulation via SO(3)-Equivariant Stochastic Interpolants
arxiv.org/abs/2410.09667 (oral)
(screenshot below)

2⃣ Hierarchical Protein Backbone Generation with Latent and Structure Diffusion
arxiv.org/abs/2504.09374

(4/n)

24.04.2025 08:21 — 👍 0    🔁 0    💬 1    📌 0

3⃣ Energy-Based Diffusion Language Models for Text Generation
arxiv.org/abs/2410.21357
Posters 2

4⃣ Truncated Consistency Models
arxiv.org/abs/2410.14895
Posters 4
(screenshot below)

(3/n)

24.04.2025 08:19 — 👍 0    🔁 0    💬 1    📌 0

Main track:

1⃣ Proteina: Scaling Flow-based Protein Structure Generative Models
research.nvidia.com/labs/genair/...
Orals 3B, posters 4
(video below)

2⃣ ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids
arxiv.org/abs/2503.05025
Orals 2C, posters 3

(2/n)

24.04.2025 08:18 — 👍 0    🔁 0    💬 1    📌 0

🔥 I'm at ICLR'25 in Singapore this week - happy to chat!

📜 With wonderful co-authors, I'm co-presenting 4 main conference papers and 3
@gembioworkshop.bsky.social papers (gembio.ai), and I contribute to a panel (synthetic-data-iclr.github.io).

🧵 Overview in thread.

(1/n)

24.04.2025 08:17 — 👍 3    🔁 1    💬 1    📌 0

🔥 ProtComposer (ICLR'25 Oral) is a Swiss Army knife:

(i) Manually create new protein structure layouts? ✅
(ii) Generation with favorable designability/diversity/novelty trade-offs? ✅
(iii) Spatially edit given proteins? ✅

Very original work by the amazing @hannes-stark.bsky.social and Bowen Jing! 🔥

11.03.2025 01:07 — 👍 7    🔁 2    💬 0    📌 0
Proteina: Scaling Flow-based Protein Structure Generative Models

🔸 Check out our project page (research.nvidia.com/labs/genair/...), our paper (arxiv.org/abs/2503.00710), and our code (github.com/NVIDIA-Digit...).

🔥 We released 8 sets of weights, for all experiments, for you to play with! 🔥

Enjoy! And see you at ICLR'25! 😀

(11/11)

04.03.2025 17:17 — 👍 2    🔁 0    💬 0    📌 0

🔸 Proteina is a fantastic collaboration with wonderful colleagues at NVIDIA:

🔥 Tomas Geffner*, @kdidi.bsky.social*, Zuobai Zhang*, Danny Reidenbach, Zhonglin Cao, @jyim.bsky.social, Mario Geiger, @machine.learning.bio, Emine Kucukbenli, @arashv.bsky.social, @karstenkreis.bsky.social* 🔥

(10/n)

04.03.2025 17:17 — 👍 7    🔁 1    💬 1    📌 0

🔸 We also demonstrate LoRA-based fine-tuning on a smaller set of high-quality protein structures from the PDB, and we show that autoguidance, where the model is guided by a weaker version of itself, can be used to boost designability. See our paper for details.

(9/n)

04.03.2025 17:13 — 👍 2    🔁 0    💬 1    📌 0
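Autoguidance extrapolates the main model's prediction away from that of a weaker version of itself. A minimal sketch of the sampling-time combination, assuming velocity predictions from both models; function names and the guidance weight are illustrative, not Proteina's actual API:

```python
import numpy as np

def autoguided_velocity(v_strong, v_weak, w=2.0):
    """Autoguidance: combine a strong model's velocity prediction with
    that of a weaker version of itself. w = 1 recovers the strong model
    alone; w > 1 extrapolates away from the weak model's errors, which
    can boost sample quality (here: designability)."""
    return v_weak + w * (v_strong - v_weak)

# Toy example: the guided velocity moves past the strong prediction,
# away from the weak one.
v_weak = np.array([1.0, 0.0])
v_strong = np.array([2.0, 0.0])
print(autoguided_velocity(v_strong, v_weak, w=2.0))  # [3. 0.]
```

At w = 1 this reduces to the strong model, so the guidance weight trades off fidelity to the strong model against correction of its shared failure modes.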

🔸 Protein structure generation performance is often measured in terms of designability, diversity, and novelty. Drawing inspiration from image generation, we explore three complementary metrics that analyze models at the distribution level, providing additional insights.

(8/n)

04.03.2025 17:13 — 👍 2    🔁 0    💬 1    📌 0
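The prototypical distribution-level metric from image generation is a Fréchet distance between Gaussians fit to feature embeddings of real and generated samples (FID). A rough numpy-only sketch of that construction, assuming some protein feature extractor has already produced the embeddings; this is generic FID-style code, not the paper's exact metrics:

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Squared Frechet distance between Gaussians fit to two feature
    sets of shape (n_samples, dim) -- the construction behind FID."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # tr sqrtm(cov_a @ cov_b) equals the sum of square roots of the
    # eigenvalues; a product of two PSD matrices has real, non-negative
    # eigenvalues, so we can avoid scipy's sqrtm.
    eigs = np.real(np.linalg.eigvals(cov_a @ cov_b))
    tr_sqrt = np.sqrt(np.clip(eigs, 0.0, None)).sum()
    return float(((mu_a - mu_b) ** 2).sum()
                 + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
x = rng.standard_normal((500, 8))
print(frechet_distance(x, x))            # ~0 for identical feature sets
print(frechet_distance(x, x + 3.0))      # large for a shifted distribution
```

Unlike per-sample designability, such a metric is sensitive to mode dropping: a model that generates only one designable fold still scores poorly.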

🔸 Proteina also outperforms previous models on motif scaffolding, where a functionally relevant motif is given and the model is tasked with generating a viable supporting scaffold. Below, we show quantitative evaluations for the benchmark introduced by RFdiffusion.

(7/n)

04.03.2025 17:12 — 👍 1    🔁 0    💬 1    📌 0

🔸 Quantitatively, Proteina achieves state-of-the-art designable and diverse protein backbone generation (unconditional or fold-class-conditional). Particularly at long lengths, it significantly outperforms previous models, which cannot generate proteins at this scale.

(6/n)

04.03.2025 17:12 — 👍 1    🔁 0    💬 1    📌 0

🔸 Proteina uses an efficient and scalable non-equivariant transformer network with up to 400M parameters. We minimize the use of computationally expensive and memory-intensive layers such as triangle attention, allowing Proteina to generate backbones of up to 800 residues.

(5/n)

04.03.2025 17:11 — 👍 1    🔁 0    💬 1    📌 0

🔸 The fold class conditioning provides fine-grained control during generation and allows us to guide with respect to high-level secondary-structure content or specific low-level fold classes. The method can also be used to increase the beta-sheet content in a controlled manner.

(4/n)

04.03.2025 17:11 — 👍 1    🔁 0    💬 1    📌 0

🔸 We train on synthetic datasets of up to 21M protein structures curated from the AlphaFold Database (left plot). Further, we condition Proteina on hierarchical CATH protein structure classification labels (right plot), with a tailored classifier-free guidance scheme.

(3/n)

04.03.2025 17:10 — 👍 4    🔁 1    💬 1    📌 0
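Classifier-free guidance trains one network that occasionally sees a null label in place of the condition, then blends its conditional and unconditional predictions at sampling time. A generic sketch under those standard conventions; the null-token handling, example labels, and guidance weight are illustrative, not the paper's tailored hierarchical scheme:

```python
import random

NULL_LABEL = "<null>"  # placeholder token meaning "no condition"

def drop_label(label, p_drop=0.1, rng=random):
    """Training side: randomly replace the fold-class label with the
    null token so a single network also learns the unconditional
    velocity field."""
    return NULL_LABEL if rng.random() < p_drop else label

def cfg_velocity(v_cond, v_uncond, w=1.5):
    """Sampling side: steer toward the condition. w = 0 is purely
    unconditional, w = 1 purely conditional, w > 1 extrapolates
    beyond the conditional prediction."""
    return v_uncond + w * (v_cond - v_uncond)

# Toy example with scalar "velocities":
print(cfg_velocity(2.0, 1.0, w=1.5))  # 2.5
```

With hierarchical labels, the same mechanism can be applied at different levels of the hierarchy, e.g. conditioning only on coarse secondary-structure class versus a specific fold.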

🔸 Proteina is a novel flow-based protein backbone generative model. It uses an alpha-carbon backbone representation, is trained with flow matching, relies on a scalable and efficient transformer network, and offers hierarchical fold class conditioning for enhanced control.

(2/n)

04.03.2025 17:10 — 👍 3    🔁 0    💬 1    📌 0
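Flow matching trains a network to regress a velocity field along interpolation paths from noise to data. A minimal conditional flow-matching step with linear paths, on toy flattened coordinates; this is the generic objective, not Proteina's training code:

```python
import numpy as np

def flow_matching_loss(model, x1, rng):
    """One conditional flow-matching objective with linear paths:
    x_t = (1 - t) * x0 + t * x1, noise x0 ~ N(0, I), and regression
    target v* = x1 - x0. Here x1 is a batch of data points (e.g.
    C-alpha coordinates flattened to shape (batch, dim))."""
    x0 = rng.standard_normal(x1.shape)          # noise sample per datum
    t = rng.uniform(size=(x1.shape[0], 1))      # random time in [0, 1)
    xt = (1.0 - t) * x0 + t * x1                # point on the linear path
    target = x1 - x0                            # constant velocity of the path
    pred = model(xt, t)                         # network's velocity estimate
    return float(np.mean((pred - target) ** 2))

# Toy check with a dummy "model" that always predicts zero velocity.
rng = np.random.default_rng(0)
x1 = rng.standard_normal((16, 30))  # e.g. 10 C-alpha atoms, flattened
dummy = lambda xt, t: np.zeros_like(xt)
print(flow_matching_loss(dummy, x1, rng))
```

At sampling time, the learned velocity field is integrated from t = 0 (noise) to t = 1, e.g. with simple Euler steps.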

📢📢 "Proteina: Scaling Flow-based Protein Structure Generative Models"

#ICLR2025 (Oral Presentation)

🔥 Project page: research.nvidia.com/labs/genair/...
📜 Paper: arxiv.org/abs/2503.00710
🛠️ Code and weights: github.com/NVIDIA-Digit...

🧵 Details in thread...

(1/n)

04.03.2025 17:09 — 👍 39    🔁 10    💬 1    📌 4