Jeroen Van Goey

@jeroen.vangoey.be

Staff Research Engineer in BioAI at InstaDeep (part of @biontech.bsky.social): machine learning for personalized cancer vaccines, de novo peptide sequencing and signal peptides. From Belgium 🇧🇪, currently living in Cape Town 🇿🇦. #bioML #TeamMassSpec

2,152 Followers  |  3,859 Following  |  30 Posts  |  Joined: 14.09.2023  |  1.9024

Latest posts by jeroen.vangoey.be on Bluesky

Zero-Shot De Novo Peptide Sequencing with Open Post-Translational Modification Discovery www.researchsquare.c...

---
#proteomics #prot-preprint

27.06.2025 11:00 - 👍 4    🔁 1    💬 1    📌 0

Xiang Zhang, Jiaqi Wei, Zijie Qiu, Sheng Xu, Nanqing Dong, Zhiqiang Gao, Siqi Sun: Curriculum Learning for Biological Sequence Prediction: The Case of De Novo Peptide Sequencing https://arxiv.org/abs/2506.13485 https://arxiv.org/pdf/2506.13485 https://arxiv.org/html/2506.13485

17.06.2025 09:25 - 👍 2    🔁 2    💬 0    📌 0

Curriculum Learning for Biological Sequence Prediction: The Case of De Novo Peptide Sequencing [new]
Curriculum Learning enhances non-autoregressive peptide sequencing using structured training, adjusting difficulty for refined predictions.

17.06.2025 11:59 - 👍 4    🔁 1    💬 0    📌 0

Data-independent immunopeptidomics discovery of low-abundant bacterial epitopes Mass spectrometry-based immunopeptidomics is a powerful approach to uncover peptides presented by human leukocyte antigen (HLA) molecules that can guide vaccine design and immunotherapies. While data-...

🚨 New preprint! 🚨
Presenting diaPASEF immunopeptidomics for bacterial epitope discovery.
💥 Showcasing DIA-NN immunopeptidomics using proteome-wide predicted HLA class I libraries!
🧬 Read more: biorxiv.org/cgi/content/...
#Immunopeptidomics #MassSpec #DIA #HLA

16.05.2025 07:50 - 👍 22    🔁 8    💬 0    📌 1

Data-independent immunopeptidomics discovery of low-abundant bacterial epitopes Mass spectrometry-based immunopeptidomics is a powerful approach to uncover peptides presented by human leukocyte antigen (HLA) molecules that can guide vaccine design and immunotherapies. While data-dependent acquisition (DDA) has been the standard for navigating through the complexity associated with non-enzymatic immunopeptide database searches, data-independent acquisition (DIA) is increasingly adopted in immunopeptidomics research. In this work, we compare diaPASEF to conventional ddaPASEF in terms of global immunopeptidome profiling and bacterial epitope discovery of the model intracellular pathogen Listeria monocytogenes. We show that DIA spectrum-centric workflows that search pseudo-MS/MS spectra complement DDA analysis by uncovering additional human and bacterial immunopeptides. Furthermore, we leveraged DIA-NN for generating and searching proteome-wide predicted HLA class I peptide spectral libraries, scoring approximately 150 million immunopeptide peptide precursors. This approach outperformed other spectrum-based methods in identification of MHC class I peptides and recovered low-abundant peptide precursors missed by other methods. Taken together, our results demonstrate how both DIA spectrum- and peptide-centric immunopeptidomics analysis are promising strategies to identify low-abundant immunopeptides.

(BioRxiv All) Data-independent immunopeptidomics discovery of low-abundant bacterial epitopes: Mass spectrometry-based immunopeptidomics is a powerful approach to uncover peptides presented by human leukocyte antigen (HLA) molecules that can guide vaccine design and… #BioRxiv #MassSpecRSS

16.05.2025 06:58 - 👍 7    🔁 1    💬 0    📌 0

InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments - Nature Machine Intelligence InstaNovo, a transformer-based model, and InstaNovo+, a multinomial diffusion model, enhance de novo peptide sequencing, enabling discovery of novel peptides, improved therapeutics sequencing coverage...

'We uncover novel biological findings across eight different datasets, including the identification of proteins in HeLa cells undetected by database search, the expansion of the immunopeptidomics dataset by 175% more peptides and the characterization of novel proteolytic cleavages.'

05.04.2025 09:59 - 👍 3    🔁 2    💬 0    📌 0

InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments - Nature Machine Intelligence #DL #AI #ML #DeepLearning #ArtificialIntelligence #MachineLearning
www.nature.com/articles/s42...

27.04.2025 04:00 - 👍 3    🔁 1    💬 0    📌 0

Open-Source and FAIR Research Software for Proteomics pubs.acs.org/doi/10....

---
#proteomics #prot-paper

24.04.2025 07:00 - 👍 6    🔁 8    💬 0    📌 0

Open-Source and FAIR Research Software for Proteomics Scientific discovery relies on innovative software as much as experimental methods, especially in proteomics, where computational tools are essential for mass spectrometer setup, data analysis, and in...

A review from the Journal of #Proteome #Research | Open-Source and #FAIR Research Software for #Proteomics | #Bioinformatics #OpenScience #OpenSource 🖥️ 🧪 🔓
⬇️
pubs.acs.org/doi/10.1021/...

24.04.2025 15:36 - 👍 10    🔁 3    💬 1    📌 0

Open-Source and FAIR Research Software for Proteomics Scientific discovery relies on innovative software as much as experimental methods, especially in proteomics, where computational tools are essential for mass spectrometer setup, data analysis, and interpretation. Since the introduction of SEQUEST, proteomics software has grown into a complex ecosystem of algorithms, predictive models, and workflows, but the field faces challenges, including the increasing complexity of mass spectrometry data, limited reproducibility due to proprietary software, and difficulties integrating with other omics disciplines. Closed-source, platform-specific tools exacerbate these issues by restricting innovation, creating inefficiencies, and imposing hidden costs on the community. Open-source software (OSS), aligned with the FAIR Principles (Findable, Accessible, Interoperable, Reusable), offers a solution by promoting transparency, reproducibility, and community-driven development, which fosters collaboration and continuous improvement. In this manuscript, we explore the role of OSS in computational proteomics, its alignment with FAIR principles, and its potential to address challenges related to licensing, distribution, and standardization. Drawing on lessons from other omics fields, we present a vision for a future where OSS and FAIR principles underpin a transparent, accessible, and innovative proteomics community.

Fantastic review with an unusual history, growing out of a passionate blog post by @willfondrie.com (willfondrie.com/2024/10/the-...), resulting from a storm (in our teacup) on X during @hupo-org.bsky.social 2024. Great teamwork, authors! pubs.acs.org/doi/10.1021/...

24.04.2025 18:24 - 👍 16    🔁 11    💬 0    📌 1

GitHub - greenelab/lab-website-template: An easy-to-use, flexible website template for labs. An easy-to-use, flexible website template for labs. - greenelab/lab-website-template

Use a template like github.com/greenelab/la... and publish to GitHub Pages

20.04.2025 15:47 - 👍 1    🔁 0    💬 0    📌 0

The AI revolution comes to protein sequencing www.science.org/cont...

---
#proteomics #prot-article

20.04.2025 10:20 - 👍 6    🔁 3    💬 0    📌 0

AI Protein Sequencing Breakthrough: Decoding Peptides Faster! AI-powered protein sequencing: Instanovo uses advanced machine learning to decipher peptide structures, revolutionizing proteomics and biomedical research.

Are you looking for something cool to read this weekend? 😎 What about using AI to improve protein sequencing? It's an exciting one! 🧪🧬🖥️ plentyofroom.beehiiv.com/p/ai-protein...

18.04.2025 16:12 - 👍 9    🔁 2    💬 0    📌 1

New issue today! 🎉 I covered a great article out of @dtu.dk and @instadeep.bsky.social: can AI accelerate peptide discovery? 😎🧪🖥️🧬 The answer is yes, thanks to InstaNovo! Read everything and subscribe at 👉 plentyofroom.beehiiv.com/p/ai-protein... #AI #machinelearning #biotech #proteinscience

17.04.2025 15:41 - 👍 6    🔁 1    💬 0    📌 1

2/
What’s new in InstaNovo v1.1?

🧬 +13.5% recall
🎯 +42.6% more exact PSMs
🔍 145% more peptides & 35% more proteins @ 5% FDR
🧪 Support for 7 PTMs: phosphorylation, deamidation, carbamylation, ammonia loss, oxidation, acetylation, carbamidomethylation

Paper: bit.ly/3XBeJFh

09.04.2025 15:58 - 👍 7    🔁 1    💬 1    📌 0

Introducing the next generation of InstaNovo models Since our InstaNovo paper is now published, we’d like to share an update on what we’ve been working on while our manuscript was under review. With the release of our preprint over a year ago, we were ...

1/
🚀 InstaNovo just got a major upgrade.

Our Nature Machine Intelligence paper with @instadeepai presents v0.1, but while the paper was under review, we kept building.

Now releasing InstaNovo v1.1, with major gains.
Blog: bit.ly/4lircYB

🧵

09.04.2025 15:58 - 👍 7    🔁 1    💬 1    📌 0

New AI Tool Unlocks Hidden Proteins in Sequencing Breakthrough Artificial intelligence (AI) has b...

https://www.hpcwire.com/2025/04/10/new-ai-tool-unlocks-hidden-proteins-in-sequencing-breakthrough/

#Short #Takes #AlphaFold #Casanovo #folding #InstaNovo […]

[Original post on hpcwire.com]

10.04.2025 13:56 - 👍 1    🔁 1    💬 0    📌 0

AI is helping scientists decode previously inscrutable proteins A new set of artificial intelligence models could make protein sequencing even more powerful for better understanding cell biology and diseases.

The AI models, called InstaNovo and InstaNovo+, are a step toward “the holy grail” of protein research: to unravel the genetic identity of previously unstudied proteins en masse

12.04.2025 13:10 - 👍 6    🔁 3    💬 0    📌 0

Pairwise Attention: Leveraging Mass Differences to Enhance De Novo Sequencing of Mass Spectra www.biorxiv.org/cont...

---
#proteomics #prot-preprint

04.04.2025 07:40 - 👍 4    🔁 1    💬 0    📌 0

OpenAI, Perplexity and German drugmaker BioNTech and its London-based AI subsidiary InstaDeep have recently launched their own AI research tools, while Google DeepMind’s AlphaFold has shown how the fast-developing technology can accelerate scientific research.

20.02.2025 00:22 - 👍 2    🔁 1    💬 0    📌 0
Citation File Format (CFF) Landing page for the Citation File Format (CFF), a YAML 1.2-based format for providing citation metadata for (research/scientific) software.

Seconding putting the code and requesting a DOI. You can also add a CITATION.cff file to your repository.
citation-file-format.github.io

Or put the code on zenodo.org, then you also get a DOI.

06.04.2025 17:04 - 👍 4    🔁 0    💬 1    📌 0
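
For anyone acting on the CITATION.cff suggestion above, here is a minimal sketch of what such a file could contain, generated from Python. The helper `build_citation_cff`, the project name, the author and the DOI are all placeholders made up for illustration; only the key names (`cff-version`, `message`, `title`, `authors`, `given-names`, `family-names`, `version`, `doi`) come from the CFF 1.2.0 format. Values are quoted naively, so text containing double quotes would need proper YAML escaping.

```python
def build_citation_cff(title, authors, version=None, doi=None):
    """Return the text of a minimal CITATION.cff file.

    `authors` is a list of (given_names, family_names) tuples.
    This helper is hypothetical, not part of any CFF tooling.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        "authors:",
    ]
    for given, family in authors:
        lines.append(f'  - given-names: "{given}"')
        lines.append(f'    family-names: "{family}"')
    # Optional keys are only written when a value is supplied.
    if version:
        lines.append(f'version: "{version}"')
    if doi:
        lines.append(f'doi: "{doi}"')
    return "\n".join(lines) + "\n"


cff_text = build_citation_cff(
    title="example-tool",            # placeholder project name
    authors=[("Ada", "Lovelace")],   # placeholder author
    version="0.1.0",
    doi="10.5281/zenodo.XXXXXXX",    # placeholder, not a real record
)
print(cff_text)
```

Saving `cff_text` as `CITATION.cff` in the repository root is the usual placement; the DOI line would be filled in once an archive such as Zenodo has issued one.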

Framework for De novo Sequencing of Peptide Mixtures via Network Analysis and Two-Dimensional Tandem Mass Spectrometry chemrxiv.org/engage/...

---
#proteomics #prot-preprint

17.03.2025 14:20 - 👍 2    🔁 1    💬 0    📌 0

DiNovo: high-coverage, high-confidence de novo peptide sequencing using mirror proteases and deep learning www.biorxiv.org/cont...

---
#proteomics #prot-preprint

25.03.2025 17:00 - 👍 3    🔁 2    💬 0    📌 0

DiNovo: high-coverage, high-confidence de novo peptide sequencing using mirror proteases and deep learning https://www.biorxiv.org/content/10.1101/2025.03.20.643920v1

25.03.2025 15:47 - 👍 4    🔁 2    💬 0    📌 0

Pairwise Attention: Leveraging Mass Differences to Enhance De Novo Sequencing of Mass Spectra [new]
Use mass differences to improve transformer networks for peptide sequencing by biasing the attention mechanism.

03.04.2025 20:35 - 👍 3    🔁 2    💬 0    📌 0

Google Colab has free GPUs (T4). InstaNovo now has a Hugging Face Space for online predictions! CPUs still work for Casa & Insta (just slower🐢)

03.04.2025 07:44 - 👍 3    🔁 1    💬 1    📌 0

#AI tools InstaNovo and InstaNovo+ decode hidden proteins missed by traditional methods, aiding #cancer research & understanding diseases.

These tools significantly outperform conventional protein detection, offering potential for new therapies.

Source: shorturl.at/6nGkc

#Health #Research

03.04.2025 09:14 - 👍 1    🔁 2    💬 0    📌 0

AI takes step towards cracking biology’s toughest problem – protein sequencing The team hopes the system will eventually be as influential as AlphaFold was for protein structure prediction

'‘InstaNovo … directly predicts the sequence from the spectrum, eliminating the need for database lookups (...) This is possible because our models have learned the underlying patterns of the sequences we are measuring and can translate a spectrum directly into the corresponding peptide sequences.’'

05.04.2025 09:57 - 👍 4    🔁 7    💬 1    📌 0

The AI revolution comes to protein sequencing New systems can identify unknown proteins in samples from diseased tissue, the environment, and archaeological sites

"Over the past 4 years, researchers have unveiled more than two dozen protein sequencing AIs. “It seems clear that this is where the field is going to go,” says William Noble, a proteomics AI developer at the University of Washington."

04.04.2025 08:03 - 👍 4    🔁 1    💬 0    📌 0

Considering how old the mass spec is in this picture, they likely need all the AI they can get to get good data out of that thing!

www.science.org/content/arti...

01.04.2025 12:14 - 👍 13    🔁 2    💬 3    📌 1

@jeroen.vangoey.be is following 20 prominent accounts