Jeremie Kalfon 👨‍💻🧬🤖🚀

@jkobject.com.bsky.social

Doing a Ph.D. AI in Bio. | Ex @WhiteLabGx @BroadInstitute @MIT | Built @PiPleteam | ML, Cancer, Genomics, Data Sci, Entrepreneur, FullStack Dev | All views are mine

682 Followers  |  3,257 Following  |  118 Posts  |  Joined: 06.11.2024  |  1.8765

Latest posts by jkobject.com on Bluesky

Of course, but you first need the info. Right now it is like driving a car blindfolded...

24.07.2025 14:06 | 👍 0    🔁 0    💬 0    📌 0

We then spend hundreds of billions treating what could have been avoided.

Why aren't we doing this by default?
2/2

24.07.2025 11:44 | 👍 0    🔁 0    💬 1    📌 0

In 2025, deciding to have a child without full genome sequencing of both parents is borderline reckless.

It costs under €400. Takes 2 minutes. Could save your child's life.

Yet >200 million people live with rare genetic diseases, many preventable by this simple test.
1/2

24.07.2025 11:44 | 👍 2    🔁 0    💬 1    📌 0

I am at ISMB/ECCB this week 🧬👨‍💻 Do reach out if you are in Liverpool and want to chat about AI, target discovery and disease modelling 😊

21.07.2025 11:47 | 👍 3    🔁 0    💬 0    📌 0

If you are at ICML, I would be happy to meet to talk about AI, bio, and drug discovery!

17.07.2025 17:08 | 👍 0    🔁 0    💬 0    📌 0

Lots of work remains: (1) We only show this ability on protein→cells. (2) We haven't used fine-tuning methods other than adapter layers for now.

I would love to talk about these ideas with people and will be at ICML in Vancouver and ISMB/ECCB in Liverpool! ✈️

20.06.2025 09:07 | 👍 0    🔁 0    💬 0    📌 0

We show that this helps the models learn faster, achieve better results on many test metrics and create better representations.

This is an early proof of concept toward this grand goal of modeling life across scales.

20.06.2025 09:06 | 👍 0    🔁 0    💬 1    📌 0

By using a common fine-tuning mechanism, we show how one can train from one scale to the next by back-propagating the signal to the compressed tokens and the lower-scale model.

20.06.2025 09:06 | 👍 0    🔁 0    💬 1    📌 0

By using cross-attention and an auto-encoding mechanism, we present XPressor, a framework that creates compressed tokens from one scale (e.g. proteins) that can be used as input tokens by the scale above (e.g. cells).

20.06.2025 09:06 | 👍 0    🔁 0    💬 1    📌 0
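
A minimal sketch of the compression idea in the post above, not the XPressor implementation: a small set of latent query vectors attends over the tokens of the lower scale, and the attention-weighted averages become the compressed tokens handed to the scale above. Everything here is an assumption for illustration: the function name `cross_attention_compress`, the random stand-in data, the sizes, and the absence of learned key/value projections or of any auto-encoding loss.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_compress(tokens, queries):
    """Compress len(tokens) input vectors into len(queries) output vectors.

    Each query attends over all input tokens with scaled dot-product
    attention. Keys and values are the raw tokens (hypothetical
    simplification: no learned projections).
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    compressed = []
    for q in queries:
        # One attention score per input token.
        scores = [scale * sum(qi * ti for qi, ti in zip(q, t)) for t in tokens]
        weights = softmax(scores)
        # Weighted average of the input tokens, coordinate by coordinate.
        compressed.append(
            [sum(w * t[d] for w, t in zip(weights, tokens)) for d in range(dim)]
        )
    return compressed


# Stand-in data: 20 "protein-scale" tokens compressed into 4 tokens
# that a "cell-scale" model could take as inputs.
rng = random.Random(0)
dim, n_inputs, n_latents = 8, 20, 4
protein_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_inputs)]
latent_queries = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_latents)]

cell_inputs = cross_attention_compress(protein_tokens, latent_queries)
print(len(cell_inputs), len(cell_inputs[0]))  # 4 8
```

In a trained model the latent queries would be learned parameters, and gradients from the higher-scale model would flow back through the attention weights into them, which is the back-propagation path the thread mentions.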

Each group of biologists is in its own niche, and so too are the models. But these models talk about different steps of the same staircase.

We present ideas on how we might end up training models from atoms to organs by using transformers to compress 🔺 🔻 data into tokens used by larger-scale models.

20.06.2025 09:06 | 👍 0    🔁 0    💬 1    📌 0

Very happy to share that my new paper got accepted at the ICML workshop on Foundation Models for Life Sciences!!
www.biorxiv.org/cont...

Foundation models are being trained on everything from atoms and molecules ⚛️ to molecule chains 🧬, entire cells 🦠, and even groups of cells across tissue slices 🫁

20.06.2025 09:06 | 👍 0    🔁 1    💬 1    📌 0
scPRINT: pre-training on 50 million cells allows robust gene network predictions
Nature Communications - Authors present a state-of-the-art cell foundation model trained on 50 million cells. They show that the model generates a meaningful gene network and has zero-shot...

Read more about scPRINT in our Nature Communications paper: www.nature.com/artic... 👓

I will be at ICML and ISMB/ECCB this year to chat about upcoming upgrades and AI in Biology! ✈️

17.06.2025 21:06 | 👍 0    🔁 0    💬 0    📌 0
scPRINT | v1.0 | Virtual Cells Platform
scPRINT is a cell foundation model, also called a Large Cell Model (LCM), trained on single-cell RNA sequence (scRNAseq) data from more than 50M human and mouse cells available through CZ CELLxGENE. Based on the transformer architecture, the model is fully open source and reproducible, with multiple checkpoint sizes available from 2M to 100M parameters. scPRINT demonstrated high performance for genome-wide cell-specific gene network inference when benchmarked against state-of-the-art models (e.g., scGPT, Geneformer v2, GENIE3). In addition, scPRINT has various zero-shot capabilities, including cell embedding, cell label prediction (e.g., cell type, sex, disease), and gene expression imputation, highlighting its potential as a versatile tool for single-cell analysis.

scPRINT is now finally on the Chan Zuckerberg Initiative's Model Hub! 🎉 🧬 🌈 It is one more way you can use this cell foundation model to embed, denoise, predict cell type, get gene networks from your data from scratch, or fine-tune it on your own application / use case: virtualcellmodels.cz...

17.06.2025 21:06 | 👍 4    🔁 1    💬 2    📌 0
scPRINT: Gene Network Inference from 50M Cells by Jérémie Kalfon - Genbio AI
TL;DR In this talk, Jérémie Kalfon presents his paper “scPRINT: pre-training on 50 million cells allows robust gene network predictions” at the Foundation Models for Biology Seminar Series by GenBio AI. He introduces scPRINT, a transformer-based foundation model trained on over 50 million single-cell RNA-seq profiles to infer gene networks. scPRINT enables scientists to predict

Thanks to @GenBioAI for letting me present my work on scPRINT and for summarizing it so expertly in their blog post:
genbio.ai/scprint-ge...

Watch the full presentation here: www.youtube.com/watc...

13.06.2025 07:21 | 👍 0    🔁 0    💬 0    📌 0
scDataset: Scalable Data Loading for Deep Learning on Large-Scale...
Modern single-cell datasets now comprise hundreds of millions of cells, presenting significant challenges for training deep learning models that require shuffled, memory-efficient data loading....

I see people want to compete with scDataLoader...

arxiv.org/abs/2506.0...

07.06.2025 18:31 | 👍 0    🔁 0    💬 0    📌 0

You often want to generate faithful organoids, but human embryonic developmental data is scarce; mostly mouse data is available.

Therefore, you need to map the organoid to the mouse data to determine if your organoid cells behave as expected.
2/2

07.06.2025 16:19 | 👍 0    🔁 0    💬 0    📌 0

The first convincing example I saw of why one would want to map across species:

www.biorxiv.org/cont...
1/2

07.06.2025 16:19 | 👍 0    🔁 0    💬 1    📌 0
MIA: Ellen Zhong, ML for reconstructing structural landscapes from cryoEM; Primer, Rishwanth Raghu
Models, Inference and Algorithms March 11, 2025 Broad Institute of MIT and Harvard Primer: Heterogeneous reconstruction in cryo-EM Rishwanth Raghu Princeton University Department of Computer Science Meeting: Machine learning for visualizing structural landscapes inside the cell Ellen Zhong Princ

Cool cryoEM+ML talk

It shows the future of the protein structure data modality and, by doing so, where structural models like AF3 will go next and how they might be trained.

www.youtube.com/watc...

07.06.2025 15:18 | 👍 1    🔁 0    💬 0    📌 0
scPRINT: pre-training on 50 million cells allows robust gene network predictions | Jérémie Kalfon
Portal is the home of the AI for drug discovery community. Join for more details on this talk and to connect with the speakers: https://portal.valencelabs.com/logg A cell is governed by the interaction of myriads of macromolecules. Such a network of interaction has remained an elusive milestone in

(I'm definitely not switching careers to become a YouTuber anytime soon.)

For those in the field looking for a deeper dive into the technical details, here's a more advanced version of the talk, recorded at LOGG: youtu.be/s9_DZz9E1To...
2/2

23.05.2025 08:54 | 👍 0    🔁 0    💬 0    📌 0

As part of my research, I believe that scientific outreach is essential. Last week, I had the pleasure of presenting how AI can help us understand biology and the cell at Pint of Science 2025!

I also put together a short video recap (in French) for those curious: youtu.be/fc8L8Dn_7tw...
1/2

23.05.2025 08:54 | 👍 0    🔁 0    💬 1    📌 0

Great to see my work featured on the CNRS website!

Read more at www.nature.com/articles/s41...

19.05.2025 08:18 | 👍 1    🔁 0    💬 0    📌 0

I feel that the Arc browser and The Browser Company are really an example of how a founder can get brain-wormed by crazy investors to the point of abandoning the only value proposition of their project and joining the mass of AI-branded companies with nothing to sell other than marketing. So sad..

23.04.2025 22:54 | 👍 0    🔁 0    💬 0    📌 0
Welcome! You are invited to join a meeting: scPRINT: A Transformer Model for Mapping Gene Networks in Single Cells. After registering, you will receive a confirmation email about joining the meeting.
Join Superbio and the researchers behind scPRINT, a large transformer model trained on over 50 million single cells. Designed to infer gene networks without supervision, scPRINT outperforms existing tools and shows strong performance in denoising, batch effect correction, and cell type prediction. We'll explore how this model was built, how it works, and what it reveals about aging and inflammation in prostate tissue. Perfect for researchers in single-cell biology, computational biology, and AI in life sciences.

Come hear and chat about scPRINT, a foundation model of single-cell data that can generate gene networks, annotate cell types and more! Tomorrow (hosted by @superbioai)
us06web.zoom.us/meet...

22.04.2025 17:20 | 👍 0    🔁 0    💬 0    📌 0

OpenAI's o3 model is magic. Can't wait for more tool use and faster reasoning.

21.04.2025 12:01 | 👍 0    🔁 0    💬 0    📌 0

In China, there are institutes devoted to using AI and sequencing technologies to advance traditional Chinese medicine 🤯

tcmx.tsinghua.edu.cn...

21.04.2025 12:00 | 👍 0    🔁 0    💬 0    📌 0

Thanks!

With our hierarchical classifier, the model achieves great accuracy when predicting over a thousand labels, such as cell types, age, sequencer, ethnicity, disease, and sex :)

I would be happy to see how people could extend it further!

17.04.2025 16:34 | 👍 1    🔁 0    💬 0    📌 0

@jkobject.com is following 20 prominent accounts