Congratulations to John Clarke, Michel Devoret and John Martinis on receiving the 2025 Nobel Prize in Physics!
www.nobelprize.org/prizes/physi...
I have fond memories of my time in the Clarke lab, where I did my Honors Thesis on ultra low-field MRI w/ SQUIDs as an undergrad at UC Berkeley!
07.10.2025 14:16
Check out the Research Highlight on our work in @naturemethods by Lin Tang!
www.nature.com/articles/s41...
19.09.2025 16:36
Richard Bonneau giving the last keynote on navigating the complexity of drug discovery and their lab-in-the-loop for molecule design! #MLCB
11.09.2025 17:40
First talk is a (surprise) keynote by Jacob Schreiber from UMass Medical, talking about fruit-themed AI tools for understanding and designing regulatory DNA
11.09.2025 13:44
2025 MLCB day 2 is starting now!
Streaming live now!
m.youtube.com/watch?v=PxlXNb…
11.09.2025 13:42
Now Barbara Engelhardt giving a keynote on characterizing behaviors of modified T cells in live cell imaging data using machine learning!
10.09.2025 17:58
Next talk by Courtney Shearer who is talking about genomic language models for zero shot promoter indel effects!
10.09.2025 15:15
Next talk by Alan Murphy and Masayuki (Moon) Nagai (from my lab!), who are talking about how naive fine-tuning of genomic DNNs leads to catastrophic forgetting, and proposing *iterative causal refinement* to move learned associations toward a causal understanding of cis-regulatory biology!
10.09.2025 14:53
Next talk by Johannes Linder at Calico. Talking about expanding genomic seq2fun DNNs with RBP binding and RNA processing data to consider post-transcriptional regulation.
10.09.2025 14:38
Some technical delays but we are all set!
First talk by Alexis Battle! @alexisbattle.bsky.social
10.09.2025 13:52
MLCB - Schedule
The in-person component will be held at the New York Genome Center, 101 6th Ave, New York, NY 10013. All times below are Eastern Time.
2025 Machine Learning in Computational Biology (#MLCB) meeting starts TODAY (9/10) at 9:30am (ET) at the NY Genome Center in NYC!
We have a great lineup of keynotes, contributed talks, and posters today and tomorrow
Schedule: mlcb.org/schedule
Join for free via livestream: m.youtube.com/@mlcbconf
10.09.2025 11:42
Here's another unpublished result:
We compared probing strategies to assess how informative the pretrained representations are, benchmarking Evo2 vs D3 on Drosophila enhancer activity measured via STARR-seq.
Again, D3 outperforms Evo2 (and one-hot) across all probing methods!
16.07.2025 12:17
But, when we trained D3 (score-entropy discrete diffusion for regulatory DNA) in an unsupervised manner on the genomic sequences, probing the representations of D3 was comparable to supervised SOTA (even with a basic CNN)! (100M parameters vs 40B parameters)
16.07.2025 12:17
*Easter egg alert* NOT in the published paper. We also benchmarked Evo 2, and while it did better than other gLMs (consistent with the idea that scale can improve gLMs), it still falls short of a basic CNN trained using one-hot sequences and far short of supervised SOTA
16.07.2025 12:16
Also, my perspective is coming from gLMs applied to human genomes. I think they have a lot of potential for small compact genomes that don't have as layered regulation as higher-order eukaryotes.
16.07.2025 12:15
gLMs show promise in learning structure in the genome, but we need to rethink how we either tokenize the genome (and no, byte-pair encoding isn't the answer either) or come up with a better masking strategy for the non-coding genome that is different from other regions (e.g. coding).
16.07.2025 12:15
Tokenizing nucleotides/k-mers and treating each token equally is like injecting lots of random words between every word in a sentence and hoping that an LLM will learn the structure of the English language.
16.07.2025 12:14
It's unclear whether standard NLP-based objectives (MLM or CLM) will bring us to the promised land.
Unlike proteins, which have conservation at the sequence and covariation levels, the non-coding genome is conserved at the functional level -- lots of drift and uninformative positions!
16.07.2025 12:14
There are many great applications for gLMs -- I'm not just a hater. The central dogma (or whatever is being sold) is not one of them.
In terms of non-coding genome regulation (outside of splice sites) in humans, there is a huge uphill battle.
16.07.2025 12:13
The constant propagation of pointless gLM benchmarks in the ML field (that are disconnected from how biologists will use them) is what is giving gLMs unwarranted hype. The field must break this cycle and rally around useful applications of gLMs.
16.07.2025 12:13
Our benchmark is far from complete! It shows how current gLMs struggle in zero-shot capabilities for cell-type specific regulation. Think about all the differential regulation across cell types being projected onto a single genome -- this is hard to learn w/o functional data!
16.07.2025 12:13
This went through 3 rounds of review in another journal, but 1 reviewer was adamant that this type of benchmark might be harmful to the burgeoning gLM field, which currently only benchmarks relative performance on (nearly) useless benchmarks in the non-coding regions. It was rejected!
16.07.2025 12:12
One thing that really bothers me with the new "virtual cell" terminology is that it is currently largely focused on a very narrow definition of models that can predict effects of trans perturbations (gene dosage, drugs etc) on gene expression. 1/
28.06.2025 10:38
Excited to launch our AlphaGenome API goo.gle/3ZPUeFX along with the preprint goo.gle/45AkUyc describing and evaluating our latest DNA sequence model powering the API. Looking forward to seeing how scientists use it! @googledeepmind
25.06.2025 14:29
This is a really exciting leap forward for genomic sequence-to-activity gene regulation models. It is a genuine improvement over pretty much all SOTA models spanning a wide range of regulatory, transcriptional and post-transcriptional processes. 1/
25.06.2025 16:18
Congrats @avsecz.bsky.social! Looking forward to exploring what it has learned! 🧬
25.06.2025 17:41
@pkoo562.bsky.social Peter Koo at #AIxBio
23.06.2025 11:07
scientist at UC Berkeley inventing advanced genomic technologies
lover of molecules, user of computers
https://scholar.google.com/citations?user=63ZRebIAAAAJ&hl=en
Postdoc at @uwgenome.bsky.social with @wnoble.bsky.social
PhD from UIUC
BA from @reed.edu
I do ML on all kinds of biological data
Also, maybe I'll share some of my photography? We'll see...
Our long-term research goal is to understand and predict gene regulation based on DNA sequence information and genome-wide experimental data.
PhD Student @ UW Genome Sciences
Scientist at IMP in Vienna. Excited about gene expression regulation and its encoding in our genomes - enhancers, transcription factors, co-factors, silencers, AI.
PhD student at Koo lab at CSHL
Assistant Professor, Cold Spring Harbor Laboratory
Systems immunology to understand thymus physiology and T cell development
@RingAScientist, @SkypeScientist
KMILOT Scholar @ UBC | Research Tech III @ de Boer lab | Incoming PhD in Biomedical Engineering
PhD student at Wellcome Sanger Institute associated with Haniffa, Parts and Saez-Rodriguez Labs
PhDing at the Sanger Institute, i'm evolving every day
isabellezane.bio
Discover the Languages of Biology
Build computational models to (help) solve biology? Join us! https://www.deboramarkslab.com
DM or mail me!
BME & CS @Duke | Protein Designer @ColumbiaMed
Director of Institute for Computational Genomic Medicine at Goethe University Frankfurt https://cgm.uni-frankfurt.de/
Computational biologist; Associate Prof. at University of Wisconsin-Madison; Jeanne M. Rowe Chair at Morgridge Institute
Doctoral Researcher at Helmholtz Munich | Data Science - CompBio - Genomics & RNA Biology
Postdoc with Xiaowei Zhuang; spatial transcriptomics and computational biology
Assistant Professor of Genetics @ Yale.
Studying how variation in cis-regulatory-elements impacts evolution, complex traits, and more!
http://reilly-lab.com
Computational Scientist at The Jackson Laboratory | ML and CRE design