@saramostafavi.bsky.social (@Genentech) & I (@Stanford) r excited to announce co-advised postdoc positions for candidates with deep expertise in ML for bio (especially sequence to function models, causal perturbational models & single cell models). See details below. Pls RT 1/
19.06.2025 20:55 β π 55 π 40 π¬ 1 π 3
Today was a big day for the lab. We had two back to back thesis defenses and the defenders defended with great science and character.
Congrats to DR. Kelly Cochran & DR. @soumyakundu.bsky.social on this momentous achievement.
Brilliant scientists with brilliant futures ahead. πππ
15.05.2025 05:19 β π 77 π 7 π¬ 2 π 0
Programmatic design and editing of cis-regulatory elements
The development of modern genome editing tools has enabled researchers to make such edits with high precision but has left unsolved the problem of designing these edits. As a solution, we propose Ledi...
Our preprint on designing and editing cis-regulatory elements using Ledidi is out! Ledidi turns *any* ML model (or set of models) into a designer of edits to DNA sequences that induce desired characteristics.
Preprint: www.biorxiv.org/content/10.1...
GitHub: github.com/jmschrei/led...
24.04.2025 12:59 β π 115 π 37 π¬ 2 π 3
Single cell β ENCODEHomo sapiens clickable body map
Very excited to announce that the single cell/nuc. RNA/ATAC/multi-ome resource from ENCODE4 is now officially public. This includes raw data, processed data, annotations and pseudobulk products. Covers many human & mouse tissues. 1/
www.encodeproject.org/single-cell/...
07.01.2025 21:29 β π 287 π 86 π¬ 6 π 0
Our ChromBPNet preprint out!
www.biorxiv.org/content/10.1...
Huge congrats to Anusri! This was quite a slog (for both of us) but we r very proud of this one! It is a long read but worth it IMHO. Methods r in the supp. materials. Bluetorial coming soon below 1/
25.12.2024 23:48 β π 231 π 89 π¬ 8 π 5
I think thatβll be interesting to look more into! The profile information does not convey overall accessibility since itβs normalized, but maybe this sort of multitasking could help.
14.12.2024 15:24 β π 1 π 0 π¬ 0 π 0
Thank you for the kind words! Yes, ChromBPNet uses unmodified models, which includes profile data and a bias model. However these evaluations use only the count head.
11.12.2024 06:14 β π 1 π 0 π¬ 1 π 0
Excited to announce DART-Eval, our latest work on benchmarking DNALMs! Catch us at #NeurIPS!
11.12.2024 02:30 β π 8 π 5 π¬ 0 π 1
New work! Come check out our poster tomorrow and take a look at the paper!
11.12.2024 02:30 β π 6 π 3 π¬ 0 π 0
(9/10) How do we train more effective DNALMs? Use better data and objectives:
β’ Nailing short-context tasks before long-context
β’ Data sampling to account for class imbalance
β’ Conditioning on cell type context
These strategies use external annotations, which are plentiful!
11.12.2024 02:30 β π 7 π 1 π¬ 1 π 0
(8/10) This indicates that DNALMs inconsistently learn functional DNA. We believe that the culprit is not architecture, but rather the sparse and imbalanced distribution of functional DNA elements.
Given their resource requirements, current DNALMs are a hard sell.
11.12.2024 02:30 β π 7 π 1 π¬ 1 π 0
(7/10) DNALMs struggle with more difficult tasks.
Furthermore, small models trained from scratch (<10M params) routinely outperform much larger DNALMs (>1B params), even after LoRA fine-tuning!
Our results on the hardest task - counterfactual variant effect prediction.
11.12.2024 02:30 β π 6 π 1 π¬ 3 π 0
(6/10) We introduce DART-Eval, a suite of five biologically informed DNALM evaluations focusing on transcriptional regulatory DNA ordered by increasing difficulty.
11.12.2024 02:30 β π 4 π 1 π¬ 1 π 0
(5/10) Rigorous evaluations of DNALMs, though critical, are lacking. Existing benchmarks:
β’ Focus on surrogate tasks tenuously related to practical use cases
β’ Suffer from inadequate controls and other dataset design flaws
β’ Compare against outdated or inappropriate baselines
11.12.2024 02:30 β π 2 π 0 π¬ 1 π 0
(4/10) An effective DNALM should:
β’ Learn representations that can accurately distinguish different types of functional DNA elements
β’ Serve as a foundation for downstream supervised models
β’ Outperform models trained from scratch
11.12.2024 02:30 β π 2 π 0 π¬ 1 π 0
(3/10) However, DNA is vastly different from text, being much more heterogeneous, imbalanced, and sparse. Imagine a blend of several different languages interspersed with a load of gibberish.
11.12.2024 02:30 β π 8 π 3 π¬ 1 π 0
(2/10) DNALMs are a new class of self-supervised models for DNA, inspired by the success of LLMs. These DNALMs are often pre-trained solely on genomic DNA without considering any external annotations.
11.12.2024 02:30 β π 3 π 0 π¬ 1 π 0
Scientist and medical doctor. Biology AI/ML methods, gene regulation, DNA sequence models, single cells. Doing a PhD in computational biology at @molgen.mpg.de.
Dr. Nemer Finotelo β CRM-SC 19776 / RQE 28115.
Titulado em Endocrinologia e Metabologia (RQE 28115).
PΓ³s-Graduado em Nutrologia.
PΓ³s-Graduado em Nutrologia Funcional .
PΓ³s-Graduado em Emagrecimento e Obesidade.
Agenda:
https://wa.me/message/PRK326R2OLGLD1
Industry Scientist in SD | Sequencing + Cells
π¦ Postdoc βͺ@jcvi.org
π§ PhD @utah.eduβ¬
Postdoctoral fellow, Huazhong University of Science and Technology #bioinformatics
FutureHouse Fellowship - Sergey Ovchinnikov and Aaron Schmidt labs! Previously MIT EECS PhD with Debora Marks. Talk to me about modeling viral/immune proteins! π¦
PhD student at Fraticelli lab @irbbarcelona.orgβ¬
I try to convert coffee into code and ideas for uncovering biological mysteries.
Some interests: single-cell, lineage tracing, computational biology, cellular variability, premalignancy, resistance...
Computational biologist / Bioinformatician / Baker
Interested in gene expression regulation, stats, and data visualization
Living in the limbo of automating processes on the computer and crafting stuff in the material world
https://jaimicore.github.io/
BioE PhD candidate at Stanford
Senior lecturer in Computational Biophysics @ University of Edinburgh molecular simulations, machine learning and baking enthusiasts. Occasional mathematics outreach
Core Investigator @ Arc Institute | Associate Professor @ UCSF | {Computational, Systems, Cancer, RNA} biologist | Co-founder @exaibio @vevo_ai
Assistant Professor at UC Berkeley and UCSF.
Machine Learning and AI for Healthcare. https://alaalab.berkeley.edu/
Encode AI for Science Fellow (Pillar VC / Imperial College London) Protein Design, PhD in machine learning for structural biology at UCL
Developing ML methods for Drug Discovery @ Novartis π§¬π» | prev: PhD student @ University of Edinburgh with Toni Mey
PhD candidate at MIT CSAIL. Generative models, protein design.
Ex: DeepMind, Microsoft, Instagram, Johns Hopkins University.
Website: https://people.csail.mit.edu/jyim/
X: https://x.com/json_yim
PhD student @EPFL π¨π
ML & computational biology π€π§¬βοΈ
Professor in bioinformatics, Stockholm University. Protein structure lover ( interactions predictions, evolution ..). Using machine learning as a part of AI for Sciences for halv my life. In addition to succΓ©n, I sometimes rants about sailing or skiing.
Computational biophysicist, et al. Opinions expressed are my own.
Also @giorginolab@mstdn.science on Mastodon.
Systems immunologist at Yale, @yalecsei.bsky.socialβ¬, @czbiohub.bsky.socialβ¬
Me: https://medicine.yale.edu/profile/john-tsang/
Lab: https://www.tsanglab.org
Center: https://medicine.yale.edu/systems-engineering-immunology/
Science is the best use of ML
PhD Student @CMUPittCompBio.bsky.social / @SCSatCMU.bsky.social
Interested in ML for science/Compuational drug discovery/AI-assisted scientific discovery π€
from π±π°π«Ά https://new.ramith.fyi