Ancient amino acid sets enable stable protein folds
Early proteins likely arose from a chemically limited set of amino acids available through prebiotic chemistry, raising a central question in molecular evolution: could such primitive compositions yie...
Can proteins fold and function with half of the amino acid alphabet?
Using only 10 residues, we designed stable, mutation-resilient structuresโno aromatics or basics involved.
A minimalist foundation for ancient biology and synthetic design. tinyurl.com/37t8br4v
#ProteinDesign #OriginsOfLife
03.11.2025 16:48 โ ๐ 25 ๐ 11 ๐ฌ 1 ๐ 0
Mingchen replied to me on Twitter that it's also on bioRxiv now www.biorxiv.org/content/10.6...
23.01.2026 15:41 โ ๐ 2 ๐ 0 ๐ฌ 0 ๐ 0
Mirdita Lab - Laboratory for Computational Biology & Molecular Machine Learning
Mirdita Lab builds scalable bioinformatics methods.
My time in @martinsteinegger.bsky.social's group is ending, but Iโm staying in Korea to build a lab at Sungkyunkwan University School of Medicine. If you or someone you know is interested in molecular machine learning and open-source bioinformatics, please reach out. I am hiring!
mirdita.org
20.01.2026 11:07 โ ๐ 105 ๐ 54 ๐ฌ 7 ๐ 1
Know when to co-fold'em
This is the official web page for the James Fraser Lab at UCSF.
I'm really excited to break up the holiday relaxation time with a new preprint that benchmarks AlphaFold3 (AF3)/โco-foldingโ methods with 2 new stringent performance tests.
Thread below - but first some links:
A longer take:
fraserlab.com/2025/12/29/k...
Preprint:
www.biorxiv.org/content/10.6...
29.12.2025 22:25 โ ๐ 72 ๐ 30 ๐ฌ 5 ๐ 2
New preprint๐จ
Imagine (re)designing a protein via inverse folding. AF2 predicts the designed sequence to a structure with pLDDT 94 & you get 1.8 ร
RMSD to the input. Perfect design?
What if I told u that the structure has 4 solvent-exposed Trp and 3 Pro where a Gly should be?
Why to be wary๐งต๐
16.12.2025 15:15 โ ๐ 57 ๐ 21 ๐ฌ 4 ๐ 1
Thanks, I didn't realize Rogue Scholar minted DOIs
12.12.2025 20:22 โ ๐ 0 ๐ 0 ๐ฌ 0 ๐ 0
Use @prereview.bsky.social for preprints and something else for other manuscripts?
12.12.2025 16:49 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
What are good places to post an unsolicited manuscript peer review these days? I don't have a blog. I read manuscripts across arXiv, bioRxiv, ChemRxiv, OpenReview, random white papers, journals, etc. Do I dump it on Zenodo, post it here, and send it to the authors?
12.12.2025 16:49 โ ๐ 0 ๐ 0 ๐ฌ 2 ๐ 0
Our Assay2Mol manuscript was published at EMNLP 2025 doi.org/10.18653/v1/...
See the preprint thread below for a summary of the methodology, results, and code. We added more control experiments in this version related to protein sequence identity and generated molecule size.
21.11.2025 15:19 โ ๐ 0 ๐ 0 ๐ฌ 0 ๐ 0
@hkws.bsky.social and I are creating the Madison AI for Proteins (MAIP) group to discuss early-stage research at monthly meetups, share computational resources, and grow this local community. Visit mad-ai-proteins.github.io to sign up for announcements and watch for our 2026 events.
20.11.2025 16:29 โ ๐ 1 ๐ 0 ๐ฌ 0 ๐ 0
This looks like a fantastic resource to study human kinase signalling. So much MS instrument time.
19.11.2025 06:18 โ ๐ 13 ๐ 3 ๐ฌ 0 ๐ 0
Something fun and sciencey is coming soon to Madison
14.11.2025 17:20 โ ๐ 0 ๐ 0 ๐ฌ 0 ๐ 1
MPAC uses PARADIGM as the probabilistic model but makes many improvements:
- data-driven omic data discretization
- permutation testing to eliminate spurious predictions
- full workflow and downstream analyses in an R package
- Shiny app for interactive visualization
10.10.2025 14:56 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Overview of the MPAC workflow. MPAC calculates inferred pathway levels (IPLs) from real and permuted CNA and RNA data. It filters real IPLs using the permuted IPLs to remove spurious IPLs. Then, MPAC focuses on the largest pathway subset network with filtered IPLs to compute GO term enrichment, predict patient groups, and identify key group-specific proteins.
The journal version of our Multi-omic Pathway Analysis of Cells (MPAC) software is now out: doi.org/10.1093/bioi...
MPAC uses biological pathway graphs to model DNA copy number and gene expression changes and infer activity states of all pathway members.
10.10.2025 14:56 โ ๐ 2 ๐ 1 ๐ฌ 1 ๐ 0
Does anyone know whether there's a functioning API to ESMfold?
(api.esmatlas.com/foldSequence... gives me Service Temporarily Unavailable)
30.09.2025 14:11 โ ๐ 3 ๐ 1 ๐ฌ 2 ๐ 0
Fig. 6: Low-N GFP design.
We can use METL for low-N protein design. We trained METL on Rosetta simulations of GFP biophysical attributes and only 64 experimental examples of GFP brightness. It designed fluorescent 5 and 10 mutants, including some with mutants entirely outside training set mutations. 7/
11.09.2025 17:00 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Fig. 5: Function-specific simulations improve METL pretraining for GB1.
A powerful aspect of pretraining on biophysical simulations is that the simulations can be customized to match the protein function and experimental assay. Our expanded simulations of the GB1-IgG complex with Rosetta InterfaceAnalyzer improve METL predictions of GB1 binding. 6/
11.09.2025 17:00 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Fig. 3: Comparative performance across extrapolation tasks.
We also benchmark METL on four types of difficult extrapolation. For instance, positional extrapolation provides training data from some sequence positions and tests predictions at different sequence positions. Linear regression completely fails in this setting. 5/
11.09.2025 17:00 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Fig. 2: Comparative performance of Linear, Rosetta total score, EVE, RaSP, Linear-EVE, ESM-2, ProteinNPT, METL-Global and METL-Local across different training set sizes.
We compare these approaches on deep mutational scanning datasets with increasing training set sizes. Biophysical pretraining helps METL generalize well with small training sets. However, augmented linear regression with EVE scores is great on some of these assays. 4/
11.09.2025 17:00 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
METL models pretrained on Rosetta biophysical attributes learn different protein representations than general protein language models like ESM-2 or protein family-specific models like EVE. These new representations are valuable for machine learning-guided protein engineering. 3/
11.09.2025 17:00 โ ๐ 1 ๐ 0 ๐ฌ 1 ๐ 0
Most protein language models train on natural protein sequence data and use the underlying evolutionary signals to score sequence variants. Instead, METL trains on @rosettacommons.bsky.social data, learning from simulated biophyiscal attributes of the sequence variants we select. 2/
11.09.2025 17:00 โ ๐ 0 ๐ 0 ๐ฌ 1 ๐ 0
Biophysics-based protein language models for protein engineering - Nature Methods
Mutational effect transfer learning (METL) is a protein language model framework that unites machine learning and biophysical modeling. Transformer-based neural networks are pretrained on biophysical ...
The journal version of "Biophysics-based protein language models for protein engineering" with @philromero.bsky.social is live! Mutational Effect Transfer Learning (METL) is a protein language model trained on biophysical simulations that we use for protein engineering. 1/
doi.org/10.1038/s415...
11.09.2025 17:00 โ ๐ 13 ๐ 2 ๐ฌ 1 ๐ 0
Computational biologist & computer scientist. Teacher, mentor, spouse, parent. Getting through life one step at a time.
Thoughts are my own. She/her.
optimization, inverse problems, also proteins. ml at escalante. formerly: atomicai, xgenomes, broad, berkeley.
Open source #bioinformatics at Sungkyunkwan University ๐ฐ๐ท | former Steinegger Lab @ SNU, Sรถding Lab @ MPI-NAT | http://mstdn.science/@milotmirdita
Computational Structural Biologist
Assistant Professor @VanderbiltMPB
wankowiczlab.com
(she/her)
Past: UCSF, Dana-Farber, Broad Institute, UMass Amherst
Solve difficult biological problems using ML/AI.
Cancer Risk Assessment - โCatch Cancer Earlyโ.
Perturbation Biology - computational cell biology - design new combo therapies.
Protein design - function and structure.
Computational chemist/structural bioinformatician working on improving molecular simulation at MRC Laboratory of Molecular Biology. jgreener64.github.io
Asst. Prof. Uni Groningen ๐ณ๐ฑ
Comp & Exp Biochemist, Protein Engineer, 'Would-be designer' (F. Arnold) | SynBio | HT Screens & Selections | Nucleic Acid Enzymes | Biocatalysis | Rstats & Datavis
https://www.fuerstlab.com
https://orcid.org/0000-0001-7720-9
Professor of EECS and Statistics at UC Berkeley. Mathematical and computational biologist.
assistant professor at Princeton University interested in biological and chemical data
Associate professor at ETH Zurich, studying the cellular consequences of genetic variation. Affiliated with the Swiss Institute of Bioinformatics and a part of the LOOP Zurich.
Discover the Languages of Biology
Build computational models to (help) solve biology? Join us! https://www.deboramarkslab.com
DM or mail me!
Biologist that navigate in the oceans of diversity through space-time
Protein evolution, metagenomics, AI/ML/DL
Website https://miangoaren.github.io/
Director of Institute for Computational Genomic Medicine at Goethe University Frankfurt https://cgm.uni-frankfurt.de/
Studying gene regulation and transcription factor binding with machine learning. Assoc Prof at Penn State. ๐ฎ๐ช
Computational biologist. Faculty @DukeU. Co-founder http://martini.ai. Prev @MIT_CSAIL. Did quant investing for a while, before returning to research.
https://singhlab.net
Chair of the Department of Biomedical Informatics at the University of Colorado School of Medicine. Research: transcriptomics, machine learning, public data - pick two of three. He/him. Views mine, not employer's.
Computational biology, machine learning, AI, RNA, cancer genomics. My views are my own. https://www.morrislab.ai
He/him/his
Developing data intensive computational methods โข PI @ Seoul National University ๐ฐ๐ท โข #FirstGen โข he/him โข Hauptschรผler
Lab studying molecular evolution of proteins and viruses. Affiliated with Fred Hutch & HHMI.
Opinions are my own and do not reflect those of my employer.
https://jbloomlab.org/