New @biorxiv-synthbio.bsky.social on #Evo 🧵 introducing Evo 1.5 for semantic mining + SynGenome - an AI-generated genomics database #AI #synbio #LLM 🧬 @adititm.bsky.social @brianhie.bsky.social et al. @arcinstitute.org
22.12.2024 19:23
If you're interested in learning more or have any questions or feedback, definitely reach out! The preprint and a link to the PDF (since bioRxiv seems to be having some server issues) are linked below! N/N
www.biorxiv.org/content/10.1...
evodesign.org/Semantic_Min...
19.12.2024 18:54
This work was a massive collaborative effort with my amazing fellow graduate students Samuel King and Eric Nguyen! And of course, none of this would have happened without the incredible mentorship of @brianhie.bsky.social! Very fortunate to work with such inspiring scientists daily :) 13/N
19.12.2024 18:54
Ultimately, this study suggests that biological sequence models may be able to nontrivially generalize beyond known evolutionary space and that prompt engineering can be a valuable tool for steering generation towards desired functional outcomes. 12/N
19.12.2024 18:54
SynGenome
100 billion base pairs of AI-generated genomic sequence
SynGenome is publicly available at evodesign.org/syngenome/. You could use SynGenome to find diversified natural proteins, functionally characterize uncharacterized genes, or find highly divergent proteins with potentially conserved functions. We're excited to see what the community can find! 11/N
19.12.2024 18:54
To generate SynGenome, we used prompts derived from the genes encoding prokaryotic proteins in UniProt, reasoning that the resultant generations may be enriched for functions related to the proteins the prompts were derived from. 10/N
19.12.2024 18:54
Finally, to apply semantic mining to generate functional genes from across prokaryotic biology, we developed SynGenome, a database containing over 120 billion base pairs of synthetic DNA sequences. 9/N
19.12.2024 18:54
Despite this high diversity, 17% of the Acr designs we tested were functional. Additionally, many of our experimentally validated Acrs had low confidence AF3 structure predictions and two eluded significant structural or sequence characterization, making them akin to "de novo" genes (!) 8/N
19.12.2024 18:54
We then applied semantic mining to see if we could design new anti-CRISPR (Acr) proteins, a highly diverse group of proteins with limited sequence or structural conservation thought to sometimes emerge via de novo gene birth. 7/N
19.12.2024 18:54
Half of the Evo-designed antitoxins we experimentally tested were functional (!), with most possessing only remote homology to natural proteins and some appearing to neutralize diverse toxin classes. 6/N
19.12.2024 18:54
We then applied semantic mining to generate a multi-gene bacterial toxin-antitoxin (TA) system. Using context from known TA systems as prompts, we first designed and experimentally validated a toxin gene. This toxin gene then served as a prompt for Evo to generate new conjugate antitoxins. 5/N
19.12.2024 18:54
As an initial test, we first demonstrated that Evo 1.5, a new version of Evo with extended pretraining, was able to understand genomic context, showing that it could complete highly conserved genes and operons when prompted with only partial sequences. 4/N
19.12.2024 18:54
Taking inspiration from genome mining techniques using guilt-by-association, we hypothesized that by prompting Evo with a gene encoding a desired function, we could guide the model to generate a new gene with a related function. We term this approach "semantic mining." 3/N
19.12.2024 18:54
Just as words derive meaning from their context, DNA gains functional significance within the context of genes, operons, and genomes. In prokaryotes, genes with related functions are often grouped together in close proximity on the DNA sequence. 2/N
19.12.2024 18:54
Excited to have the first project of my PhD out!! By leveraging genomic language model Evo's ability to learn relationships across genes (i.e., "know a gene by the company it keeps"), we show that we can use prompt-engineering to generate highly divergent proteins with retained functionality. 🧵 1/N
19.12.2024 18:54
Senior editor @science.org. Molecular biology, #DNA, #RNA, #gene regulation, #epigenetics, nuclear biology, #chromatin biology, 3D #genome, #synbio, #CRISPR and gene editing, other bacterial immune systems, and #AI in all these
Undergrad research intern @riley-research.bsky.social and @balynzaro.bsky.social #glycotime #TeamMassSpec
https://www.ocf.berkeley.edu/~vrt
Group leader at University of Sheffield, photosynthesis, plant physiology, biochemistry and structural biology www.Sheffield.ac.uk/photosynthesis
Group leader at @cnb-csic.bsky.social. Systems biology, complex systems, and spatiotemporal phenomena in life.
All opinions and tweets my own, personal responsibility.
Podcast (Spanish): @lavacaesferica.bsky.social
ORCID: 0000-0001-6214-4083
IZTECH-Chemistry. I like to be a polymath (learning different subjects) and a polyglot (interested in learning different languages). Comments and opinions are my own; RT is not endorsement (he/him)
SynBio and stuff. Currently interested in cellular immunotherapies and AI/ML apps in Bio. Postdoc
Tech CEO, science podcaster, investor, runner, cyclist. Solving temperature control (www.grantinstruments.com). Interviewing biotech founders on "The Big Experiment" https://thebigexperiment.buzzsprout.com/2312519
Neural crest biologist interested in broad and deep questions about clockworks of nature.
Gene Levinson, Ph.D. | Evolutionary Theorist
Research: Updated Evolutionary Synthesis framework
Founder, CognitoSymbiosis LLC (Research)
cognitosymbiosis.com
#Evolution #Science #Research
🧪
Assistant Professor at Georgia Tech BME - Synthetic biology and tissue engineering expert
Post-doc in synthetic biology of lactic acid bacteria 🦠🧬
Ragon Institute
#Biosecurity, #Biodefense, #Biotech, Pandemic Preparedness
I will always be a biochemist first and foremost, though my job these days is to get robots to run science experiments.
Also, pie, skewers, coffee, and beer make the world go round.
Developmental/stem cell biologist, group leader at Stockholm University. I study embryonic tissue patterning and shaping.
Also, music and abstract strategy games!
Made and raised in Thailand, then set free to the open world.
Now a postdoc at MIT.
Industry scientist,
Mass spectrometry & proteomics enjoyer
Plant & Microbial Synthetic Biologist | Music Lover & Coffee Drinker | Assistant Professor University of Georgia Department of Plant Biology & Institute of Bioinformatics | https://www.dundaslab.com
I work on data at Arzeda the protein design company.