Gaia - Tatta Bio
We are building this infrastructure for the scientific community, and we invite feedback and collaboration from researchers at every stage. We are grateful to
the Moore Foundation for their generous support in making this project possible. Stay tuned for more updates!
www.tatta.bio/gaia
02.06.2025 16:23
Today's sequence data infrastructure is set up for failure in the age of AI.
Building an open and collaborative sequence platform for both Human and AI scientists.
At Tatta Bio, we have been thinking deeply about the sequence-to-function problem. We believe that before AI can power functional prediction, we first need to rethink how we curate, manage, and share sequence data. Here, we share our initial ideas on what we are building next:
02.06.2025 16:23
I am so grateful for all the support I received from my mentors, colleagues and collaborators over the years: @pgirguis.bsky.social, @sokrypton.org, @simrouxvirus.bsky.social, @alexjprobst.bsky.social, @annedekas.bsky.social
28.04.2025 14:57
It's been an incredible journey building Tatta Bio with @ancornman1.bsky.social to advance AI infrastructure for biology, and I will continue to further our mission as chief scientist.
28.04.2025 13:47
My lab will couple ML and high throughput experimentation to harness the remarkable functional diversity of microbial genomes. If you are excited about the intersection of AI and microbiology, please get in touch!
28.04.2025 13:47
It's official! I'm thrilled to announce that I will be joining MIT as an assistant professor in a shared appointment between Biology, EECS and the Schwarzman College of Computing this fall.
28.04.2025 13:47
Job Board | Notion
Overview
Tatta Bio is growing! We are hiring *two positions* in Business Development and Software Engineering to lead the development of AI-enabled scientific software for open science and biological sequence interpretation. Please check out the job postings at www.tatta.bio/careers and share widely!
24.03.2025 16:29
Our thoughts too! (stay tuned)
18.12.2024 22:37
As we improve Gaia Agent, we want to hear your feedback on the agent's predictions. If you have suggestions on how we can increase its capabilities, please reach out! This was a major collaborative effort with @cong-ml.bsky.social, @joshuakravitz.com, @nishantjha.org, @ancornman1.bsky.social, @Tatta Bio
17.12.2024 13:38
Gaia Agent: Context-Aware Functional Insights at Scale - Tatta Bio
An AI biologist discovers previously uncharacterized systems in the Mtb genome.
We tested Gaia Agent's capabilities on hypothetical genes in Mycobacterium tuberculosis (Mtb). In our blog, we detail our in silico validation of Gaia Agent-predicted membrane transporter and lanthipeptide biosynthesis loci that had remained uncharacterized despite decades of Mtb research. Read more:
17.12.2024 13:38
Like a human biologist, Gaia Agent considers sequence, structure and genomic context to *think* about functions of novel genes, drastically accelerating our ability to predict functions of billions of unannotated proteins across the tree of life.
17.12.2024 13:38
Can LLM agents discover novel protein functions? Introducing Gaia Agent: an AI biologist capable of reasoning across genomic contexts to predict functions of proteins! Gaia Agent is now integrated with Gaia Search at gaia.tatta.bio
17.12.2024 13:38
If you are at #NeurIPS2024 don't miss @ancornman1.bsky.social's talk on OMG/gLM2 at 9AM! @workshopmlsb.bsky.social East meeting room 11,12
15.12.2024 16:21
MIBiG 4.0: advancing biosynthetic gene cluster curation through global collaboration
Abstract. Specialized or secondary metabolites are small molecules of biological origin, often showing potent biological activities with applications in ag
Are you working on natural products? We've just released version 4.0 of the MIBiG data standard and repository! It now includes 3059 biosynthetic gene clusters, thanks to the combined efforts of 288 expert contributors. A thread: (1/8) academic.oup.com/nar/advance-...
10.12.2024 08:05
overview of results for PLAID!
1/🧬 Excited to share PLAID, our new approach for co-generating sequence and all-atom protein structures by sampling from the latent space of ESMFold. This requires only sequences during training, which unlocks more data and annotations:
bit.ly/plaid-proteins
🧵
06.12.2024 17:44
you can search for eukaryotic sequences too, and you might find interesting homology to microbial proteins! (the current database you search against is microbial)
23.11.2024 23:39
Our Big Fantastic Virus Database (BFVD) is now published in NAR! It contains protein structure predictions of major viral clades, enhanced by petabase-scale homology search, and it's explorable on the web.
bfvd.foldseek.com
bfvd.steineggerlab.workers.dev
academic.oup.com/nar/advance-...
23.11.2024 21:12
Great question, translation tables 11 and 4 should be covered, and we have seen translation table 15 being accounted for in some cases. @apcamargo.bsky.social
22.11.2024 16:16
and we are live on biorxiv! bsky.app/profile/bior...
21.11.2024 22:50
Thank you! We are building additional features (e.g. bookmarks, tags, comments), stay tuned for updates!
19.11.2024 19:59
Great suggestion -- noted!
19.11.2024 18:51
We cluster all protein embeddings across all 100 retrieved contexts, and then the top 5 most frequently occurring clusters are colored!
19.11.2024 18:13
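The coloring rule described in the post above can be sketched roughly as follows. This is a minimal illustration, not Gaia's implementation: it assumes every protein has already been assigned a cluster ID (the clustering method is not stated in the post), and the palette, the default color, and the function names `color_top_clusters` and `color_for` are hypothetical.

```python
from collections import Counter

# Hypothetical palette and fallback color; the colors Gaia uses are not stated.
PALETTE = ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3", "#ff7f00"]
DEFAULT_COLOR = "#cccccc"

def color_top_clusters(contexts, top_n=5):
    """Assign a color to the top_n most frequent cluster IDs across all contexts.

    `contexts` is a list of genomic contexts, each a list of cluster IDs
    (one per protein). Cluster assignment itself is assumed to be done upstream.
    """
    counts = Counter(cid for context in contexts for cid in context)
    top = [cid for cid, _ in counts.most_common(top_n)]
    return {cid: PALETTE[i % len(PALETTE)] for i, cid in enumerate(top)}

def color_for(cluster_id, color_map):
    """Return the assigned color, or the default for clusters outside the top set."""
    return color_map.get(cluster_id, DEFAULT_COLOR)

# Toy example with three retrieved contexts instead of 100.
contexts = [
    ["c1", "c2", "c3"],
    ["c1", "c2", "c4"],
    ["c1", "c5", "c6"],
]
color_map = color_top_clusters(contexts, top_n=2)
```

In this toy input, `c1` and `c2` occur most often and receive the first two palette colors; every other cluster falls back to the default.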
This is a fantastic resource! Yes this is possible and we plan on expanding our database in the coming months, it would make sense to include AllTheBacteria
19.11.2024 15:39
A huge shoutout to nishantjha.bsky.social, Joshua Kravitz, Jacob West-Roberts, apcamargo.bsky.social, simrouxvirus.bsky.social & Andre Cornman for awesome teamwork, & a big TY to those who participated in user interviews. Gaia is in active development so please reach out with ideas or suggestions.
19.11.2024 15:07
Check out our preprint www.tatta.bio/gaia-paper for benchmarking results. In our manuscript, we showcase how Gaia can be used for annotating uncharacterized #phage proteins and discovering putative biosynthetic gene clusters!
19.11.2024 15:07
Gaia
To make this search maximally interpretable, we built a web application that integrates existing tools (HMMer, sequence alignments, ESMFold) with genomic context visualizations. Gaia is freely available at gaia.tatta.bio. Please share your feedback!
19.11.2024 15:07
Gaia searches the protein universe comprising 85M clusters across hundreds of thousands of microbial genomes. Embedding-based search takes ~0.2s per sequence, which is at least two orders of magnitude faster than BLASTp, allowing for real-time search.
19.11.2024 15:07
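As a rough illustration of what embedding-based search means here, the sketch below ranks database entries by cosine similarity to a query embedding. Everything in it is an assumption for illustration: the post does not state the similarity metric or index Gaia uses, the toy 3-dimensional vectors stand in for real protein embeddings, and a brute-force scan like this would not reach the quoted speed over 85M clusters, where an approximate nearest-neighbor index would typically be used instead.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def search(query, database, k=3):
    """Return the k database entries most similar to the query, best first."""
    scored = [(cosine(query, vec), name) for name, vec in database.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:k]]

# Hypothetical toy database: names and vectors are placeholders.
database = {
    "cluster_A": [1.0, 0.0, 0.0],
    "cluster_B": [0.0, 1.0, 0.0],
    "cluster_C": [0.9, 0.1, 0.0],
}
hits = search([1.0, 0.0, 0.0], database, k=2)
```

Here `hits` lists `cluster_A` first and `cluster_C` second, since those vectors point in nearly the same direction as the query.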
CEO of FutureHouse, building an AI Scientist
Researcher @hpi.bsky.social. AI for viral/microbial bioinformatics, bio/molecular design, biosecurity 𧬠Previously MIT CSAIL, Robert Koch Institute
Assistant Professor in CS + AI at USC. Previously at Stanford, CMU. Machine Learning, Decision Making, AI-for-Science, Generative AI, ML Systems, LLMs.
https://willieneis.github.io
So far I have not found the science, but the numbers keep on circling me.
Views my own, unfortunately.
Safe and robust AI/ML, computational sustainability. Former President AAAI and IMLS. Distinguished Professor Emeritus, Oregon State University. https://web.engr.oregonstate.edu/~tgd/
AI for Science, deep generative models, inverse problems. Professor of AI and deep learning @universitedeliege.bsky.social. Previously @CERN, @nyuniversity. https://glouppe.github.io
Parent, spouse, Australian, Professor of Machine Learning in Oxford. Long Covid, trans rights, music, reggae on Fridays, AI must be good for humans, https://www.robots.ox.ac.uk/~mosb
Recently a principal scientist at Google DeepMind. Joining Anthropic. Most (in)famous for inventing diffusion models. AI + physics + neuroscience + dynamical systems.
Assistant Professor at Duke in Biomedical Engineering (@dukeubme.bsky.social) and Biostatistics & Bioinformatics. Research focus on digital biomarker development. All views are my own.
Biomedical Informatics PhD • CITRIS Health @UC Berkeley • FAMIA • Focusing on Informatics and AI in medicine • Missoula MT
https://citris-uc.org/people/person/scott-mcgrath/
aneeshsathe.com
physician-scientist, interested in AI safety/interpretability in biology/medicine. jjanizek.github.io
Assistant Professor at Stanford. Trustworthy, deployable ML/NLP for healthcare.
Professor of Chemistry and Computer Science
University of Toronto
Faculty member, Vector Institute
Director, Acceleration Consortium
Senior Director of Quantum Chemistry NVIDIA
My views expressed here are personal and are not those of my employers.
Assistant Professor in Computer Science, McGill University /
Mila Quebec AI Institute. Co-Founder and Chair, Climate Change AI. MIT Tech Review "Innovator Under 35". he/him/his
Computational chemist at the University of Copenhagen. Editor-in-Chief PeerJ Physical Chemistry. #compchem
Ad Astra Fellow, Asst. Prof., School of Chemistry, @ucddublin.bsky.social |
Editor, @joss-openjournals.bsky.social |
Personal: espottesmith.github.io |
Research group (@coreacter.org): coreacter.org |
orcid.org/0000-0003-1554-197X |
All opinions mine
Chemistry professor at CMU. Connecting chemical sciences with AI #MachineLearning and automated experimentation. #tarheels fan. Care: #design, #photography #Ukraine #cats. Rants are mine