Learn more at biomlsociety.org/challenge
02.09.2025 14:45
@kosonocky.bsky.social
ML + Biochemistry PhD Candidate at UT Austin. BioML Society Founder. All problems are solvable, so let's solve some. biomlsociety.org
Huge shoutout to everyone who helped put on this event :)
Alex Abel, Aaron Feller, Amanda Cifuentes Rieffer, Phillip Woolley, Daryl Barth, Tynan Gardner, Wesley Wierson, Andrew Ellington, & Edward Marcotte (@edwardmarcotte.bsky.social)
We're hoping to elucidate and share the trends in method choices to help our community home in on standard practices for successful protein design :)
Stay tuned!
The full data release, scores, and analysis will be coming soon. We wish we could share all of the stats right now, but we think it's best to release everything at once when the publication hits preprint (hopefully in October!)
02.09.2025 14:45
Huge congrats to the winners! This is an absolutely mind-blowing feat and we can't believe that designs from this competition worked so well. Each of the four winners will be given a 3D print of their winning binder, some stylish LEAH Labs swag, and, of course, bragging rights!
02.09.2025 14:45
We're also giving a second award to the team with the highest success rate of submitted sequences in the Round 1 screen. This team was Nucleate UK with a whopping 38% hit rate, with the runners-up being BindingIllini (15%) and Furman Lab (11%)
02.09.2025 14:45
Each design in the top ten succeeded at some aspect of T cell biology, but a few were successful in all of them! The teams with the top three designs were the Perez Lab Gators, Amigo Acids, and the Schoeder Lab. Each of these teams will be given an award
02.09.2025 14:45
Round 2 (Individual Functional Assays)
The ten best-performing sequences from Round 1 moved on to a series of individual assays measuring many aspects of T cell biology, including cytotoxicity against CD20+ tumors, cytokine release, proliferation, and expansion
This approach hijacks a necessary but not sufficient behavior of functional T cells (proliferation) to identify which binders engaged the antigen, signaled through their CAR, and became overrepresented at the population level
02.09.2025 14:45
Round 1 (High-Throughput Screen)
The 12,000 CAR-T cell constructs were tested in a high-throughput pooled screen to measure how well the binders drove CAR-T cell proliferation in the presence of the target antigen CD20 compared to a control
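One common way to quantify "overrepresented compared to a control" in a pooled screen is a log2 fold change of each construct's read frequency between the two conditions. The competition's actual scoring has not been released, so the sketch below is only an illustration under that assumption; the function name, the pseudocount, and the binder IDs and counts are all hypothetical.

```python
import math

def log2_enrichment(antigen_counts, control_counts, pseudocount=1.0):
    """Log2 ratio of each binder's read frequency (antigen vs. control).

    Assumed scoring scheme for illustration only, not the competition's method.
    A pseudocount keeps binders with zero reads from producing log(0).
    """
    binders = set(antigen_counts) | set(control_counts)
    total_a = sum(antigen_counts.values()) + pseudocount * len(binders)
    total_c = sum(control_counts.values()) + pseudocount * len(binders)
    scores = {}
    for binder in binders:
        freq_a = (antigen_counts.get(binder, 0) + pseudocount) / total_a
        freq_c = (control_counts.get(binder, 0) + pseudocount) / total_c
        scores[binder] = math.log2(freq_a / freq_c)
    return scores

# Made-up read counts for two hypothetical binders
antigen = {"binder_A": 300, "binder_B": 100}
control = {"binder_A": 100, "binder_B": 300}
scores = log2_enrichment(antigen, control)
```

A positive score means the construct took up a larger share of the pool when CD20 was present; a negative score means it lost ground.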
The binders functioned in the context of CAR-T cells, serving as the binding domain that helps the CAR-T recognize the cancer antigen CD20. To validate them, we did two rounds of testing:
02.09.2025 14:45
This was only possible thanks to LEAH Labs' high-throughput T cell engineering screen, DNA synthesis from @twistbioscience.com, reagents from Lonza Group, ScaleReady, and VWR, and additional funding from the Center for Systems & Synthetic Biology at UT Austin
We had 28 teams with competitors from 40 different countries design and submit a total of *12,000* peptide binders. All of the designs and methods, both in silico and in vitro, will be made public, and all parties agreed that no IP will be claimed to help further open science
02.09.2025 14:45
The methods used by scientists from *all over the world* resulted in antigen binders that plausibly act as cancer therapeutics. We can't get over how cool this is!
02.09.2025 14:45
Rather than test for just binding affinity, we tested these binders directly in the therapeutic modality to see how well AI protein design tools work in a real-world scenario, one that could someday cut cell therapy prescription to days or weeks rather than years
02.09.2025 14:45
The Bits to Binders Competition has concluded! 🧬
One year ago we gathered scientists from around the world to design and submit protein binders that cause immune cells to target and eliminate CD20+ tumors
Spoiler: They work!
Perhaps this points to a need for non-anonymous online platforms with user ID verification and protocols to ensure content isn't AI-generated
We can still have our wild west anonymous internet, but IMO we also need a baseline of reality
On the other hand this policy *clearly* reinforces echo chambers. "Anyone or anything that conflicts with my worldview is fake"
17.05.2025 14:18
On one hand, this is good because it shifts the burden of responsibility away from actual humans and puts them in higher esteem than the background internet content. "Surely no human would say such a thing. Must be a bot or a human influenced by bots"
17.05.2025 14:18
For better or worse, I've subconsciously adopted a policy of "AI/fake/bot unless proven innocent" w/ regard to internet content from anyone I don't know personally
17.05.2025 14:18
Some recent used bookstore acquisitions of mine
Part of a few themes I'm trying to understand:
- cultures & life via classic lit
- basics of philosophy & history
- modern political climate & trajectory
- how can we make cities better?
- what will the future look like?
Effect of SMILES representation on structure prediction, now with a large sample size!
12.02.2025 17:35
That said, I'm very thankful to chaidiscovery for releasing their model for people to tinker with (it really is great overall). Open models allow us all to find the quirks, and in the end everyone benefits from having better tools
12.02.2025 17:34
Now you must note (!) that this is just looking at the *confidences* between the various structures. It is not looking at the actual ligand positioning; I did not check that here
Overall for most ligands there isn't a crazy difference, but the variability is, in fact, higher
And we see it more clearly in these plots.
Ideally, if there were no effect, the experimental conditions (plots 2 & 3) would hug the y=x line as tightly as the 1st plot. But we can see that some ligands just don't behave as well
Then compute the avg absolute difference between the groups:
- s42 to s43 difference serves as our control
- s42 to oechem, and s43 to oechem serve as our experimental condition
We see from these numbers that the difference between RDKit and OEChem is larger than the difference between the two RDKit seeds
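A minimal sketch of that comparison, assuming each condition is a list of per-ligand average ipTM values in the same ligand order. The variable names and the numbers are made up for illustration and are not the results from this experiment.

```python
from statistics import mean

def mean_abs_diff(a, b):
    """Mean absolute difference between two equal-length lists of per-ligand scores."""
    if len(a) != len(b):
        raise ValueError("score lists must be the same length")
    return mean(abs(x - y) for x, y in zip(a, b))

# Hypothetical per-ligand average ipTM values (illustrative only)
rdkit_s42 = [0.80, 0.65, 0.90, 0.50]
rdkit_s43 = [0.82, 0.63, 0.89, 0.52]
oechem_s42 = [0.78, 0.55, 0.91, 0.62]

# Control: same SMILES, different seed
control = mean_abs_diff(rdkit_s42, rdkit_s43)

# Experimental: different SMILES canonicalization, averaged over both RDKit runs
experimental = mean([
    mean_abs_diff(rdkit_s42, oechem_s42),
    mean_abs_diff(rdkit_s43, oechem_s42),
])
```

If the SMILES representation had no effect, `experimental` would be about the same size as `control`, since seed-to-seed noise would be the only source of difference.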
Three conditions:
1. RDKit canonicalized SMILES, seed 42
2. RDKit canonicalized SMILES, seed 43
3. OEChem canonicalized SMILES, seed 42
Checked this phenomenon over a larger sample size
Experiment:
- 148 ligands, same protein
- 5x samples per prediction
- avg ipTM calculated per structure (across the 5x samples)
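The per-structure averaging step can be sketched as below, assuming the sample scores are already collected into a dict keyed by ligand. The ligand IDs and scores are hypothetical placeholders.

```python
from statistics import mean

def average_iptm(samples_by_ligand):
    """Map each ligand ID to the mean ipTM over its sampled predictions."""
    return {ligand: mean(scores) for ligand, scores in samples_by_ligand.items()}

# Hypothetical ligand IDs with 5 sampled ipTM scores each (illustrative only)
samples = {
    "ligand_001": [0.81, 0.79, 0.80, 0.82, 0.78],
    "ligand_002": [0.55, 0.60, 0.52, 0.58, 0.50],
}
avg = average_iptm(samples)
```

Averaging over the samples first means one number per ligand per condition goes into the comparison, rather than every individual sample.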
Yeah that was confusing wording on my part. What I meant was ensuring that the model sees more than one SMILES representation. You could choose a random SMILES order / scheme for your dataset that doesn't change each epoch, or you could shuffle the rep for each molecule every epoch
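The two options can be sketched like this. To keep it dependency-free, the equivalent SMILES strings are hand-written here; in practice a cheminformatics toolkit would generate them, and that step is assumed rather than shown. The function names and the seeding scheme are hypothetical.

```python
import random

# Hand-written equivalent SMILES for two molecules (stand-in for toolkit output)
VARIANTS = {
    "ethanol": ["CCO", "OCC", "C(O)C"],
    "acetic_acid": ["CC(=O)O", "OC(C)=O", "O=C(O)C"],
}

def fixed_choice(variants, seed=0):
    """Option 1: pick one representation per molecule once and reuse it every epoch."""
    rng = random.Random(seed)
    return {name: rng.choice(options) for name, options in variants.items()}

def per_epoch_choice(variants, epoch, seed=0):
    """Option 2: re-draw the representation for every molecule at each epoch."""
    # Derive a distinct but reproducible integer seed for each epoch
    rng = random.Random(seed * 1_000_003 + epoch)
    return {name: rng.choice(options) for name, options in variants.items()}

static_dataset = fixed_choice(VARIANTS)
epoch_datasets = [per_epoch_choice(VARIANTS, epoch) for epoch in range(3)]
```

Option 1 exposes the model to non-canonical forms but only one per molecule; option 2 lets it see several forms of the same molecule over the course of training.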
07.02.2025 16:13
And don't forget:
When in doubt, canonicalize SMILES!