Clay Kosonocky

@kosonocky.bsky.social

ML + Biochemistry PhD Candidate at UT Austin. BioML Society Founder. All problems are solvable, so let's solve some. biomlsociety.org

748 Followers  |  293 Following  |  72 Posts  |  Joined: 11.11.2024

Latest posts by kosonocky.bsky.social on Bluesky

Learn more at biomlsociety.org/challenge

02.09.2025 14:45 | 👍 0    🔁 0    💬 0    📌 0

Huge shoutout to everyone who helped put on this event :)
Alex Abel, Aaron Feller, Amanda Cifuentes Rieffer, Phillip Woolley, Daryl Barth, Tynan Gardner, Wesley Wierson, Andrew Ellington, & Edward Marcotte (@edwardmarcotte.bsky.social)

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

We're hoping to elucidate and share the trends in method choices to help our community hone in on standard practices for successful protein design :)

Stay tuned!

02.09.2025 14:45 | 👍 2    🔁 0    💬 1    📌 0

The full data release, scores, and analysis will be coming soon. We wish we could share all of the stats right now, but we think it's best if we release everything all at once when the publication hits preprint (hopefully in October!)

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

Huge congrats to the winners! This is an absolutely mind-blowing feat and we can't believe that designs from this competition worked so well. Each of the four winners will be given a 3D print of their winning binder, some stylish LEAH Labs swag, and, of course, bragging rights!

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

We're also giving a second award to the team with the highest success rate of submitted sequences in the Round 1 screen. This team was Nucleate UK with a whopping 38% hit rate, with the runners-up being BindingIllini (15%) and Furman Lab (11%)

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

Each design in the top ten succeeded at some aspect of T cell biology, but a few were successful in all of them! The teams with the top three designs were the Perez Lab Gators, Amigo Acids, and the Schoeder Lab. Each of these teams will be given an award 🥇

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

Round 2 (Individual Functional Assays)

The ten best-performing sequences from Round 1 were selected to move on to a series of individual assays measuring many aspects of T cell biology, including cytotoxicity against CD20+ tumors, cytokine release, proliferation, and expansion

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

This approach hijacks a necessary but not sufficient behavior of functional T cells (proliferation) to identify which binders engaged the antigen, signaled through their CAR, and became overrepresented at the population level

02.09.2025 14:45 | 👍 1    🔁 0    💬 1    📌 0
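
One common way to quantify "overrepresented at the population level" in a pooled screen is a log2 enrichment of each binder's frequency in the antigen condition relative to the control. The sketch below is only an illustration of that idea: the function name, the pseudocount, and the read counts are all made up, and nothing here is claimed to be the scoring actually used in the competition.

```python
import math


def log2_enrichment(antigen_counts, control_counts, pseudocount=1.0):
    """Hypothetical helper: log2 ratio of binder frequency, antigen vs. control.

    A pseudocount is added to every count so binders missing from one
    condition do not cause a division by zero.
    """
    binders = sorted(set(antigen_counts) | set(control_counts))
    a = {b: antigen_counts.get(b, 0) + pseudocount for b in binders}
    c = {b: control_counts.get(b, 0) + pseudocount for b in binders}
    total_a, total_c = sum(a.values()), sum(c.values())
    return {b: math.log2((a[b] / total_a) / (c[b] / total_c)) for b in binders}


# Made-up read counts for three hypothetical binders.
toy_antigen = {"binder_1": 799, "binder_2": 99, "binder_3": 99}
toy_control = {"binder_1": 332, "binder_2": 332, "binder_3": 333}

scores = log2_enrichment(toy_antigen, toy_control)
for binder, score in scores.items():
    print(f"{binder}: {score:+.2f}")
```

A positive score means the binder took up a larger share of the population with antigen than without it; a score near zero means no change.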

Round 1 (High-Throughput Screen)

The 12,000 CAR-T cell constructs were tested in a high-throughput pooled screen to measure how well the binders drove CAR-T cell proliferation in the presence of the target antigen CD20 compared to a control

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

The binders functioned in the context of CAR-T cells, serving as the binding domain that helps the CAR-T recognize the cancer antigen CD20. To validate them, we did two rounds of testing:

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

This was only possible thanks to LEAH Labs' high-throughput T cell engineering screen, DNA synthesis from @twistbioscience.com, reagents from Lonza Group, ScaleReady, and VWR, and additional funding from the Center for Systems & Synthetic Biology at UT Austin

02.09.2025 14:45 | 👍 1    🔁 1    💬 1    📌 0

We had 28 teams with competitors from 40 different countries design and submit a total of *12,000* peptide binders. All of the designs and methods, both in silico and in vitro, will be made public, and all parties agreed that no IP will be claimed to help further open science

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

The methods used by scientists from *all over the world* resulted in antigen binders that plausibly act as cancer therapeutics. We can't get over how cool this is!

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

Rather than test for just binding affinity, we tested these binders directly in the therapeutic modality to see how well AI protein design tools work towards a real-world scenario that could one day accelerate cell therapy prescription to days or weeks, rather than years

02.09.2025 14:45 | 👍 0    🔁 0    💬 1    📌 0

🚨 The Bits to Binders Competition has concluded! 🧬

One year ago we gathered scientists from around the world to design and submit protein binders that cause immune cells to target and eliminate CD20+ tumors

Spoiler: They work!

02.09.2025 14:45 | 👍 5    🔁 2    💬 1    📌 1

Perhaps this points to a need for non-anonymous online platforms with user ID verification and protocols to ensure content isn't AI-generated

We can still have our wild west anonymous internet, but IMO we also need a baseline of reality

17.05.2025 14:18 | 👍 0    🔁 0    💬 0    📌 0

On the other hand this policy *clearly* reinforces echo chambers. "Anyone or anything that conflicts with my worldview is fake"

17.05.2025 14:18 | 👍 0    🔁 0    💬 1    📌 0

On one hand, this is good because it shifts the burden of responsibility away from actual humans and puts them in higher esteem than the background internet content. "Surely no human would say such a thing. Must be a bot or a human influenced by bots"

17.05.2025 14:18 | 👍 0    🔁 0    💬 1    📌 0

For better or worse, I've subconsciously adopted a policy of "AI/fake/bot unless proven innocent" w/ regards to internet content from anyone I don't know personally

17.05.2025 14:18 | 👍 2    🔁 0    💬 1    📌 0

Some recent used bookstore acquisitions of mine

Part of a few themes I'm trying to understand:
- cultures & life via classic lit
- basics of philosophy & history
- modern political climate & trajectory
- how can we make cities better?
- what will the future look like?

17.05.2025 14:14 | 👍 1    🔁 0    💬 0    📌 0

Effect of SMILES representation on structure prediction, now with a large sample size!

12.02.2025 17:35 | 👍 0    🔁 0    💬 0    📌 0

That said, I'm very thankful to chaidiscovery for releasing their model for people to tinker with (it really is great overall). Open models allow us all to find the quirks, and in the end everyone benefits from having better tools 😎

12.02.2025 17:34 | 👍 1    🔁 0    💬 0    📌 0

Now you must note (!) that this is just looking at the *confidences* between the various structures. This is not looking at the actual positioning. I did not check that here

Overall for most ligands there isn't a crazy difference, but the variability is, in fact, higher

12.02.2025 17:34 | 👍 0    🔁 0    💬 1    📌 0

And we see it more clearly in these plots.

Ideally if there were no effect, the experimental conditions (plots 2 & 3) would be as tight with the y=x as the 1st plot. But we can see that some ligands just don't behave as well

12.02.2025 17:34 | 👍 0    🔁 0    💬 1    📌 1

Then compute the average absolute difference between the groups:

- s42 to s43 difference serves as our control
- s42 to oechem, and s43 to oechem serve as our experimental condition

We see from these numbers that there is a larger difference between RDKit and OEChem

12.02.2025 17:34 | 👍 0    🔁 0    💬 1    📌 0
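
For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of the control vs. experimental calculation described above. The helper name, the ligand names, and every ipTM value are made up for illustration; these are not the numbers from the actual run.

```python
from statistics import mean


def mean_abs_diff(a, b):
    """Average |a[ligand] - b[ligand]| over ligands present in both conditions."""
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("the two conditions share no ligands")
    return mean(abs(a[k] - b[k]) for k in shared)


# Made-up per-ligand average ipTM values for the three conditions.
rdkit_s42 = {"lig1": 0.80, "lig2": 0.50, "lig3": 0.65}
rdkit_s43 = {"lig1": 0.78, "lig2": 0.52, "lig3": 0.66}
oechem_s42 = {"lig1": 0.70, "lig2": 0.58, "lig3": 0.60}

# Control: same SMILES, different seed.
control = mean_abs_diff(rdkit_s42, rdkit_s43)

# Experimental: different canonicalization scheme.
experimental = [
    mean_abs_diff(rdkit_s42, oechem_s42),
    mean_abs_diff(rdkit_s43, oechem_s42),
]

print(f"control: {control:.4f}")
print("experimental:", [f"{d:.4f}" for d in experimental])
```

The seed-to-seed difference gives a noise floor, so an experimental difference only means something if it sits clearly above that floor.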

Three conditions:
1. RDKit canonicalized SMILES, seed 42
2. RDKit canonicalized SMILES, seed 43
3. OEChem canonicalized SMILES, seed 42

12.02.2025 17:34 | 👍 0    🔁 0    💬 1    📌 0

Checked this phenomenon over a larger sample size

Experiment:
- 148 ligands, same protein
- 5x samples per prediction
- avg iPTM calculated per structure (across the 5x samples)

12.02.2025 17:34 | 👍 0    🔁 0    💬 1    📌 0
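
The per-structure averaging step above can be sketched like this. The function name, ligand names, and ipTM values are placeholders made up for the example, not data from the experiment.

```python
from statistics import mean


def average_iptm(samples_by_ligand):
    """Return the mean ipTM per ligand across its sampled structures."""
    return {ligand: mean(scores) for ligand, scores in samples_by_ligand.items()}


# Made-up ipTM values for two hypothetical ligands, 5 samples each.
toy_samples = {
    "ligand_A": [0.80, 0.82, 0.78, 0.81, 0.79],
    "ligand_B": [0.40, 0.55, 0.35, 0.60, 0.50],
}

avg_iptm = average_iptm(toy_samples)
for ligand, value in avg_iptm.items():
    print(f"{ligand}: {value:.3f}")
```

Averaging over the 5 samples smooths out sampling noise before any condition-to-condition comparison is made.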

Yeah that was confusing wording on my part. What I meant was ensuring that the model sees more than one SMILES representation. You could choose a random SMILES order / scheme for your dataset that doesn't change each epoch, or you could shuffle the rep for each molecule every epoch

07.02.2025 16:13 | 👍 1    🔁 0    💬 1    📌 0
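
To make the two options concrete, here is a small sketch that assumes you already have several equivalent SMILES strings per molecule (for example from a toolkit's randomized SMILES output, such as RDKit's `Chem.MolToSmiles(mol, doRandom=True)`). The `SmilesSampler` class and its parameters are hypothetical names invented for this example.

```python
import random


class SmilesSampler:
    """Hypothetical helper: pick one SMILES per molecule from equivalent strings."""

    def __init__(self, variants, reshuffle_each_epoch, seed=0):
        self.variants = variants
        self.reshuffle_each_epoch = reshuffle_each_epoch
        self.rng = random.Random(seed)
        # Option 1: one random representation per molecule, chosen once.
        self.fixed = [self.rng.choice(v) for v in variants]

    def epoch(self):
        """Return the SMILES strings to train on for one epoch."""
        if self.reshuffle_each_epoch:
            # Option 2: draw a fresh representation for every molecule.
            return [self.rng.choice(v) for v in self.variants]
        return list(self.fixed)


# Equivalent SMILES strings for ethanol and for benzene.
variants = [
    ["CCO", "OCC", "C(O)C"],
    ["c1ccccc1", "C1=CC=CC=C1"],
]

static = SmilesSampler(variants, reshuffle_each_epoch=False)
dynamic = SmilesSampler(variants, reshuffle_each_epoch=True)

for epoch in range(3):
    print(epoch, static.epoch(), dynamic.epoch())
```

The static sampler shows the model one non-canonical string per molecule for the whole run, while the dynamic sampler can show a different string each epoch.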

And don't forget:

When in doubt, canonicalize SMILES!

04.02.2025 20:38 | 👍 3    🔁 0    💬 1    📌 0