The STARLING has landed - excited to share new work from @bornanovak.bsky.social and @jefflotthammer.bsky.social in what is also Borna's first Bluesky post!
If you're at BPS, I'll be speaking about this work this afternoon in IDP SG, and Borna and Jeff both have posters (Sunday B112 and Wed B152).
It was a pleasure to work closely with Borna on this. It’s super exciting to push beyond what was possible with ALBATROSS alone. As usual, we strive to make sure our tools are not only as accurate as possible but also widely accessible to both experts and non-experts.
Importantly, STARLING is an open-source tool targeting ease of use and widespread availability. STARLING is available to install and run locally or online through a simple interface via Google Colab (github.com/idptools/sta...).
We also show how one can integrate STARLING with protein design tools to build de novo disordered protein sequences with target ensemble properties.
STARLING can be used to develop hypotheses as to how an IDR’s sequence may determine its conformational ensemble and/or how it may influence interactions with other IDRs.
STARLING dramatically lowers the barrier to the computational interrogation of IDR function through the lens of emergent biophysical properties in addition to traditional bioinformatic approaches.
We benchmark STARLING against decades of elegant biophysical research on disordered proteins, including smFRET, SAXS, and NMR experiments, and find that its predictions agree remarkably well with experiment.
STARLING produces high-quality predictions at a blazingly fast rate on GPUs and Apple Silicon and is still performant on CPUs.
We formulate IDR ensemble construction as a process of generating instantaneous distance maps in a sequence-conditioned manner, where each map represents a structure based on pairwise inter-residue distances.
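To make the distance-map idea concrete, here is a minimal sketch of how one conformation maps to a matrix of pairwise inter-residue distances. It is an illustration only, not STARLING's code or API: the function name `distance_map` and the four-bead toy coordinates are assumptions made up for this example, with one bead per residue.

```python
import math


def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    coords: list of (x, y, z) tuples, one per residue (hypothetical input).
    """
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            # A distance map is symmetric with a zero diagonal.
            dmap[i][j] = d
            dmap[j][i] = d
    return dmap


# Toy 4-bead conformation; the coordinates are invented for illustration.
conformation = [
    (0.0, 0.0, 0.0),
    (3.0, 0.0, 0.0),
    (3.0, 4.0, 0.0),
    (3.0, 4.0, 12.0),
]
dmap = distance_map(conformation)
```

Each such matrix describes a single snapshot, so an ensemble is a collection of these maps for the same sequence.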
STARLING is a latent denoising diffusion model inspired by recent progress in text-to-image generative models.
STARLING presents a generalization of this recent work by enabling the generation of IDR ensembles from which any observable and its distribution can be computed.
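As a rough sketch of what "any observable and its distribution" can mean in practice, the snippet below computes two observables for every member of a toy ensemble of distance maps. It does not use STARLING itself: the helper names and the two three-residue maps are assumptions for illustration. The radius of gyration uses the standard equal-mass identity Rg^2 = (1 / 2N^2) * sum over all i, j of d_ij^2.

```python
import math
import statistics


def end_to_end(dmap):
    """Distance between the first and last residue of one conformation."""
    return dmap[0][-1]


def radius_of_gyration(dmap):
    """Rg from pairwise distances alone, assuming equal-mass beads."""
    n = len(dmap)
    total = sum(d * d for row in dmap for d in row)
    return math.sqrt(total / (2 * n * n))


# Toy ensemble of two 3-residue distance maps; values are invented.
ensemble = [
    [[0.0, 3.0, 6.0], [3.0, 0.0, 3.0], [6.0, 3.0, 0.0]],  # extended chain
    [[0.0, 3.0, 3.0], [3.0, 0.0, 3.0], [3.0, 3.0, 0.0]],  # compact triangle
]

# One value per conformation gives the distribution, not just the mean.
ree = [end_to_end(m) for m in ensemble]
rg = [radius_of_gyration(m) for m in ensemble]
mean_ree = statistics.mean(ree)
mean_rg = statistics.mean(rg)
```

Because every conformation contributes its own value, the same lists can feed histograms, variances, or any other per-ensemble statistic.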
Previous deep learning approaches have focused on predicting average values for a subset of observables (e.g. end-to-end distance), so they are limited to the observables for which predictive models exist.
STARLING is a collaborative project spearheaded by @jefflotthammer.bsky.social and me, which builds on the lab’s foundational work on predicting IDR conformational ensemble properties directly from sequence.
STARLING is a generative model designed for the accurate prediction of coarse-grained disordered protein conformational ensembles.
Excited to announce the newest member of the flock - STARLING (conSTruction of intrinsicAlly disoRdered proteins ensembles efficientLy vIa multi-dimeNsional Generative models).
www.biorxiv.org/content/10.1...