GitHub - THGLab/HiQBind: Workflow to clean up and fix structural problems in protein-ligand binding datasets
Workflow to clean up and fix structural problems in protein-ligand binding datasets - THGLab/HiQBind
Check it out, and feel free to drop your questions here or on GitHub!
GitHub: github.com/THGLab/HiQBind
Paper: pubs.rsc.org/en/content/a...
#machinelearning #proteinligand #proteinligandbindingaffinity #structuralbiology #ai4sci
07.04.2025 23:49
Huge shoutout to the amazing team behind this:
Lead author Yingze (Eric) Wang and @kunyangsun.bsky.social
PI Prof. Teresa Head-Gordon
Teammates Jie Li, Xingyi Guan, Oufan Zhang, Dorian Bagni
Collaborators Dr. Heather A. Carlson and Prof. Yang Zhang
07.04.2025 23:49
What's next?
We're exploring:
Rotamer refinement
Binding label extraction with LLMs (maybe)
Better data splits (possibly inspired by PLINDER) to support ML research!
07.04.2025 23:49
Since we're focused on structural data with binding labels, we applied this workflow to major open-access datasets (BioLiP, BindingDB, and BindingMOAD) to generate HiQBind: a cleaned, corrected dataset comparable in size to PDBBind v2020 but with significantly improved structural quality!
07.04.2025 23:49
In this work, we built HiQBind-workflow, a semi-automated workflow that processes protein–ligand structures from the RCSB PDB by adding missing atoms, correcting ligand geometries, fixing bond orders and protonation states, and much more!
07.04.2025 23:49
Copying Oliver's post from LinkedIn to help us gain some visibility here!
Our paper is out!
"A workflow to create a high-quality protein–ligand binding dataset for training, validation, and prediction tasks" is now published in Digital Discovery!
07.04.2025 23:49
We have left X for greener pastures and bluer skies - good riddance to Nazi-saluting Musk and his abuse of science and engineering that in fact enriched him.
20.03.2025 00:43
Associate Professor @OIST, our group is interested in understanding the origin of protein function. mom, climber, skier, co-founder of @rozforum.bsky.social
Flagship journal for @rsc.org, open access with no publication fee, all topics in chemistry. chemicalscience-rsc@rsc.org
Celebrating our 15th year in 2025! Website: rsc.li/chemscience
Science writer and author of books including Bright Earth, The Music Instinct, Beyond Weird, How Life Works.
Ad Astra Fellow, Asst. Prof., School of Chemistry, @ucddublin.bsky.social |
Editor, @joss-openjournals.bsky.social |
Personal: espottesmith.github.io |
Research group (@coreacter.org): coreacter.org |
orcid.org/0000-0003-1554-197X |
All opinions mine
Associate Professor of Biochemistry at Virginia Tech. Comp Chem, CADD, and Drude FF contributor. Opinions my own. Go Hokies!
Computational chemistry and biophysics. Empirical force fields, CADD, RNA, carbohydrates, and methods developments including SILCS (site identification by ligand competitive saturation, See SilcsBio LLC) and enhanced-sampling GCMC. And I surf....
Quantum chemist at Cardiff
Tries to pass as a computational chemist. Bostero fanatic. He/Him. Comments are my own and do not reflect my institution.
Professor #compbiophys #compchem @UT_Dallas
QM/MM force fields DNA rep./modif., cancer biomarkers, ionic liquids, #SACNAS, views my own, he/él 🇲🇽🇺🇸
🇪🇸
Chemist • Professor at University of Vienna • Theoretical chemistry, quantum chemistry, excited-state dynamics •
Home of SHARC
Senior Editor of ACS Central Science •
Board of Directors, Cluster of Excellence MECS
She/her/wife/Mom of 2
Ab initio materials simulations at Duke University. Associate professor, Duke MEMS & Chemistry. https://aims.pratt.duke.edu/ . Opinions my own. Unlike the shirt on the right side of the profile image, which is not my own. The number on the shirt is 10.
Physicist, professor @univie.ac.at, director of @esivienna.bsky.social, computational physics, statistical mechanics, machine learning, soft matter, biking, hiking, skiing, *320 ppm. 🇪🇺
Professor, Georgia Tech Chemistry and Biochemistry. Georgia Research Alliance Eminent Scholar. Senior Editor, Protein Science. American, British, Swedish, Kurdish. Pianist, photographer, foodie, dreamer, mom. She/they. #FirstGen #MENA All views my own
Computational chemist at the University of Copenhagen. Editor-in-Chief PeerJ Physical Chemistry. #compchem
Research group led by @bussigio.bsky.social at SISSA | #RNA, #moleculardynamics, and more | web: http://bussilab.org
Professor @pittchem.bsky.social. Computational biophysicist, leading @westpasoftware.bsky.social development for weighted ensemble rare-event sampling, Amber force field developer.
Computational Materials Science. Assistant Professor at UC Berkeley.
Assistant Professor at UC Berkeley
Research scientist & computational chemist at Berkeley Lab using HT DFT workflows, machine learning, and reaction networks to model complex reactivity.