@gautammalik.bsky.social
Research Assistant at University of Cambridge | Exploring deep learning in biology with big dreams of using AI to make drug discovery a little less complicated! 🧬🖥️
[9/9]
Appreciate any advice, pointers to relevant papers, or even "don't do this" cautionary tales.
Thanks in advance!
#transformers #sparsity #maskedmodeling #deeplearning #symbolicAI #mlresearch #attentionmodels #structureddata
[8/9]
C) Local patching: training on smaller, denser subregions of the matrix (see the sketch after this list)
D) Contrastive or denoising autoencoder approaches instead of MLM
E) Treating the task as a kind of link prediction or structured matrix completion
F) Something entirely different?
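A minimal sketch of what I mean by local patching in C, assuming the matrix is stored as a NumPy array of token ids with a dedicated background id (names, sizes, and thresholds are made up):

```python
import numpy as np

# Illustrative setup: `matrix` holds integer token ids and NO_REL marks the
# background "no relation" cells. Both names are made up for this sketch.
NO_REL = 0

def dense_patches(matrix, patch=32, stride=16, min_density=0.05):
    """Yield (row, col) offsets of patches whose fraction of
    non-background cells is at least `min_density`."""
    n_rows, n_cols = matrix.shape
    for r in range(0, n_rows - patch + 1, stride):
        for c in range(0, n_cols - patch + 1, stride):
            window = matrix[r:r + patch, c:c + patch]
            if (window != NO_REL).mean() >= min_density:
                yield r, c

# Example: keep only sufficiently dense 32x32 sub-matrices for training.
matrix = np.random.choice([0, 1, 2, 3], size=(256, 256), p=[0.97, 0.01, 0.01, 0.01])
patches = [matrix[r:r + 32, c:c + 32] for r, c in dense_patches(matrix)]
```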
[7/9]
B) Loss weighting:
I also tried tweaking the loss weights to prioritize correct prediction of the rare relation types.
But it didn't seem to help either, possibly because the model just ends up ignoring the background (.) altogether.
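For reference, the weighting I tried looks roughly like this (the vocabulary size, counts, and inverse-frequency scheme below are illustrative, not my exact numbers):

```python
import torch
import torch.nn as nn

# Made-up vocabulary: index 0 is the background "." token, 1-4 are relation types.
token_counts = torch.tensor([9600., 120., 95., 110., 75.])               # illustrative counts
class_weights = token_counts.sum() / (len(token_counts) * token_counts)  # inverse frequency

criterion = nn.CrossEntropyLoss(weight=class_weights)

# logits: (num_masked_positions, vocab_size); targets: (num_masked_positions,)
logits = torch.randn(8, 5)
targets = torch.randint(0, 5, (8,))
loss = criterion(logits, targets)
```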
[6/9]
A) Biased masking:
I've tried biased masking (favoring the rare relation tokens), but it's not helping much.
Can biased masking still work when the background class is that overwhelming? Or does it just drown out the signal anyway?
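For concreteness, a minimal sketch of how I'm sampling the mask positions; the token ids and the boost factor are made up:

```python
import torch

NO_REL = 0    # background "." token id (illustrative)
MASK_ID = 99  # [MASK] token id (illustrative)

def biased_mask(tokens, mask_frac=0.15, rare_boost=10.0):
    """Sample mask positions with `rare_boost`-times-higher weight on
    non-background cells. `tokens` is a 1D LongTensor of cell ids."""
    weights = torch.where(
        tokens == NO_REL,
        torch.ones(tokens.shape),
        torch.full(tokens.shape, rare_boost),
    )
    n_mask = max(1, int(mask_frac * tokens.numel()))
    idx = torch.multinomial(weights, n_mask, replacement=False)
    corrupted = tokens.clone()
    corrupted[idx] = MASK_ID
    return corrupted, idx   # the model is trained to predict tokens[idx] from `corrupted`

tokens = torch.randint(0, 5, (1024,))
corrupted, masked_idx = biased_mask(tokens)
```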
[5/9]
I'm wondering whether standard masked modeling is the right fit for this format, or whether it needs adjustment. Some options I'm exploring or considering:
[4/9]
The challenge:
The matrix is highly sparse: most entries are the neutral "no relation" token. When a meaningful relation is masked, it is often surrounded by these neutral tokens. I'm concerned that the model may struggle to learn meaningful context, since most of what it sees is neutral.
[3/9]
I'm using a masked modeling objective, where random entries are masked and the model learns to recover the original token based on the rest of the matrix. The goal is for the model to learn latent structure in how these symbolic relationships are distributed.
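In code, the objective is roughly the following; the tiny stand-in model is only there to make the sketch runnable, and all sizes are made up:

```python
import torch
import torch.nn as nn

VOCAB = 6              # relation types + "." + [MASK]; size is made up
MASK_ID = VOCAB - 1

def mlm_step(model, tokens, mask_frac=0.15):
    """One masked-modeling step: hide random cells, predict the originals.
    `tokens` is a (batch, seq_len) LongTensor of flattened matrix cells."""
    mask = torch.rand(tokens.shape) < mask_frac
    corrupted = tokens.masked_fill(mask, MASK_ID)
    logits = model(corrupted)                       # (batch, seq_len, VOCAB)
    return nn.functional.cross_entropy(logits[mask], tokens[mask])  # loss on masked cells only

# Stand-in for the transformer, just to show the shapes.
model = nn.Sequential(nn.Embedding(VOCAB, 32), nn.Linear(32, VOCAB))
loss = mlm_step(model, torch.randint(0, VOCAB - 1, (2, 256)))
```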
[2/9]
The relation types are drawn from a small, discrete vocabulary, and the "no relation" case is marked with a neutral symbol (e.g., "."). Importantly, this "no relation" doesn't necessarily mean irrelevance. I'm not sure whether it should be treated as informative context or just noise.
[1/9]
I'm working on a transformer-based model over a 2D symbolic matrix where each row and column represents elements from two discrete sets. Each cell contains a token representing a relationship type between the corresponding pair, or a default token when no known relation exists.
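To make the setup concrete, here is roughly how such a matrix can be fed to a transformer: each cell becomes a token embedding plus separate row and column embeddings. The class below is an illustrative sketch, not my exact code, and all sizes are made up:

```python
import torch
import torch.nn as nn

class MatrixEncoder(nn.Module):
    """Embed each cell of an (n_rows x n_cols) symbolic matrix as
    token embedding + row embedding + column embedding."""
    def __init__(self, vocab_size, n_rows, n_cols, dim=64):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, dim)
        self.row = nn.Embedding(n_rows, dim)
        self.col = nn.Embedding(n_cols, dim)

    def forward(self, matrix):   # matrix: (batch, n_rows, n_cols) of token ids
        b, r, c = matrix.shape
        rows = torch.arange(r, device=matrix.device).view(1, r, 1).expand(b, r, c)
        cols = torch.arange(c, device=matrix.device).view(1, 1, c).expand(b, r, c)
        x = self.tok(matrix) + self.row(rows) + self.col(cols)
        return x.flatten(1, 2)   # (batch, n_rows * n_cols, dim), ready for a transformer

enc = MatrixEncoder(vocab_size=6, n_rows=32, n_cols=48)
out = enc(torch.randint(0, 6, (2, 32, 48)))   # -> (2, 1536, 64)
```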
Quick question for anyone doing transformer stuff in comp bio/chem or structured data!
Trying out masked modeling on a sparse setup, but not sure I'm going about it right. Curious how others have tackled this.
The surprising ineffectiveness of molecular dynamics coordinates for predicting bioactivity with machine learning
Checks whether MD-derived 3D information helps for bioactivity and target prediction compared with static 3D information alone. Often, using no 3D info at all works best.
P: chemrxiv.org/engage/chemr...
I like this perspective: it should be about evolution through iterations, rather than expecting the best-evolved algorithm right away. Everyone seems inspired by AlphaFold's story, looking for a similar breakthrough in their domain, but maybe the focus should be on steady progress.
09.12.2024 10:46
It's definitely a challenging but fascinating area and I'd love to talk more about this!
09.12.2024 05:30
But it isn't as trivial as it sounds, right? Is it just about using some vector embeddings from domain knowledge and adding them to the model? I'm wondering: is it the lack of collaboration between scientists and AI/ML experts that's hindering this kind of development?
09.12.2024 05:14
This debate might be intense, but it's moments like this that make me curious about where we're all headed. Science evolves through friction, right?
09.12.2024 04:57
What excites me, though, is the idea I keep hearing: can we combine the best of both worlds? Is that even possible? Are we talking about something like machine-learned potentials in MD simulations, or is it deeper than that? Please help me gain some more perspective!
09.12.2024 04:57
As a young researcher, I can't help but notice how scientists using physics-based methods can sometimes show a bias. It's clear they have their roots, but there's no denying that AI/ML methods come with their own set of caveats, many of which are tough to even recognize.
09.12.2024 04:57
A young researcher's perspective on the #DiffDock discussion between @gcorso.bsky.social and @prof-ajay-jain.bsky.social:
Honestly, I'm feeling both thrilled and a little lost. As someone new to the field, I can't help but reflect on what this means for the future of docking and AI/ML in science.
Impressive work by @franknoe.bsky.social and team! A pragmatic tour-de-force combining experimental and predicted protein structures, MD simulations and experimental stability data to sample conformational ensembles of proteins. Think AlphaFold, but capturing multiple free energy minima.
08.12.2024 11:21
When the size of the test data is 5 compounds and the accuracy is 100%.
06.12.2024 21:15
There's a new #RDKit blog post introducing some new functionality that I'm really excited about: doing efficient substructure and similarity searches in very large chemical libraries:
greglandrum.github.io/rdkit-blog/p...
#ChemSky
#CASP16 results are in! Template-based VFold seems to be the leading method for nucleic acid structure prediction! AlphaFold2 and 3 still seem to be the best methods for protein monomer and complex prediction.
30.11.2024 22:28
It's not real-world ready but a good foundation to explore. And yes, science does need a protein emoji!
github.com/gautammalik-...
To wrap up, I'm curious about your thoughts on the future of docking models. Will the next breakthrough be GNN-based, transformer-based, or something like generative models (e.g., diffusion)? I'd love to hear your opinions on what direction the field is heading. Let me know your thoughts!
22.11.2024 19:46
I've created a GitHub repo for building a pre-trained BERT model on protein-ligand data. It's designed for those seeking a starting point for transformer architectures with protein-ligand complex data, and I've focused extensively on the math behind it.
22.11.2024 19:46
2. Iterative Updates:
The ligand atoms keep moving into formation, adjusting their positions iteratively.
3. Final Coordinates:
After several rounds, the model spits out the final 3D coordinates of the ligand atoms.
And there you have it, Dockformer in action!
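To make that concrete, here is a toy sketch of the idea: alternating intra-ligand attention and ligand-protein cross-attention, followed by iterative coordinate updates. This is my paraphrase with made-up layer choices and dimensions, not the actual Dockformer code:

```python
import torch
import torch.nn as nn

class ToyStructureModule(nn.Module):
    """Paraphrase of the idea: alternate intra-ligand attention and
    ligand-protein cross-attention, then nudge the ligand coordinates."""
    def __init__(self, dim=64, n_heads=4, n_iter=8):
        super().__init__()
        self.intra = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.cross = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.to_shift = nn.Linear(dim, 3)   # per-atom coordinate update
        self.n_iter = n_iter

    def forward(self, lig_feats, prot_feats, lig_xyz):
        # lig_feats: (B, N_lig, dim), prot_feats: (B, N_prot, dim), lig_xyz: (B, N_lig, 3)
        for _ in range(self.n_iter):
            lig_feats = self.intra(lig_feats, lig_feats, lig_feats)[0]    # ligand organizes itself
            lig_feats = self.cross(lig_feats, prot_feats, prot_feats)[0]  # adjust to the pocket
            lig_xyz = lig_xyz + self.to_shift(lig_feats)                  # iterative coordinate update
        return lig_xyz   # final predicted ligand coordinates

mod = ToyStructureModule()
xyz = mod(torch.randn(1, 20, 64), torch.randn(1, 150, 64), torch.randn(1, 20, 3))
```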
1. Intra- and Intermolecular Modules:
Two types of attention layers come into play:
a. Intra-ligand attention: Helps the ligand atoms organize themselves correctly.
b. Ligand-protein cross-attention: Helps the ligand atoms adjust based on the proteinβs pocket geometry.
Step 4: The Grand Finale - Structure Generation
Now that Dockformer understands the molecular interactions, it's time to predict the 3D coordinates of the ligand atoms. This happens in the structure module.
What happens here?