Koushik

@koushikn.bsky.social

Data Scientist Generative AI @BayerCropScience. ML for Plant Biology. PhD @IowaStateUniversity https://www.linkedin.com/in/koushik-nagasubramanian/

807 Followers  |  2,512 Following  |  1 Posts  |  Joined: 16.11.2024  |  1.7871

Latest posts by koushikn.bsky.social on Bluesky

In 1965, Margaret Dayhoff published the Atlas of Protein Sequence and Structure, which collated the 65 proteins whose amino acid sequences were then known.

Inspired by that Atlas, today we are releasing the Dayhoff Atlas of protein sequence data and protein language models.

25.07.2025 22:05 - 👍 64    🔁 28    💬 3    📌 3

please add me too. I work on ML for Plant Biology

28.06.2025 18:11 - 👍 1    🔁 0    💬 0    📌 0

An assessment of DNA language models concludes:
◼️ They do not offer compelling gains over baseline models

Their performance is inconsistent and requires much more compute.

arxiv.org/abs/2412.05430

23.06.2025 20:21 - 👍 52    🔁 19    💬 1    📌 3

Our structural core gene pipeline Unicore is now published at GBE
📄 doi.org/10.1093/gbe/...

Please also check out @dongwookkim.bsky.social’s
🧵 bsky.app/profile/dong...

03.06.2025 17:19 - 👍 43    🔁 19    💬 2    📌 0
"A cacao tree with fruit pods in various stages of ripening. Taken on the Big Island (Hawaii) in the botanical gardens."
"Chocolate is created from the cocoa bean. A cacao tree with fruit pods in various stages of ripening."
Photo by Medicaster, Wikimedia

The only reason you love chocolate is because of FUNGUS.

Cacao seeds contain high amounts of polyphenols, making them intensely bitter & unpleasant. There are two natural fungi that do the heavy lifting in turning them into chocolate.

Let's do a quick tour of the process of chocolate making.

26.05.2025 21:18 - 👍 502    🔁 127    💬 13    📌 18

Three BioML starter packs now!

Pack 1: go.bsky.app/2VWBcCd
Pack 2: go.bsky.app/Bw84Hmc
Pack 3: go.bsky.app/NAKYUok

DM if you want to be included (or nominate people who should be!)

03.12.2024 03:27 - 👍 147    🔁 60    💬 16    📌 6

AFESM: a metagenomic guide through the protein structure universe! We clustered 821M structures (AFDB&ESMatlas) into 5.12M groups; revealing biome-specific groups, only 1 new fold even after AlphaFold2 re-prediction & many novel domain combos. 🧵
🌐 afesm.foldseek.com
📄 www.biorxiv.org/content/10.1...

27.04.2025 00:13 - 👍 141    🔁 71    💬 4    📌 4
Leveraging genomic deep learning models for non-coding variant effect prediction The majority of genetic variants identified in genome-wide association studies of complex traits are non-coding, and characterizing their function remains an important challenge in human genetics. Gen...

Super excited to share our review on genomic deep learning models for non-coding variant effect prediction, with Ayesha Bajwa and Nilah Ioannidis. We’d like this review to be a useful resource, and welcome any feedback, comments, or questions! 1/4

arxiv.org/abs/2411.11158

20.11.2024 01:31 - 👍 32    🔁 12    💬 1    📌 1
Overview of SAE methodology and representative SAE features revealed through automated activation
pattern analysis

Using mechanistic interpretability to steer generations

SAE feature analysis and visualizations reveal features with diverse and consistent activation patterns

Mechanistic interpretability on a protein language model

www.biorxiv.org/content/10.1...

18.11.2024 22:17 - 👍 48    🔁 15    💬 1    📌 0

Two BioML starter packs now:

Pack 1: go.bsky.app/2VWBcCd
Pack 2: go.bsky.app/Bw84Hmc

DM if you want to be included (or nominate people who should be!)

18.11.2024 17:09 - 👍 119    🔁 58    💬 10    📌 11
Uncertainty-aware genomic deep learning with knowledge distillation Deep neural networks (DNNs) have advanced predictive modeling for regulatory genomics, but challenges remain in ensuring the reliability of their predictions and understanding the key factors behind t...

DEGU distills an ensemble of models into a single model, retaining the ensemble’s predictive performance while providing uncertainty estimates, i.e. both epistemic (or model) and aleatoric (or data) uncertainty.

Led by @zrcjessica

Paper: www.biorxiv.org/content/10.1...

2/n

16.11.2024 16:14 - 👍 12    🔁 2    💬 1    📌 1
Ultrafast classical phylogenetic method beats large protein... Amino acid substitution rate matrices are fundamental to statistical phylogenetics and evolutionary biology. Estimating them typically requires reconstructed trees for massive amounts of aligned...

Large protein language models can learn complex epistatic interactions, but how much does that help with predicting variant effects? In this NeurIPS article, we show that classical independent-sites phylogenetic models can outperform pLMs on this task.
1/7
openreview.net/forum?id=H7m...

16.11.2024 20:41 - 👍 92    🔁 44    💬 2    📌 2

Thrilled to announce Boltz-1, the first open-source and commercially available model to achieve AlphaFold3-level accuracy on biomolecular structure prediction! An exciting collaboration with Jeremy, Saro, and an amazing team at MIT and Genesis Therapeutics. A thread!

17.11.2024 16:20 - 👍 610    🔁 205    💬 18    📌 25

I tried to make a bioml starter pack. DM if you want me to add or remove you?

go.bsky.app/2VWBcCd

11.11.2024 23:45 - 👍 91    🔁 39    💬 29    📌 6

@koushikn is following 20 prominent accounts