Pretty amazing looking tool for analysing genetic neighbourhoods:
github.com/chevrettelab...
@jameslingford.bsky.social
PhD student in structural biology with @greening.bsky.social and @knottrna.bsky.social at Monash Uni. (he/him) Interested in hydrogenases, evolution, protein design. https://www.jameslingford.com/
Excited to introduce Bacformer - the first foundation model for bacterial genomics. Bacformer represents genomes as sequences of ordered proteins, learning the "grammar" of how genes are arranged, interact and evolve.
Preprint: biorxiv.org/content/10.1...
🧵 1/n
Excited to share our work on WitChi!
We tested it on the GTDB r220 archaeal supermatrix (5,869 taxa & 10,101 cols) removing 55% of sites in <2h.
The phylogeny showed several interesting groupings with overall improved branch support:
#phylogenetics #ArchaeaSky #MSA #opensource #MEvoSky #MicroSky
New tgv release: local cache!
tgv download hg38
Download UCSC reference genomes to a local sqlite db for much faster browsing. Awesome Rust tools (twobit, bigtools) made this simple.
github.com/zeqianli/tgv
Logo for the Sandpiper website
Out in @natbiotech.nature.com: Metagenome taxonomy profilers usually ignore unknown species. SingleM is an accurate profiler which doesn't, even detecting phyla with no MAGs. Profiles of 700,000 metagenomes at sandpiper.qut.edu.au. A 🧵
16.07.2025 21:59
Got this setup working where I can now run .ipynb notebooks right from inside the terminal with a combination of neovim, quarto, kitty, and this neovim plugin called molten: github.com/benlubas/mol...
Never have to abandon my precious vim setup again
Excited to share our latest work using AI-designed proteins to block heme piracy by E. coli. Published in @natcomms.nature.com. A team effort between my lab and the @knottrna.bsky.social lab, with experimental work led by the talented @danielrfox.bsky.social
www.nature.com/articles/s41...
Been eagerly awaiting this one. Amazing work
07.07.2025 08:45
We have written up a tutorial on how to run BindCraft, how to prepare your input PDB, how to select hotspots, and various other tips and tricks to get the most out of binder design!
github.com/martinpacesa...
Some matplotlib work in progress
27.06.2025 07:13
1/27 We have a new paper out! Turns out that snowflake yeast have been hiding a secret from us - they've evolved a (very!) crude circulatory system. Not with blood vessels or a heart, but through spontaneous fluid flows powered by their metabolism. 🧪
www.science.org/doi/full/10....
Closer... I think at this point the solution lies in manually making a list of the hex codes, but that's for another day.
24.06.2025 07:58
Revisiting this topic now that I've forced myself to use PyMOL. Using this script to install the viridis family of colour palettes: github.com/smsaladi/pym... and running "spectrum count, palette=magma, MODEL_NAME". The palette is still not there. They must modify the magma palette somehow.
23.06.2025 22:55
How do we know for sure if we have the best AF prediction? We still need prior knowledge/expectations of what we're trying to predict with AF. And in the absence of that, I guess we can't really know unless we try every possible combination in the search space (which is not feasible).
20.06.2025 22:42
Seen similar things where the PAEs of a multimer are poor until all the subunits in the correct stoichiometry are provided.
This worries me, because one could have a good prediction like "C", but miss out on the best prediction "D".
Physics-based design of efficient Kemp eliminases
@lynnkamerlin.bsky.social
www.nature.com/articles/s41...
Learning some Blender molecular nodes from @sarahjpiper.bsky.social @ccemmp-outreach.bsky.social
13.06.2025 04:03
New paper from the lab, from Sriram Garg in my group. We introduce a general substitution matrix for structural phylogenetics. I think this is a big deal, so read on below if you think deep history is important. academic.oup.com/mbe/advance-...
11.06.2025 14:01
cat sortme.txt
AAA foo
AAA foo
AAA foo
AAA bar
AAA bar
BBB baz
BBB baz
CCC buzz
sort sortme.txt | uniq -c | awk '{print $2,$1,$3}' | tee tmp | cut -f1 -d " " | uniq -c | join -o 1.1,1.2,2.2,2.3 -1 2 -2 1 - tmp | sort -k1,1n -k3,3nr
Little shell scripting solution to the problem of finding a group A in column1 of a table that is entirely made up of group B in column2, and sorting for the biggest homogeneous grouping.
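A single-pass awk sketch of the same idea, for anyone who would rather avoid the temporary file (in the pipeline above, join reads tmp while tee may still be writing it, which looks like a possible race). The function name group_purity is made up for illustration, and the extra -k2,2 sort key is an addition to keep ties in a stable order:

```shell
# Sample input copied from the post; the file name is only for illustration.
printf '%s\n' 'AAA foo' 'AAA foo' 'AAA foo' 'AAA bar' 'AAA bar' \
              'BBB baz' 'BBB baz' 'CCC buzz' > sortme.txt

# group_purity is a hypothetical helper name, not something from the post.
group_purity () {
    # Output columns: distinct col2 values in the group, col1, row count, col2
    awk '
        {
            if (!(($1, $2) in pair)) distinct[$1]++   # first time this pair is seen
            pair[$1, $2]++                            # rows per (col1, col2) pair
        }
        END {
            for (key in pair) {
                split(key, part, SUBSEP)
                print distinct[part[1]], part[1], pair[key], part[2]
            }
        }
    ' "$1" | sort -k1,1n -k3,3nr -k2,2
}

group_purity sortme.txt
# 1 BBB 2 baz
# 1 CCC 1 buzz
# 2 AAA 3 foo
# 2 AAA 2 bar
```

Rows with a 1 in the first column are the homogeneous groups, largest first.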
11.06.2025 01:49
I'm happy to announce the latest release of the GlobDB, available at globdb.org.
The GlobDB is a database of "species dereplicated" microbial genomes, and as of release 226 contains twice as many species-representative genomes (306,260) as the latest GTDB release.
Proud to present the lab's latest work. The full structure of the sodium-translocating methyltransferase (Mtr) bound to a small oxygen-responsive protein, MtrI. Work by Tristan, together with @schmitzstreitslab.bsky.social.
www.biorxiv.org/content/10.1101/2025.06.02.657420v1
Example of the fzf pop-up window for switching virtual environments
function con () {
    mamba deactivate
    mamba activate "$(mamba info --envs | fzf --color 'border:#94e2d5,label:#cdd6f4' --reverse --tmux center,40%,40% | awk '{print $1}')"
}
I'm sick of typing "conda activate ENV" or "conda deactivate" to switch virtual environments. Now I've added a conda/mamba env switcher function that uses fzf to my ~/.zshrc.
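A slightly more defensive sketch of the same function: it skips the comment lines that `mamba info --envs` prints (assuming the conda-style listing with `#` header lines) and leaves the current environment alone if the picker is cancelled. The fzf styling flags are dropped for brevity, and the name con_safe is hypothetical:

```shell
# con_safe is a hypothetical name; it assumes mamba's shell hook is already
# initialised (as it must be for the original function) and fzf is on PATH.
con_safe () {
    local target
    # Drop "#" header lines and blank lines, pick one row, keep the env name
    target="$(mamba info --envs | grep -v -e '^#' -e '^$' | fzf --reverse | awk '{print $1}')"
    # Nothing selected (picker cancelled): leave the current env untouched
    [ -n "$target" ] || return 0
    mamba deactivate
    mamba activate "$target"
}
```

Deactivating only after a selection has been made means that pressing Esc in the picker no longer drops you out of the current environment.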
03.06.2025 07:12
Quick blogpost on how to convert the Genome Taxonomy Database (GTDB) protein faa reps into a DIAMOND database that includes taxonomic information and genome IDs
www.jameslingford.com/blog/gtdb-to...
Why make a cofactor when you can get it for free?
Our work, led by @fabianmunder.bsky.social, shows that bacteria from 22 phyla use the high-affinity transporter PqqU to obtain the redox cofactor PQQ from the environment as an alternative to cofactor synthesis.
www.science.org/doi/10.1126/...
NIH funding supporting the HMMER and Infernal software projects has been terminated. NIH states that our work, as well as all other federally funded research at Harvard, is of no benefit to the US.
22.05.2025 12:42
Love this. Do you have any code that you could share to recreate this simulation?
25.05.2025 23:31
awk '/^>/ {if (NR==1){print} else {printf "\n%s\n", $0}; next;} {printf "%s", $0}' input_file.faa
Awk command to unwrap wrapped lines in fasta files. It's just neat!
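A variant sketch that also ends the output with a newline (the one-liner above leaves the last sequence line unterminated) and does not emit a blank line when two headers follow each other. The function name unwrap_fasta and the demo file are made up for illustration:

```shell
# Small wrapped FASTA made up for the demo
printf '%s\n' '>seq1' 'MKT' 'AYI' '>seq2' 'GGS' > wrapped.faa

# unwrap_fasta is a hypothetical helper name, not something from the post.
unwrap_fasta () {
    awk '
        # Header: finish any pending sequence line, then print the header
        /^>/ { if (pending) printf "\n"; pending = 0; print; next }
        # Sequence line: append to the current line without a newline
        { printf "%s", $0; pending = 1 }
        # Terminate the final sequence line
        END { if (pending) printf "\n" }
    ' "$1"
}

unwrap_fasta wrapped.faa
# >seq1
# MKTAYI
# >seq2
# GGS
```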
18.05.2025 11:47
On this day, 10 years ago, my lab published the @nature.com paper reporting the discovery of the Asgard archaea (Lokiarchaeota at the time), revealing the archaeal nature of eukaryotic cells, and reshaping the Tree of Life. What a ride it has been since then... www.nature.com/articles/nat...
14.05.2025 13:48
Very much agree with this call to action. Important for scientists to speak out together against all forms of tyranny, and not just speak out against funding cuts.
Though I realise that's easy for me to say, since I don't live in the U.S.
www.nature.com/articles/d41...
The energetic cost of cellular traits. Costs were calculated for individual proteins, for macromolecular complexes, or, in the case of Escherichia coli, for a single cell (shown for comparison). Calmodulin: 148 amino acids; ATP synthase: 4,955 amino acids (E. coli); dynein-2 motor: 12,084 amino acids (human, PDB ID: 6SC2); ribosome: 7,459 amino acids and 4,566 ribonucleotides (E. coli); vesicle: diameter 50 nm [membrane cost including proteins = 1.45 × 10⁹ ATP µm⁻² (86)]; nuclear pore complexes: 50–110 MDa (yeast and vertebrates) (2, 7); eukaryotic flagellum: 11 µm in length; and E. coli: estimated using the fourth cell budget method. Abbreviation: PDB ID, Protein Data Bank identifier.
Bioenergetics and the evolution of cellular traits [review]:
doi.org/10.1146/annu...