James Lingford

@jameslingford.bsky.social

PhD student in structural biology with @greening.bsky.social and @knottrna.bsky.social at Monash Uni. (he/him) Interested in hydrogenases, evolution, protein design. 💻 https://www.jameslingford.com/

148 Followers  |  226 Following  |  51 Posts  |  Joined: 10.01.2025  |  1.871

Latest posts by jameslingford.bsky.social on Bluesky

GitHub - chevrettelab/gator-gc

Pretty amazing looking tool for analysing genetic neighbourhoods:
github.com/chevrettelab...

23.07.2025 04:37 | 👍 2    🔁 0    💬 0    📌 0

💥 Excited to introduce Bacformer 🦠 - the first foundation model for bacterial genomics. Bacformer represents genomes as sequences of ordered proteins, learning the “grammar” of how genes are arranged, interact and evolve.

Preprint 📝: biorxiv.org/content/10.1...

🧵 1/n

21.07.2025 09:55 | 👍 91    🔁 58    💬 3    📌 2

Excited to share our work on WitChi! 🛠️🖥️
We tested it on the GTDB r220 archaeal supermatrix (5,869 taxa & 10,101 cols) removing 55% of sites in <2h.

The phylogeny showed several interesting groupings with overall improved branch support:
#phylogenetics #ArchaeaSky #MSA #opensource #MEvoSky #MicroSky

20.07.2025 16:58 | 👍 30    🔁 11    💬 1    📌 0
GitHub - zeqianli/tgv: Explore 5,000+ genomes in the terminal. Light, blazing fast 🚀, vim-motion.

New tgv release: local cache!

tgv download hg38

Download UCSC reference genomes to a local sqlite db for much faster browsing. Awesome Rust tools (twobit, bigtools) made this simple.

github.com/zeqianli/tgv

19.07.2025 23:37 | 👍 30    🔁 8    💬 0    📌 0
Logo for the Sandpiper website

Out in @natbiotech.nature.com: Metagenome taxonomy profilers usually ignore unknown species. SingleM is an accurate profiler which doesn't, even detecting phyla with no MAGs. Profiles of 700,000 metagenomes at sandpiper.qut.edu.au. A 🧵

16.07.2025 21:59 | 👍 126    🔁 70    💬 7    📌 9

Got this setup working where I can now run .ipynb notebooks right from inside the terminal with a combination of neovim, quarto, kitty, and this neovim plugin called molten: github.com/benlubas/mol...
Never have to abandon my precious vim setup again

11.07.2025 07:29 | 👍 2    🔁 0    💬 0    📌 0
Inhibiting heme piracy by pathogenic Escherichia coli using de novo-designed proteins - Nature Communications Many pathogens encode transporters that extract heme directly from host proteins. In this study, the authors demonstrate the utility of de novo-designed proteins in understanding the mechanism behind ...

Excited to share our latest work using AI-designed proteins to block heme piracy by E. coli. Published in @natcomms.nature.com. A team effort between my lab and the @knottrna.bsky.social lab, with experimental work led by the talented @danielrfox.bsky.social
www.nature.com/articles/s41...

09.07.2025 23:55 | 👍 40    🔁 19    💬 1    📌 3

Been eagerly awaiting this one. Amazing work

07.07.2025 08:45 | 👍 4    🔁 0    💬 0    📌 0

We have written up a tutorial on how to run BindCraft, how to prepare your input PDB, how to select hotspots, and various other tips and tricks to get the most out of binder design!

github.com/martinpacesa...

30.06.2025 19:45 | 👍 135    🔁 54    💬 3    📌 0

Some matplotlib work in progress

27.06.2025 07:13 | 👍 2    🔁 0    💬 0    📌 0

1/27 We have a new paper out! Turns out that snowflake yeast have been hiding a secret from us - they've evolved a (very!) crude circulatory system. Not with blood vessels or a heart, but through spontaneous fluid flows powered by their metabolism. 🧪🔬

www.science.org/doi/full/10....

24.06.2025 16:52 | 👍 357    🔁 148    💬 14    📌 25

Closer... I think at this point the solution lies in manually making a list of the hex codes, but that's for another day.

24.06.2025 07:58 | 👍 1    🔁 0    💬 0    📌 0

Revisiting this topic now that I've forced myself to use PyMOL. Using this script to install the viridis family of colour palettes: github.com/smsaladi/pym... and running "spectrum count, palette=magma, MODEL_NAME". The palette is still not there. They must modify the magma palette somehow.

23.06.2025 22:55 | 👍 4    🔁 0    💬 1    📌 0

How do we know for sure if we have the best AF prediction? We still need prior knowledge/expectations of what we're trying to predict with AF. And in the absence of that, I guess we can't really know unless we try every possible combination in the search space (which is not feasible).

20.06.2025 22:42 | 👍 1    🔁 1    💬 0    📌 0

Seen similar things where the PAEs of a multimer are poor until all the subunits in the correct stoichiometry are provided.

This worries me, because one could have a good prediction like "C", but miss out on the best prediction "D".

20.06.2025 22:42 | 👍 1    🔁 0    💬 1    📌 0

Physics-based design of efficient Kemp eliminases

@lynnkamerlin.bsky.social

www.nature.com/articles/s41...

19.06.2025 21:18 | 👍 21    🔁 7    💬 0    📌 0

Learning some Blender molecular nodes from @sarahjpiper.bsky.social @ccemmp-outreach.bsky.social

13.06.2025 04:03 | 👍 9    🔁 2    💬 0    📌 0
A general substitution matrix for structural phylogenetics. Abstract. Sequence-based maximum likelihood (ML) phylogenetics is a widely used method for inferring evolutionary relationships, which has illuminated the

New paper from Sriram Garg in my group. We introduce a general substitution matrix for structural phylogenetics. I think this is a big deal, so read on below if you think deep history is important. academic.oup.com/mbe/advance-...

11.06.2025 14:01 | 👍 93    🔁 52    💬 3    📌 2
cat sortme.txt
AAA foo
AAA foo
AAA foo
AAA bar
AAA bar
BBB baz
BBB baz
CCC buzz

sort sortme.txt | uniq -c | awk '{print $2,$1,$3}' | tee tmp | cut -f1 -d " " | uniq -c | join -o 1.1,1.2,2.2,2.3 -1 2 -2 1 - tmp | sort -k1,1n -k3,3nr

Little shell scripting solution to the problem of finding a group A in column1 of a table that is entirely made up of group B in column2, and sorting for the biggest homogeneous grouping.
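
The same report can also be produced in a single awk pass, which avoids the intermediate `tmp` file that the pipeline above reads back while `tee` may still be writing it. This is only a minimal sketch, not the original approach: the function name `homogeneous_groups` is hypothetical, and it assumes whitespace-separated input with the group in column 1 and the label in column 2.

```shell
# Hypothetical helper. For each (column1, column2) pair it prints:
#   <distinct column2 values in the group> <column1> <pair count> <column2>
# Fully homogeneous groups (1 distinct value) sort first, largest first.
homogeneous_groups() {
    awk '
        { pair[$1 SUBSEP $2]++ }          # count each (group, label) pair
        END {
            for (k in pair) {             # number of distinct labels per group
                split(k, p, SUBSEP)
                labels[p[1]]++
            }
            for (k in pair) {             # one output row per pair
                split(k, p, SUBSEP)
                print labels[p[1]], p[1], pair[k], p[2]
            }
        }
    ' "$@" | LC_ALL=C sort -k1,1n -k3,3nr
}

# Example with the same rows as sortme.txt
printf '%s\n' 'AAA foo' 'AAA foo' 'AAA foo' 'AAA bar' 'AAA bar' \
    'BBB baz' 'BBB baz' 'CCC buzz' | homogeneous_groups
```

On the sample rows, BBB (two baz entries) comes out on top, followed by CCC, with the mixed AAA group last. The function also accepts a file name, e.g. `homogeneous_groups sortme.txt`.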

11.06.2025 01:49 | 👍 0    🔁 0    💬 0    📌 0

I'm happy to announce the latest release of the GlobDB, available at globdb.org.

The GlobDB is a database of "species dereplicated" microbial genomes, and as of release 226 contains twice as many species-representative genomes (306,260) as the latest GTDB release.

10.06.2025 11:20 | 👍 105    🔁 60    💬 3    📌 4

Proud to present the lab's latest work: the full structure of the sodium-translocating methyltransferase (Mtr) bound to a small oxygen-responsive protein, MtrI. Work by Tristan, together with @schmitzstreitslab.bsky.social.

www.biorxiv.org/content/10.1101/2025.06.02.657420v1

04.06.2025 19:10 | 👍 58    🔁 22    💬 3    📌 3
Example of the fzf pop-up window for switching virtual environments

function con () {
    mamba deactivate
    mamba activate "$(mamba info --envs | fzf --color 'border:#94e2d5,label:#cdd6f4' --reverse --tmux center,40%,40% | awk '{print $1}')"
}

I'm sick of typing "conda activate ENV" or "conda deactivate" to switch virtual environments. Now I've added a conda/mamba env switcher function that uses fzf to my ~/.zshrc.
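
A slightly more defensive variant is sketched below. It is an illustration, not the function from the post: the helper name `env_names` is hypothetical, and it assumes `mamba info --envs` prints conda-style output (comment lines starting with `#`, then one environment per line with the name in the first column). It drops the header lines before they reach fzf and leaves the current environment alone if the picker is cancelled. The `--color` and `--tmux` options from the original can be added back to the `fzf` call.

```shell
# Hypothetical helper: read conda-style `info --envs` output on stdin and
# print only the first column, skipping "#" header lines and blank lines.
env_names() {
    awk '!/^#/ && NF { print $1 }'
}

# Sketch of the switcher: only deactivate/activate once something was picked.
con() {
    local target
    # fzf exits non-zero when cancelled or when nothing matches
    target="$(mamba info --envs | env_names | fzf --reverse)" || return 0
    [ -n "$target" ] || return 0
    mamba deactivate
    mamba activate "$target"
}
```

Splitting the parsing into its own function means it can be checked without mamba or fzf installed, for example by piping saved `info --envs` output into `env_names`.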

03.06.2025 07:12 | 👍 2    🔁 0    💬 0    📌 0
Converting GTDB faa files to a DIAMOND database with taxonomy | James Lingford Minimal blog by James Lingford

Quick blog post on how to convert the Genome Taxonomy Database (GTDB) protein faa reps into a DIAMOND database that includes taxonomic information and genome IDs
www.jameslingford.com/blog/gtdb-to...

31.05.2025 00:27 | 👍 5    🔁 1    💬 0    📌 0
High-affinity PQQ import is widespread in Gram-negative bacteria Diverse bacteria use the high-affinity membrane transport protein PqqU to scavenge the nutrient PQQ from the environment.

Why make a cofactor when you can get it for free?

Our work, led by @fabianmunder.bsky.social, shows that bacteria from 22 phyla use the high-affinity transporter PqqU to obtain the redox cofactor PQQ from the environment as an alternative to cofactor synthesis.

www.science.org/doi/10.1126/...

30.05.2025 19:04 | 👍 17    🔁 8    💬 2    📌 0

NIH funding supporting the HMMER and Infernal software projects has been terminated. NIH states that our work, as well as all other federally funded research at Harvard, is of no benefit to the US.

22.05.2025 12:42 | 👍 286    🔁 232    💬 37    📌 46

Love this. Do you have any code that you could share to recreate this simulation?

25.05.2025 23:31 | 👍 1    🔁 0    💬 0    📌 0
awk '/^>/ {if (NR==1){print} else {printf "\n%s\n", $0}; next;} {printf "%s", $0}' input_file.faa

Awk command to unwrap wrapped lines in fasta files. It's just neat!
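
One caveat: as written, the command never prints a newline after the last sequence, so the output ends without a trailing newline. A minimal sketch of a variant that closes the final record is below; the function name `unwrap_fasta` is hypothetical, and it assumes the input starts with a `>` header line.

```shell
# Hypothetical wrapper: join wrapped sequence lines so each record is
# exactly one header line followed by one sequence line.
unwrap_fasta() {
    awk '
        /^>/ { if (NR > 1) printf "\n"; print; next }  # header: end previous sequence first
        { printf "%s", $0 }                            # sequence line: append without newline
        END { if (NR > 0) printf "\n" }                # terminate the last sequence
    ' "$@"
}

# Example: two records, the first wrapped over two lines
printf '>a\nAC\nGT\n>b\nGG\n' | unwrap_fasta
```

It takes a file name as well, e.g. `unwrap_fasta input_file.faa > unwrapped.faa`.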

18.05.2025 11:47 | 👍 0    🔁 0    💬 0    📌 0
Complex archaea that bridge the gap between prokaryotes and eukaryotes - Nature This study identifies a clade of archaea that is the immediate sister group of eukaryotes in phylogenetic analyses, and that also has a repertoire of proteins otherwise characteristic of eukaryotes - pr...

On this day, 10 years ago, my lab published the @nature.com paper reporting the discovery of the Asgard archaea (Lokiarchaeota at the time), revealing the archaeal nature of eukaryotic cells, and reshaping the Tree of Life. What a ride it has been since then... www.nature.com/articles/nat...

14.05.2025 13:48 | 👍 129    🔁 34    💬 7    📌 1
US researchers must stand up to protect freedoms, not just funding Curtailment of freedoms and disregard for the rule of law in the United States is destroying the ability of science to serve the nation’s, and the world’s, interests. Researchers can take action.

Very much agree with this call to action. Important for scientists to speak out together against all forms of tyranny, and not just speak out against funding cuts.

Though I realise that's easy for me to say, since I don't live in the U.S.
www.nature.com/articles/d41...

14.05.2025 04:53 | 👍 1    🔁 0    💬 0    📌 0
The energetic cost of cellular traits. Costs were calculated for individual proteins, for macromolecular complexes, or, in the case of Escherichia coli, for a single cell (shown for comparison). Calmodulin: 148 amino acids; ATP synthase: 4,955 amino acids (E. coli); dynein-2 motor: 12,084 amino acids (human, PDB ID: 6SC2); ribosome: 7,459 amino acids and 4,566 ribonucleotides (E. coli); vesicle: diameter 50 nm [membrane cost including proteins = 1.45 × 10^9 ATP µm^-2 (86)]; nuclear pore complexes: 50–110 MDa (yeast and vertebrates) (2, 7); eukaryotic flagellum: 11 µm in length; and E. coli: estimated using the fourth cell budget method. Abbreviation: PDB ID, Protein Data Bank identifier.

Bioenergetics and the evolution of cellular traits [review]:
doi.org/10.1146/annu...

10.05.2025 05:05 | 👍 6    🔁 1    💬 0    📌 0
