Ben Langmead

@benlangmead.bsky.social

Professor of Computer Science @ JHU. https://www.langmead-lab.org/ https://www.youtube.com/BenLangmead

3,424 Followers  |  243 Following  |  98 Posts  |  Joined: 08.09.2023  |  2.2489

Latest posts by benlangmead.bsky.social on Bluesky

Very excited to share that Department of Biomedical Engineering @hopkinsengineer.bsky.social is hiring for open rank tenure-track faculty in #Biomedical #DataScience!

🔗 apply.interfolio.com/177042
Apps starting to be reviewed Dec 5 and ongoing after!

🧪🧬🖥️🧠 #StatsSky #WomenInSTEM

06.11.2025 06:20 | 👍 26    🔁 18    💬 0    📌 0
We spent a year investigating billionaires for @washingtonpost.com.

We found: the wealthiest 100 Americans gave $1.1 billion to influence the 2024 elections, 140x more than they did in 2000. And almost all of that giving boosted Republicans.

washingtonpost.com/politics/int...

21.11.2025 14:56 | 👍 4901    🔁 2606    💬 165    📌 311
https://link.springer.com/article/10.1007/s00239-025-10277-1
Conceptual overview of hierarchical orthologous groups. An example of one HOG, or gene family. A Species tree with four taxa: plant (green), fish (blue), human (orange), and mouse (yellow), each with one or more genes. B The implied gene tree, dubbed “HOG tree,” and inferred nested HOG composition. Duplication nodes (red) can be deduced based on the species tree topology and clusters of homologous genes at each level. Ancestral genes from which the HOGs descended are shown in gray. C HOGs returned at different taxonomic levels. Consider a gene family that was present in the last eukaryotic common ancestor (LECA). At this level, a single HOG encompasses all genes descending from that ancestral gene. At the Vertebrata level, this gene underwent duplication, leading to two distinct copies, i.e., HOGs. At the Mammalia level, a second duplication further subdivides one of these HOGs, showing how deeper HOGs split into nested subHOGs at more recent levels. The HOG composition implies that a loss event occurred after the mammalian speciation

https://link.springer.com/article/10.1007/s00239-025-10272-6
Summary of the QfO8 meeting. a Hot topics and future directions in method development and applications within the QfO community, namely artificial intelligence, protein domains, protein structure, RNA and splicing isoforms. b Definition of orthology and paralogy, including various paralogous subtypes (e.g. in-paralogs and out-paralogs). c Duplications and functional divergence. d Applications of orthology

https://link.springer.com/article/10.1007/s00239-025-10271-7
Overview of the OrthoXML File Format (simplified). A schematic representation of an OrthoXML file, a standardized XML-based format for representing orthology data. OrthoXML follows a hierarchical structure where elements are enclosed within opening <tag> and closing </tag> tags. <orthoXML> is the root element enclosing other elements. The <species> element contains information about genes. An OrthoXML file can include a <taxonomy> element, which specifies the species tree used to generate the file. Additionally, the <groups> element encapsulates the orthology and paralogy relationships among genes

Our trilogy of orthology publications is online!
Review on Hierarchical Orthologous Groups doi.org/10.1007/s00239-025-10277-1

OrthoXML-Tools doi.org/10.1007/s00239-025-10271-7

A great community effort on Quest for Orthologs in the era of Data Deluge and AI doi.org/10.1007/s00239-025-10272-6

21.11.2025 16:26 | 👍 18    🔁 9    💬 1    📌 0
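
The OrthoXML layout described in the figure caption of this post (a root element, `species` elements listing genes, and a `groups` element holding nested ortholog and paralog groups) can be illustrated with a small sketch. This is an assumption-laden toy, not taken from the linked papers: the sample document is hand-written, the gene and species entries and the `HOG1` identifier are hypothetical, the attribute values are placeholders, and the XML namespace that real OrthoXML files declare is left out to keep the parsing short.

```python
import xml.etree.ElementTree as ET

# Hand-written toy document (hypothetical genes; namespace omitted for brevity).
ORTHOXML = """<orthoXML origin="example" originVersion="0" version="0.5">
  <species name="Homo sapiens" NCBITaxId="9606">
    <database name="example_db" version="0">
      <genes>
        <gene id="1" protId="HUMAN_A"/>
        <gene id="2" protId="HUMAN_B"/>
      </genes>
    </database>
  </species>
  <species name="Mus musculus" NCBITaxId="10090">
    <database name="example_db" version="0">
      <genes>
        <gene id="3" protId="MOUSE_A"/>
      </genes>
    </database>
  </species>
  <groups>
    <orthologGroup id="HOG1">
      <geneRef id="3"/>
      <paralogGroup>
        <geneRef id="1"/>
        <geneRef id="2"/>
      </paralogGroup>
    </orthologGroup>
  </groups>
</orthoXML>"""

root = ET.fromstring(ORTHOXML)

# Map each gene id to (species name, protein id).
genes = {
    gene.get("id"): (species.get("name"), gene.get("protId"))
    for species in root.iter("species")
    for gene in species.iter("gene")
}


def members(group):
    """Protein ids of every gene referenced anywhere inside a group element."""
    return [genes[ref.get("id")][1] for ref in group.iter("geneRef")]


# Top-level ortholog groups and the paralog groups nested inside them.
top_groups = root.find("groups").findall("orthologGroup")
for hog in top_groups:
    print(hog.get("id"), members(hog))
    for para in hog.iter("paralogGroup"):
        print("  paralogs:", members(para))
```

Files produced by real tools would need namespace-aware lookups, so treat the element paths above as a simplification.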
Connect with the UK’s #bioinformatics and #computationalbiology community at #ISCBUK 2026. Share your latest research, build new collaborations, and be part of this inaugural meeting.

Submissions are open until February 5, 2026.

📥 Submit: https://www.iscb.org/uk2026/call-for-submissions/abstracts

19.11.2025 11:16 | 👍 1    🔁 2    💬 0    📌 0
Why we made Affinity free, and how we’ll keep it that way
We’ve made Affinity completely free, empowering professional designers with studio-grade creative software, supported by Canva’s sustainable ecosystem.

I'm a longtime fan of Affinity Designer as an affordable Illustrator-killer for figures, and... it's now free?! www.canva.com/newsroom/new...
Highly recommended if you're sick of paying Adobe $. Maybe Canva can buy NPG too and get rid of the OA fees.

18.11.2025 23:28 | 👍 39    🔁 11    💬 1    📌 4

The James P. Taylor Foundation for Open Science is pleased to announce the recipients of the 2025 JXTX+CSHL Genome Informatics Scholarships. JXTX provides support for students to attend conferences in CompBio and data science, where they can present their work [1/8]

06.11.2025 15:27 | 👍 4    🔁 5    💬 1    📌 0
Former UVA president Jim Ryan, who resigned over the summer due to pressure from the Trump Administration, just shared this 12-page letter with the Faculty Senate, detailing his experience with the Board of Visitors and DOJ.

It's a surreal--and troubling--read.

drive.google.com/file/d/1Is6x...

14.11.2025 14:15 | 👍 2275    🔁 938    💬 70    📌 146
A bad thing is unfolding at NIH this week: It looks like the Trump administration is trying to replace key civil servant scientific leaders, the Institute Directors, with political hires. These directors control the NIH budget, tens of billions.

A bit of a video explainer here: 1/ 🧪

13.11.2025 22:31 | 👍 698    🔁 450    💬 16    📌 37

Huge congrats, Zam!! Amazing!

14.11.2025 15:06 | 👍 1    🔁 0    💬 1    📌 0

Many thanks. I really enjoy making them. It was great to meet in person at CSHL!

10.11.2025 18:21 | 👍 1    🔁 0    💬 0    📌 0

‼️ There are an unprecedented number of Institute Director vacancies at the NIH and many of the application windows close this week or next, including ones that are essential to genomics research in the US including NHGRI, NLM/NCBI, and NIGMS. Please spread the word: hr.nih.gov/careers/open...

10.11.2025 16:08 | 👍 7    🔁 6    💬 1    📌 0

Really excited to see our new work in scaling Mumemto to any size pangenome published in Genome Research this morning. And right on cue with the great opportunity to present this work at #GI2025 this week.

07.11.2025 21:29 | 👍 16    🔁 5    💬 1    📌 0
#GI2025 Vikram Shivakumar from Ben Langmead's lab (@benlangmead.bsky.social) presents "MumemtoM - partitioned Multi-MUM finding for scalable pangenomics". Now published in Genome Research @genomeresearch.bsky.social. Read full text here ➡️ tinyurl.com/Genome-Res-2...

07.11.2025 15:08 | 👍 10    🔁 4    💬 0    📌 1
Figure 1: (A) Anchor-based merging requires a common sequence (red) present in each partition. Multi-MUMs are merged by identifying overlaps between partition-specific matches in the anchor coordinate space, and a uniqueness threshold determines if a MUM is still unique in each partition after truncation. (B) String-based merging enables computation of multi-MUMs between partitions without a common sequence. An example tree (left) is shown, highlighting the use case where partial multi-MUMs specific to internal nodes (starred) can be computed by merging subclade-based partitions up a tree. (right) MUM overlaps are computed by running Mumemto on the MUM sequences, and the uniqueness threshold array ensures overlaps remain unique across the merged dataset. (C) An example Burrows-Wheeler Transform (BWT), matrix (BWM), and Longest Common Prefix (LCP) array, with sequence IDs for each suffix shown (ID). A non-maximal unique match (UM) is shown, and the uniqueness threshold for this match is found using the flanking LCP values. (D) A partial multi-MUM (in blue) is found in all-but-one sequence (excluded in red). Using two anchor sequences (red and orange), all-but-one partial MUMs can be computed using an augmented anchor-based merging method (section 2.6).

Fantastic talk by @vikramshivakumar.bsky.social Mumemto: Scalable multi-MUM finding for pangenomes
Papers biorxiv.org/content/10.1101/2025.05.20.654611 & doi.org/10.1186/s13059-025-03644-0
Code: github.com/vikshiv/mume...
Very efficient pangenome visualization tool, revealing synteny and variations!

06.11.2025 01:13 | 👍 23    🔁 12    💬 1    📌 1
Transcriptomic Analysis of the Human Habenula in Schizophrenia | American Journal of Psychiatry Objective: The objective of this study was to define the molecular neuroanatomy of the human habenula (Hb) and identify transcriptomic differences between brains of individuals with schizophrenia and ...

Proud to announce our paper ‘Transcriptomic Analysis of the Human Habenula in Schizophrenia’ from @lieberinstitute.bsky.social is the cover article for the November issue of the American Journal of Psychiatry! 🧠 #HabenulaLIBD #snRNAseq #Habenula doi.org/10.1176/appi...

04.11.2025 21:30 | 👍 23    🔁 13    💬 1    📌 5
Johns Hopkins researchers to present at Genome Informatics 2025 Students from the Department of Computer Science will give talks and present posters on their research in genome informatics.

Check out the cool work being presented by our students at Genome Informatics! This year’s event was co-organized by @benlangmead.bsky.social, & features talks from @vikramshivakumar.bsky.social & more, plus posters from @alexsweeten.bsky.social, @maojanlin.bsky.social, & @sinamajidian.bsky.social:

04.11.2025 17:14 | 👍 5    🔁 3    💬 0    📌 2
Another lit up house from nighttime stroll

Several lab buildings from nighttime stroll

03.11.2025 20:49 | 👍 3    🔁 0    💬 0    📌 0
Driveway at Cold Spring Harbor Lab, with conference attendees walking to a meal

Dolan Hall

Pic from a nighttime stroll

As our beloved Genome Informatics 2025 (#gi2025) approaches, I'm moved to share some photos from past years at CSHL. A couple more photos coming in a reply below...

03.11.2025 20:47 | 👍 20    🔁 6    💬 2    📌 1
Steven Tan accepting ACM-BCB 2025 best paper award

Huge congratulations to Steven Tan, a superb undergraduate junior at JHU CS whose first-author paper "Movi Color: fast and accurate taxonomic classification with the move structure" won best paper at ACM-BCB. Please check it out: www.biorxiv.org/content/10.1.... Congrats, Steven!!

31.10.2025 17:54 | 👍 22    🔁 1    💬 0    📌 1

I'm very happy with `sassy grep` now!
If you ever need to inspect what's in your data, please try and let me know how it goes :)

Of course, it's fast: >1GB/s when using multiple threads.

31.10.2025 15:58 | 👍 8    🔁 2    💬 0    📌 0
Index zone by BenLangmead

As far as I know, it's been done for Centrifuge only: benlangmead.github.io/aws-indexes/...

31.10.2025 13:27 | 👍 0    🔁 0    💬 0    📌 0
Come work with me! @jhubiostat.bsky.social is hiring for a tenure-track Assistant Professor position 🥳 Apps due Nov 15

🔗 apply.interfolio.com/176686

🧪🧬💻🧠 #StatsSky #WomenInSTEM

31.10.2025 06:58 | 👍 17    🔁 15    💬 1    📌 0
Index zone by BenLangmead

October 2025 batch of Kraken 2 indexes, including core_nt and many others, available: benlangmead.github.io/aws-indexes/k2

Coming soon to K2: a feature for querying many K2 indexes as though they're a single index. Highly useful if the index you want to query is too big to build and/or fit in RAM.

30.10.2025 18:11 | 👍 22    🔁 8    💬 2    📌 0
Nine new tenure-track faculty join Johns Hopkins Computer Science Their research spans social computing and human-computer interaction to the theoretical foundations and real-world applications of machine learning models.

The Department of Computer Science is pleased to welcome nine new tenure-track faculty to its ranks this academic year! Featuring @anandbhattad.bsky.social, @uthsav.bsky.social, @gligoric.bsky.social, @murat-kocaoglu.bsky.social, @tiziano.bsky.social, and more:

29.10.2025 13:43 | 👍 11    🔁 4    💬 0    📌 1

Very excited about Movi 2! Excellent work by Mohsen here. FYI, I have a series of 5 videos on the move structure starting with this one: youtu.be/REniD2dKf6A?...

21.10.2025 21:39 | 👍 18    🔁 5    💬 0    📌 0
GitHub - mohsenzakeri/Movi: Fast, Cache-Efficient, and Scalable Queries on Pangenomes

1/6 Movi 2 is here: faster and more space-efficient for pangenome queries. Its fastest mode uses half the memory of Movi 1 while running ~30% faster. github.com/mohsenzakeri...

21.10.2025 20:00 | 👍 44    🔁 24    💬 1    📌 2

We're looking for an instructor for the algorithms course in our Bioinformatics MS program. The course assets have already been made (by me) and used in several previous offerings, but we need an instructor! If you're in the DMV area, check it out: www.linkedin.com/posts/robert...

17.10.2025 14:55 | 👍 5    🔁 3    💬 0    📌 0

Oooo, that's very nice

15.10.2025 11:29 | 👍 1    🔁 0    💬 0    📌 0

The generated illustrations are SVGs -- you can load them into Illustrator, Inkscape, Designer, etc, and edit away. I tried to put the components of the image into SVG groups so you can treat them as "layers" in those editors (e.g. you can hide bits you don't need).

14.10.2025 20:51 | 👍 1    🔁 0    💬 1    📌 0
Illustration of Burrows-Wheeler Transform and many auxiliary structures from the input string how$now$brown$cow$#

New tool "bwt-svg" for making illustrations of the BWT and the many auxiliary arrays and other structures related to it. Pyodide-based no-installation-necessary interface here: benlangmead.github.io/bwt-svg/. (H/t to @robert.bio for pointing me to pyodide!) Full repo: github.com/benlangmead/....

14.10.2025 20:48 | 👍 40    🔁 21    💬 4    📌 1
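
For readers unfamiliar with the structures the bwt-svg post refers to, here is a minimal sketch that computes the suffix array, BWT, and LCP array for the example string from the illustration. It is not code from bwt-svg: the function names are hypothetical, the sort is a naive one meant only for short strings, and it assumes plain ASCII ordering, in which `#` sorts before `$` and acts as the final terminator.

```python
def suffix_array(text: str) -> list[int]:
    """Start positions of all suffixes of text, in lexicographic order (naive sort)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text: str, sa: list[int]) -> str:
    """BWT: the character preceding each sorted suffix, wrapping around at position 0."""
    return "".join(text[i - 1] for i in sa)


def lcp_array(text: str, sa: list[int]) -> list[int]:
    """lcp[k] = length of the longest common prefix of suffixes sa[k-1] and sa[k]."""
    lcp = [0] * len(sa)
    for k in range(1, len(sa)):
        a, b = text[sa[k - 1]:], text[sa[k]:]
        n = 0
        while n < min(len(a), len(b)) and a[n] == b[n]:
            n += 1
        lcp[k] = n
    return lcp


# Example string from the illustration; '#' is assumed to be the unique terminator.
text = "how$now$brown$cow$#"
sa = suffix_array(text)
for k, (i, c, l) in enumerate(zip(sa, bwt(text, sa), lcp_array(text, sa))):
    print(f"{k:2d}  SA={i:2d}  LCP={l}  BWT={c}  {text[i:]}")
```

Sorting whole suffixes is quadratic or worse in the input length, which is fine for a teaching example but not how indexes for real genomes are built.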

@benlangmead is following 20 prominent accounts