James Bonfield

@jbonfield.bsky.social

Walker, archer, and volunteer woodland warden by weekend, and bioinformatics software engineer and general geek by weekday. My favourite prime is 15551, my favourite colour is, obviously, octarine, and I love nothing more than being immersed in nature.

792 Followers  |  128 Following  |  129 Posts  |  Joined: 06.11.2023  |  2.3328

Latest posts by jbonfield.bsky.social on Bluesky

"OpenZL is our answer to the tension between the performance of format-specific compressors and the maintenance simplicity of a single executable binary."
engineering.fb.com/2025/10/06/d...

06.10.2025 20:58 | 👍 13    🔁 5    💬 2    📌 0
A complete diploid human genome benchmark for personalized genomics Human genome resequencing typically involves mapping reads to a reference genome to call variants; however, this approach suffers from both technical and reference biases, leaving many duplicated and ...

Delighted to finally announce a preprint describing the Q100 project! “A complete diploid human genome benchmark for personalized genomics” For which we finished HG002 to near-perfect accuracy: www.biorxiv.org/content/10.1... 🧵[1/14]

22.09.2025 17:01 | 👍 96    🔁 57    💬 4    📌 4

Note: OLD POST! (2023), but I just noticed it.

While it's nice to see comparisons, why compare an (at the time) 2 year old GATK against a 5 year old bcftools?

Since then both have come on a lot. It'd be interesting to see new independent comparisons. (Neither can hold up to deepvariant now.)

18.09.2025 20:58 | 👍 1    🔁 0    💬 0    📌 0

I noted in their presentation they said that samtools mpileup didn't work. I think they're a bit out of date. Bcftools mpileup --poly-mqual can handle the qualities in homopolymers, plus other newer -X profiles.

I haven't tuned it yet though for SBX, but think it'll be OK in general. (To try!)

18.09.2025 20:47 | 👍 1    🔁 0    💬 0    📌 0
By creating an Account with Academia.edu, you grant us a worldwide, irrevocable, non-exclusive, transferable license, permission, and consent for Academia.edu to use your Member Content and your personal information (including, but not limited to, your name, voice, signature, photograph, likeness, city, institutional affiliations, citations, mentions, publications, and areas of interest) in any manner, including for the purpose of advertising, selling, or soliciting the use or purchase of Academia.edu's Services.

I'm sorry, worldwide, irrevocable, non-exclusive, transferable permission to my voice and likeness? For what now? In any manner for any purpose???

This is in academia/.edu's new ToS, which you're prompted to agree to on login. Anyway I'll be jumping ship. You can find my stuff at hcommons.org.

17.09.2025 17:16 | 👍 1688    🔁 866    💬 59    📌 174

Instagram is like facebork but even more annoying. I looked and I can't even find the equivalent post for you over there. It's just a hateful platform. Probably OK for doom scrolling on a phone, but that's about it.

16.09.2025 08:01 | 👍 2    🔁 0    💬 1    📌 0
Samtools

We ought to update htslib.org with more precise recipes, especially for things like conda where we know A) people make mistakes, often and B) it's used A LOT. We may be able to point to something like biocontainers too (or roll our own, but I'd rather not).

It's rarely built from source it seems.

15.09.2025 13:58 | 👍 2    🔁 0    💬 1    📌 0

Even more prolific is looking at their WhatsApp number from the minimap2 fake site, and associated email. So so many fake sites. Scary

(See 447950904740 phone number, and emmawatsofficial54 partial email search results).

15.09.2025 09:06 | 👍 3    🔁 0    💬 0    📌 0
Google Search

A google for the support phone number shows how many other phishing sites they have.

www.google.com/search?clien...

Most likely their "support" offering involves getting you to install some trojan.

15.09.2025 08:53 | 👍 3    🔁 0    💬 1    📌 0
Phishing site : minimap2.com · Issue #1316 · lh3/minimap2 Not sure how to label this one, but I have come across a website minimap2.com which appears to be AI generated but is serving it's own copy of the Github repository. If you search the address or em...

minimap2.com is potentially a phishing site. Please don't use anything from that website.
github.com/lh3/minimap2...

09.09.2025 15:39 | 👍 26    🔁 27    💬 1    📌 2

Heads up: ignore samtools dot org, similarly minimap2 dot com and likely others. It's owned by a known phishing site and while the binaries they offer look valid currently (but note they may be serving us different binaries to others), that could change.

Ie: it's not us (Samtools team)! Be warned

15.09.2025 08:40 | 👍 141    🔁 126    💬 2    📌 4

Nothing like cold hard data. It's almost as if Brexit was a pack of lies? Who'd have believed it. ;-)

Of course the people that need to see this obviously won't as it'll be deemed "fake news". I really don't know how to fix that one.

14.09.2025 13:19 | 👍 0    🔁 0    💬 0    📌 0
The sun sets behind some trees over a meadow.

What's one way you can reconnect with nature this month?

Fresh out of ideas?
- Take a walk without your phone: notice 5 things around you
- Go on a picnic in a public park
- Learn the name of one local bird and see how often you can spot it
- Pick up some litter

#nature #rewild2gether #share

05.09.2025 21:28 | 👍 138    🔁 30    💬 6    📌 2

Nigel Farage looks uncomfortable as Jamie Raskin uses his opening statement to absolutely demolish him

03.09.2025 16:39 | 👍 21299    🔁 6880    💬 1331    📌 1421

The binary version changes are probably the biggest issue, with (IIRC) BCF 4.2 not being readable by bcftools and BCF 4.3 not being readable by GATK, as the minor version bump was a breaking change that made them incompatible.

I think it was necessary as some data was broken, but :-( :-( :(

08.08.2025 20:37 | 👍 0    🔁 0    💬 0    📌 0
GitHub - jkbonfield/htslib at bgzf2: C library for high-throughput sequencing data formats

FWIW if I ever get time to finish my bgzf2 (zstd) branch (github.com/jkbonfield/h...) it really shines with multi-sample VCF.

The line lengths are just too big for bgzf to do remotely well due to the 32KB deflate window size.

08.08.2025 20:29 | 👍 0    🔁 0    💬 0    📌 0
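
To illustrate the window-size point in the post above, here is a minimal Python sketch. It is a toy under stated assumptions, not a benchmark of bgzf or of the bgzf2 branch: the pseudo-random "lines" stand in for very long multi-sample VCF records, and the standard library's `lzma` stands in for any codec with a window larger than deflate's 32 KiB (the branch in the post uses zstd, which is not used here). The helper names `repeated_line`, `ratio`, `deflate` and `wide_window` are made up for this example.

```python
import lzma
import random
import zlib


def repeated_line(length, seed=0):
    """Return one pseudo-random 'line' of `length` bytes followed by an exact copy."""
    rng = random.Random(seed)
    line = bytes(rng.getrandbits(8) for _ in range(length))
    return line + line


def ratio(data, compress):
    """Compressed size divided by input size (lower is better)."""
    return len(compress(data)) / len(data)


def deflate(data):
    # zlib's deflate can only reference matches up to 32 KiB back.
    return zlib.compress(data, 9)


def wide_window(data):
    # Hypothetical stand-in for a larger-window codec such as zstd;
    # lzma's default dictionary is several MiB.
    return lzma.compress(data)


# Map line length -> (deflate ratio, wide-window ratio).
results = {}
for length in (20_000, 40_000):
    data = repeated_line(length)
    results[length] = (ratio(data, deflate), ratio(data, wide_window))

for length, (d, w) in results.items():
    print(f"line length {length}: deflate ratio {d:.2f}, lzma ratio {w:.2f}")
```

With 20,000-byte lines the second copy sits inside deflate's window, so both codecs should roughly halve the input. With 40,000-byte lines the copy starts beyond 32 KiB, so deflate should find nothing to match while the wide-window codec still does.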

BCF gets some things right, but it made many of the same mistakes that BAM did (being of the same era). It's too serial rather than block based, harming any sort of efficient processing and compression. In short, it's the binarisation of the text format that makes it poor.

08.08.2025 20:28 | 👍 1    🔁 0    💬 1    📌 0

A text format we can hack and play with allows for fast experimentation, but it shouldn't be the primary format. Nor should we have binary guys which are essentially memory dumps from parsing the text. That's partially what killed BCF from adoption. All the pain with minimal gain!

07.08.2025 06:29 | 👍 3    🔁 0    💬 2    📌 0
alphaguess: a word game Guess the word of the day. Guesses reveal where the word is positioned alphabetically. Everyone plays the same word each day.

I've not tried this before. Thanks

🧩 Puzzle #735

🤔 22 guesses

⏱️ 6m 43s

🔗 alphaguess.com

29.07.2025 19:40 | 👍 1    🔁 0    💬 1    📌 0

Although he does also suitably demonstrate the total lack of PPE needed when scything. That makes it *so* much nicer (along with the noise) than strimmers.

Well, provided you're not scything nettles or brambles as then shoes / shin covering is handy!

24.07.2025 11:12 | 👍 2    🔁 0    💬 0    📌 0

Lol, I'm hiding my marriage from myself too apparently!

(Sorry dear. I didn't mean to be ghosting you these last few decades.)

24.07.2025 11:10 | 👍 0    🔁 0    💬 1    📌 0

Answering my own question following a lunchtime walk with Iain from @wildlifebcn.org , yes... They start off black.

Learn something new every day 🙂

02.07.2025 19:32 | 👍 2    🔁 0    💬 1    📌 0
Accelerating k-mer-based sequence filtering The exponential growth of global sequencing data repositories presents both analytical challenges and opportunities. While k - mer-based indexing has improved scalability over traditional alignment fo...

Preprint alert!
We present K2Rmini, an ultra-fast, grep-like tool that extracts sequences of interest from FASTA/FASTQ files based on their k-mer content.
www.biorxiv.org/content/10.1...
A thread

02.07.2025 12:59 | 👍 37    🔁 19    💬 1    📌 0
A black common lizard

Are juvenile common lizards normally black, or is this a melanistic one? It was tiny. Seen at #RSPB #Fowlmere.

27.06.2025 18:10 | 👍 1    🔁 0    💬 1    📌 0
Pärlor
YouTube video by Kent - Topic Pärlor

There's also Kent - a Swedish group with a distinctly more pop feel. I bought both Agricantus and Kent albums while on work trips away. I liked to pick up random albums local(ish) to the area I'm in. About time I played them again (car has a CD player still).

www.youtube.com/watch?v=1A10...

02.06.2025 22:26 | 👍 2    🔁 0    💬 1    📌 0
Com' u ventu
YouTube video by mauriz642003 Com' u ventu

Not Faroese, but that reminds me of some of Agricantus's music. I'm not sure they go in for "bangers", but perhaps www.youtube.com/watch?v=xbLx... is the closest I could find. I think it has a mix of African and south European languages.

02.06.2025 22:12 | 👍 2    🔁 0    💬 1    📌 0
Samtools

Release 1.22 of HTSlib, SAMtools, and BCFtools is now available from GitHub. See htslib.org/download/ for links to tarballs and release notes. 🧪

#samtools #bcftools #htslib #bioinformatics

30.05.2025 10:22 | 👍 9    🔁 4    💬 0    📌 0

📢 HPRC Release 2 is here!

Now with phased genomes from 200+ individuals, a 5x increase from Release 1.

Explore sequencing data, assemblies, annotations & alignments in our interactive data explorer ⬇️:

humanpangenome.org/hprc-data-re...

12.05.2025 13:14 | 👍 36    🔁 27    💬 0    📌 3

Good news. I keep pushing for another htslib release still...

12.05.2025 05:32 | 👍 0    🔁 0    💬 0    📌 0

Could definitely start with lynx. Muntjac are a bigger problem to control as they don't herd and are therefore harder to track. They breed all year too I think. (Also not native)

05.05.2025 10:05 | 👍 0    🔁 0    💬 1    📌 0

@jbonfield is following 20 prominent accounts