Hasindu Gamaarachchi

@hasindu2008.bsky.social

Lecturer at UNSW Sydney; Visiting Scientist at Garvan Institute of Medical Research - Designing embedded systems for bioinformatics applications.

119 Followers  |  271 Following  |  39 Posts  |  Joined: 20.11.2024

Latest posts by hasindu2008.bsky.social on Bluesky

☺️

03.08.2025 09:04 - 👍 2    🔁 1    💬 0    📌 0

Minimod preprint by @sunethsa.bsky.social is out
biorxiv.org/content/10.1...
- Similar accuracy to modkit & pb-CpG-tools.
- Standard open-source licenses (NOT vendor-specific).
- Simple but faster: on a laptop, ~4X for DNA and ~55X for RNA.
Code: github.com/warp9seq/min...

23.07.2025 09:27 - 👍 3    🔁 1    💬 0    📌 0

@bonson-wong.bsky.social

21.07.2025 09:05 - 👍 0    🔁 0    💬 0    📌 0
GitHub - BonsonW/slorado: A simplified version of Dorado built on top of S/BLOW5 format.

If you are at #ISMB2025:
@bosc.bsky.social track around 2:30pm ish after
@sunethsa.bsky.social's talk, Bonson Wong will present on
nanopore basecalling on AMD GPUs using slorado
github.com/BonsonW/slor...

21.07.2025 09:03 - 👍 4    🔁 3    💬 1    📌 0
Realfreq: real-time base modification analysis for nanopore sequencing. Summary: Nanopore sequencers allow sequencing data to be accessed in real-time. This allows live analysis to be performed, while the sequencing is...

If you are at #ISMB2025: Go to the
@bosc.bsky.social around 2:30pm ish where
@sunethsa.bsky.social will present real-time @nanoporetech.com frequency calculation using realfreq & standalone frequency calculation using minimod.
academic.oup.com/bioinformati...

21.07.2025 08:58 - 👍 2    🔁 1    💬 0    📌 0

For reference-based frequency finding, the thinking was that taking only the bases that match the ref could be a better choice. But yes, such a warning would indeed be valuable.

Thank you very much for the suggestion

18.07.2025 11:43 - 👍 2    🔁 0    💬 0    📌 0

I get these then,
6754
6751
6769
6760
6756
which seem to match the expected positions, assuming you are using 1-based coordinates

18.07.2025 06:44 - 👍 1    🔁 0    💬 1    📌 0

pass the option --insertions, so that the CIGAR is parsed for inserted bases

18.07.2025 05:50 - 👍 2    🔁 1    💬 2    📌 0
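
Why insertions need explicit handling: bases that the CIGAR marks as inserted (I) or soft-clipped (S) have no reference coordinate of their own. The sketch below is an illustration only, not minimod's implementation; `query_to_ref` is a hypothetical helper name. It walks the CIGAR string from the chrM example in this thread and maps every base in SEQ to a 1-based reference position (SAM POS is 1-based), or to None where no position exists.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def query_to_ref(pos, cigar):
    """Return, for each base in SEQ, its 1-based reference position or None.

    pos is the 1-based leftmost mapping position (the SAM POS field).
    Soft-clipped (S) and inserted (I) bases get None.
    """
    ref = pos
    mapping = []
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":      # consumes both query and reference
            mapping.extend(range(ref, ref + n))
            ref += n
        elif op in "IS":     # consumes query only
            mapping.extend([None] * n)
        elif op in "DN":     # consumes reference only
            ref += n
        # H and P consume neither query nor reference
    return mapping

mapping = query_to_ref(6749, "4S12M1I3M2D7M1S")
# Query indices that have no reference coordinate (soft clips and the insertion).
unplaced = [i for i, r in enumerate(mapping) if r is None]
```

Here the single inserted base sits at query index 16, between reference positions 6760 and 6761, so a tool that reports only reference-aligned bases would have to skip it unless told otherwise.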

human chrM?
echo -e "@SQ\tSN:chrM\tLN:16569" > a.sam
echo -e "read1\t16\tchrM\t6749\t60\t4S12M1I3M2D7M1S\t*\t0\t0\tGGCTCATTAATCTCAATAACAGCCGTAA\t*\tMM:Z:A+a.,0,0,0,0,0;\tML:B:C,255,255,255,255,255" >> a.sam
samtools sort a.sam -o a.bam && samtools index a.bam
./minimod view -c "a[A]" hg38.fa a.bam

17.07.2025 04:00 - 👍 0    🔁 0    💬 1    📌 0
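
For readers unfamiliar with the MM and ML tags in the test read above, here is a minimal sketch of how they can be decoded. It is not minimod's parser, and the helper names are hypothetical. It assumes a single MM entry and a forward-strand read; for reverse-strand reads such as the flag-16 example, the skip counts apply to the original read orientation, which this sketch does not handle.

```python
def parse_mm(mm):
    """Parse one MM entry such as 'A+a.,0,0,0,0,0;' (hypothetical helper)."""
    head, *deltas = mm.rstrip(";").split(",")
    base, strand = head[0], head[1]
    code = head[2:].rstrip(".?")  # drop the optional '.' or '?' flag
    return base, strand, code, [int(d) for d in deltas]

def ml_to_prob_range(value):
    """ML stores probabilities as 0..255; N covers [N/256, (N+1)/256)."""
    return value / 256, (value + 1) / 256

def modified_indices(seq, base, deltas):
    """Indices in seq selected by the MM skip counts (forward strand only)."""
    hits = []
    remaining = list(deltas)
    to_skip = remaining.pop(0) if remaining else None
    for i, b in enumerate(seq):
        if to_skip is None:   # every delta has been consumed
            break
        if b != base:
            continue
        if to_skip == 0:      # this occurrence carries a modification call
            hits.append(i)
            to_skip = remaining.pop(0) if remaining else None
        else:                 # skip this occurrence of the base
            to_skip -= 1
    return hits

base, strand, code, deltas = parse_mm("A+a.,0,0,0,0,0;")
low, high = ml_to_prob_range(255)
toy_hits = modified_indices("AGACA", "A", [0, 1])
```

A delta of 0 means "the very next occurrence of the base", so five zeros mark five consecutive A bases as called, each with an ML value of 255, the highest probability bin.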
GitHub - warp9seq/minimod: A bioinformatics tool for viewing and calculating base modification frequencies from BAM files

We've been developing a small standalone tool for viewing & calculating frequency from modification tags in BAM files. This call is for brave users to test.
github.com/warp9seq/min...

written by
@sunethsa.bsky.social
in C, based on mod tag parsing we did for realfreq doi.org/10.1093/bioi...

16.07.2025 06:15 - 👍 2    🔁 2    💬 0    📌 0

The truly open solution is the technically better one here (SLOW5). Even if it was not, there would be strong reasons to prefer it. I hope the community rejects closed or strangely licensed basic tools, not just POD5, but also pseudo-open offerings like CellRanger. Good alternatives exist!

04.07.2025 21:08 - 👍 9    🔁 6    💬 2    📌 0
Release blue-crab v0.4.0 · Psy-Fer/blue-crab. What's changed: new paused end reason and test updates. Full changelog: v0.3.0...v0.4.0

blue-crab v0.4.0 has been released

- yet another end_reason added to support pod5 updates.

To convert POD5<=>S/BLOW5 it's as simple as
pip install blue-crab

pod5->blow5
blue-crab p2s example.pod5 -o example.blow5

blow5->pod5
blue-crab s2p example.blow5 -o example.pod5

github.com/Psy-Fer/blue...

10.07.2025 05:57 - 👍 2    🔁 1    💬 0    📌 0
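
The two conversion commands above can be wrapped in a small script when many files need converting. This is a hedged sketch: `blue_crab_cmd` is a hypothetical helper, it only builds the argument list from the `p2s` and `s2p` subcommands shown in the post, and choosing the direction from the file extension (including `.slow5`) is an assumption rather than documented blue-crab behaviour.

```python
from pathlib import Path

def blue_crab_cmd(src, dst):
    """Build the blue-crab argument list for a conversion (nothing is executed)."""
    ext = Path(src).suffix.lower()
    if ext == ".pod5":
        sub = "p2s"   # pod5 -> S/BLOW5
    elif ext in (".blow5", ".slow5"):
        sub = "s2p"   # S/BLOW5 -> pod5
    else:
        raise ValueError(f"unrecognised input extension: {ext!r}")
    return ["blue-crab", sub, str(src), "-o", str(dst)]

to_blow5 = blue_crab_cmd("example.pod5", "example.blow5")
to_pod5 = blue_crab_cmd("example.blow5", "example.pod5")
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once blue-crab is installed.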

Sorry haha
Tagged the wrong person
Meant to be @psy-fer.bsky.social

07.07.2025 00:57 - 👍 2    🔁 0    💬 1    📌 0

Will give a try again this time. Our previous attempts to get some examples or more information were not too successful :/

06.07.2025 09:09 - 👍 1    🔁 0    💬 1    📌 0

Will have a look and see, thanks

06.07.2025 09:03 - 👍 0    🔁 0    💬 0    📌 0

But thanks for the suggestion. Worth checking this out nevertheless. I am not familiar with Python, so it is something to try when @psyche.social is back.

06.07.2025 08:57 - 👍 0    🔁 0    💬 1    📌 0

When I last checked with ONT, they said that pod5 writing is done in high-performance C++. So we thought adding a pod5 benchmark in Python could be seen as a deliberate attempt to make it look slow, so we never pursued it.

06.07.2025 08:55 - 👍 1    🔁 0    💬 1    📌 0

The specification for S/BLOW5 is here, if that is what you're after
hasindu2008.github.io/slow5specs/s...

The point is that S/BLOW5 uses primitive ASCII or binary without an intermediate format, whereas POD5 uses the intermediate Apache Arrow IPC format (it is a format with a spec, not a standard, I think).

06.07.2025 08:45 - 👍 0    🔁 0    💬 0    📌 0

Conversion can be done live, so this time is hidden.

06.07.2025 08:38 - 👍 1    🔁 0    💬 0    📌 0

How many flowcells is this GargantION going to have?

06.07.2025 08:35 - 👍 0    🔁 0    💬 1    📌 0

Bulk sequencing file is a bulk-fast5?

That's a Python library, unfortunately. Something as low-level as C is needed for a proper file format benchmark; otherwise we end up comparing two languages.

I tried to get some info on pod5 writing from ONT devs, but the responses were not so helpful.

06.07.2025 08:34 - 👍 1    🔁 0    💬 2    📌 0

Possibly to keep things under the company's control? I am not sure why either.

06.07.2025 04:23 - 👍 1    🔁 0    💬 0    📌 0

Not sure I am understanding you correctly; ASCII at least is an ISO standard, isn't it?

06.07.2025 04:17 - 👍 0    🔁 0    💬 1    📌 0

POD5 reading code from ONT is available in Dorado in C/C++. If you know where to find MinKNOW's POD5 writing code in C/C++, let us know.

06.07.2025 04:06 - 👍 0    🔁 0    💬 1    📌 0

The sad thing is the community also seems to go with the path of least immediate resistance. Unfortunately, I also don't know how to navigate these non-technical issues.

06.07.2025 03:09 - 👍 2    🔁 1    💬 1    📌 0

Also, I know quite a few projects that wanted to use slow5, but ONT has influenced them not to use it.
They even dissuade some projects from sharing data in both formats, insisting it should be pod5 only.

06.07.2025 03:06 - 👍 3    🔁 1    💬 0    📌 0

I would have to write at least 100 replies to cover just the ones that I still remember, but the latest is that ONT wants us not to provide data to others in non-pod5 formats, citing baseless reasons.

06.07.2025 03:01 - 👍 3    🔁 1    💬 1    📌 0

This shows a previously known technicality: using mmap to read large files with an access pattern that the programmer knows in advance, better than the operating system can guess, can lead to unpredictable performance.

05.07.2025 04:26 - 👍 2    🔁 0    💬 0    📌 0
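
To make the two access styles concrete, here is a minimal sketch. It is not the benchmark code and measures nothing; the record size, file contents, and function names are all made up for illustration. With mmap, the program touches memory and the OS decides what to page in and how far to read ahead. With explicit seeks and reads, the program asks for exactly the byte ranges it already knows it needs.

```python
import mmap
import os
import tempfile

RECORD_SIZE = 8  # hypothetical fixed record size, for illustration only

def read_with_mmap(path, offsets, size=RECORD_SIZE):
    """Map the whole file; the OS decides what to page in and read ahead."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return [bytes(mm[o:o + size]) for o in offsets]

def read_with_seek(path, offsets, size=RECORD_SIZE):
    """Request exactly the byte ranges the program already knows it needs."""
    records = []
    with open(path, "rb") as f:
        for o in offsets:
            f.seek(o)
            records.append(f.read(size))
    return records

# Build a tiny file of four 8-byte records to exercise both paths.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"".join(f"rec{i:05d}".encode() for i in range(4)))
    path = tmp.name

offsets = [16, 0, 24]
via_mmap = read_with_mmap(path, offsets)
via_seek = read_with_seek(path, offsets)
os.remove(path)
```

Both functions return the same bytes; the difference the post describes is in who controls the I/O schedule, which only shows up as timing on large files.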

Key observations
- sequential access: BLOW5 ~7X faster on academic HPC we tested. Similar performance on desktops with single SSD drives.
- random access: BLOW5 is always significantly faster (sometimes 100X)
- size: similar if same compression
- Dependencies: BLOW5 ~3, POD5 >50

05.07.2025 04:26 - 👍 2    🔁 0    💬 1    📌 0

For the many who were asking about BLOW5 vs POD5 for nanopore signal data, here, finally, is a detailed benchmark we did:
biorxiv.org/content/10.1...
Summary: performance of BLOW5 is >= POD5 (from ~= to 100X, see below), with the benefit of having ~3 dependencies instead of >50.

05.07.2025 04:26 - 👍 14    🔁 9    💬 1    📌 0
