
Pierre Marijon

@pierre-marijon.bsky.social

Research engineer, #bioinformatics, #genomic, #assembly, #variantcalling he|him #BiInSci #dyslexic #disabled https://pierre.marijon.fr/link.html

102 Followers  |  218 Following  |  17 Posts  |  Joined: 27.11.2024  |  2.0196

Latest posts by pierre-marijon.bsky.social on Bluesky

GitHub - rust-bio/rust-bio: This library provides implementations of many algorithms and data structures that are useful for bioinformatics. All provided implementations are rigorously tested via continuous integration.

#rustbio 3.0 is released, providing BEDPE support and an improved API for Myers bitparallel pattern matching. github.com/rust-bio/rus...

19.09.2025 13:00 - 👍 6    🔁 2    💬 0    📌 0

I have the impression that I'm the only one who wanted to load multiple data tracks in IGV from one bedgraph.

I discovered it wasn't possible.

It seems easy to me to use each column of the bedgraph to make an IGV track, but as usual in this kind of case, I must be missing something.

#bioinformatics

17.07.2025 14:45 - 👍 1    🔁 0    💬 0    📌 0
bell-curve meme with on the left and right: bioinformatics is just reading and writing text files, and in the middle: it's so much more: experiment design, ...


I used to be a proto-mover.
Then, I read and wrote text files.

Wondering what's next.

www.reddit.com/r/bioinforma...

15.05.2025 14:03 - 👍 4    🔁 1    💬 2    📌 0

Variant normalisation is left-aligned; HGVS nomenclature is 3'-aligned.

CRY IN BIOINFORMATICS !

18.04.2025 09:04 - 👍 1    🔁 0    💬 0    📌 0
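
To make the mismatch in the post above concrete, here is a minimal Rust sketch that shifts the same deletion both ways inside a repeat. It assumes a forward-strand reference, 0-based coordinates, and a deletion described by a start and a length; the function names `left_align` and `three_prime_align` are illustrative and do not come from any variant-calling or HGVS library.

```rust
/// Shift a deletion of `len` bases (0-based `start`) as far left as possible.
/// Assumes `len >= 1` and `start + len <= reference.len()`.
fn left_align(reference: &[u8], mut start: usize, len: usize) -> usize {
    // Moving one base left keeps the resulting sequence identical when the
    // base entering the deletion equals the base leaving it.
    while start > 0 && reference[start - 1] == reference[start + len - 1] {
        start -= 1;
    }
    start
}

/// Shift the same deletion as far right (3' on the forward strand) as possible.
fn three_prime_align(reference: &[u8], mut start: usize, len: usize) -> usize {
    while start + len < reference.len() && reference[start] == reference[start + len] {
        start += 1;
    }
    start
}
```

With the reference `GCACACAT` and a 2-base deletion reported at position 3, `left_align` moves the start to 1 and `three_prime_align` moves it to 5. Both describe the same resulting sequence, so a tool has to convert between the two conventions explicitly.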
a graph of RNA molecule with colors showing different exons


Playing with #vizitig again to prepare for the 1.0 release. This is the SNAP25 human gene expressed in 3 SRA datasets linked to cancers.
Exons have one color each, and exon junctions are in yellow.
The rest is new/noise.

26.03.2025 20:33 - 👍 19    🔁 8    💬 2    📌 0

And yes, definitely don't use the default hasher unless you really need HashDoS safety.

25.03.2025 12:34 - 👍 0    🔁 0    💬 1    📌 0
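
For context on the advice above: Rust's standard `HashMap` defaults to a SipHash-based hasher chosen for HashDoS resistance, and the hasher can be swapped through the map's third type parameter. The sketch below is only an illustration using the standard library; the hand-written FNV-1a hasher `Fnv1a`, the alias `FastMap`, and the helper `count_kmers` are assumptions made for this example, not code from the thread. In practice a published hasher crate would be the usual choice.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Illustrative FNV-1a hasher: fast on short keys, NOT HashDoS resistant.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

/// A HashMap that uses the hasher above instead of the default one.
type FastMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

/// Count k-mers of a sequence (k must be at least 1, since `windows(0)` panics).
fn count_kmers(seq: &[u8], k: usize) -> FastMap<&[u8], u32> {
    let mut counts = FastMap::default();
    for kmer in seq.windows(k) {
        *counts.entry(kmer).or_insert(0) += 1;
    }
    counts
}
```

Calling `count_kmers(b"ACGTACGT", 4)` yields a map in which `ACGT` has count 2. Whether the swap actually helps depends on key size and workload, so it is worth benchmarking before committing to it.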
crates.io: Rust Package Registry

I slowly move closer and whisper:
- niffler

crates.io/crates/niffler

25.03.2025 12:33 - 👍 1    🔁 0    💬 1    📌 0
GitHub - natir/biommap: A vcf parser that use memory mapping to get high performance.

Maybe I could try to improve this part, with some limitations (no compression, no multiline fasta), by using memmap.

Similar to github.com/natir/biommap

19.03.2025 08:19 - 👍 1    🔁 0    💬 0    📌 0
GitHub - natir/pcon: Prompt COuNter

If you accept huge memory usage, only odd k, and canonical k-mers, code similar to pcon could be nice.

github.com/natir/pcon

19.03.2025 08:13 - 👍 1    🔁 0    💬 0    📌 0
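
As a reminder of what "canonical k-mer" and "odd k" mean in the post above, here is a small Rust sketch. It assumes a 2-bit encoding (A=0, C=1, G=2, T=3) packed into a `u64`, so k is at most 32. The function names are illustrative and this is not pcon's actual code.

```rust
/// Pack a k-mer into 2 bits per base; returns None on a non-ACGT character.
fn encode(kmer: &[u8]) -> Option<u64> {
    let mut value = 0u64;
    for &base in kmer {
        let bits = match base {
            b'A' | b'a' => 0,
            b'C' | b'c' => 1,
            b'G' | b'g' => 2,
            b'T' | b't' => 3,
            _ => return None,
        };
        value = (value << 2) | bits;
    }
    Some(value)
}

/// Reverse complement of a packed k-mer; with this encoding the
/// complement of a base `x` is `3 - x`.
fn revcomp(mut kmer: u64, k: usize) -> u64 {
    let mut rc = 0u64;
    for _ in 0..k {
        rc = (rc << 2) | (3 - (kmer & 3));
        kmer >>= 2;
    }
    rc
}

/// Canonical form: the smaller of a k-mer and its reverse complement,
/// so both strands map to the same value.
fn canonical(kmer: u64, k: usize) -> u64 {
    kmer.min(revcomp(kmer, k))
}
```

With odd k, a k-mer can never equal its own reverse complement, because the middle base would have to be its own complement. For example, `ACG` and its reverse complement `CGT` both map to the canonical value of `ACG`.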

Maybe I missed it, but your code doesn't handle forward and reverse k-mers?

Does smidminimizer do this job?

19.03.2025 08:08 - 👍 0    🔁 0    💬 1    📌 0
Tracking Issue for profile-rustflags · Issue #10271 · rust-lang/cargo
Summary Original issue: #7878 Implementation: #10217 Documentation: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#profile-rustflags-option The profile-rustflags feature allows set...

Somebody seems to be working on it: github.com/rust-lang/ca...

28.02.2025 12:23 - 👍 2    🔁 0    💬 1    📌 0

A new release of niffler, thanks to
@luizirber.bsky.social
for saying, 6 years ago, that we should create a crate from my piece of code!

#rustlang

12.02.2025 17:44 - 👍 3    🔁 0    💬 0    📌 1

👋🏻💻🧬

10.02.2025 14:40 - 👍 5    🔁 6    💬 0    📌 1

Sometimes I spend 1 hour of my Sunday evening making a clever change (const generic Rust magic), and when I proofread my own PR, my reaction is:

It's clever, it's pretty, but it doesn't improve perf and it can create many problems. In fact, if I reviewed this PR, I'd probably reject it.

09.02.2025 21:55 - 👍 2    🔁 0    💬 0    📌 0
A plot showing the rust_bio (slowest), cgranges, clairiere, iitii, coitrees and superinterval (fastest) methods as a function of input size. The curves are almost linear.


I think I can say that if you want to identify which genomic annotations overlap a variant, superinterval is fast. I didn't really understand how, but it's fast. #rustlang #bioinformatics

(source code of benchmark: github.com/natir/clairi...)

23.01.2025 14:22 - 👍 2    🔁 0    💬 0    📌 0
GitHub - natir/clairiere: A rust implementation of implicit interval tree with interpolation index.

For me, yes, it seems to be an augmented tree.

Large queries mean longer intervals.

If you are interested: before I discovered COITrees, I rewrote cgranges and Interpolation Index cgranges in Rust: github.com/natir/clairi...

10.01.2025 16:10 - 👍 1    🔁 0    💬 0    📌 0

The set of intervals in which we search is effectively static.

Generally, query sizes are between 1 and 50: many small, sorted queries.
But it would be useful to be able to run much larger queries, 50M; even if they cost much more, there are very few of them.

10.01.2025 15:50 - 👍 0    🔁 0    💬 1    📌 0
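
The setting described above (a static set of intervals, many small queries) can be sketched in a few lines of Rust. This is a deliberately simple baseline: a sorted vector plus a running maximum of end positions. It is not how cgranges, COITrees, or superintervals are implemented. The type name `StaticIntervals` and the half-open `(start, end)` convention are assumptions made for this example.

```rust
/// Static set of half-open intervals `[start, end)`, built once and then queried.
struct StaticIntervals {
    intervals: Vec<(u64, u64)>, // sorted by start
    max_end: Vec<u64>,          // max_end[i] = largest end among intervals[0..=i]
}

impl StaticIntervals {
    fn new(mut intervals: Vec<(u64, u64)>) -> Self {
        intervals.sort_unstable();
        let mut max_end = Vec::with_capacity(intervals.len());
        let mut running = 0u64;
        for &(_, end) in &intervals {
            running = running.max(end);
            max_end.push(running);
        }
        Self { intervals, max_end }
    }

    /// Return every stored interval that overlaps the query `[start, end)`.
    fn overlaps(&self, start: u64, end: u64) -> Vec<(u64, u64)> {
        // Only intervals starting before the query end can overlap it.
        let upper = self.intervals.partition_point(|&(s, _)| s < end);
        let mut hits = Vec::new();
        for i in (0..upper).rev() {
            // Nothing at or before index i reaches past the query start: stop.
            if self.max_end[i] <= start {
                break;
            }
            if self.intervals[i].1 > start {
                hits.push(self.intervals[i]);
            }
        }
        hits.reverse(); // restore start order
        hits
    }
}
```

Small queries usually stop after scanning a few entries, but a single very long stored interval can force a longer backward scan, which is one reason augmented or cache-aware trees exist.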

Ok great I wasn't crazy seeing similarities!

From my understanding it doesn't look more like a binary search tree with a pair of positions as node (the cgranges article is very good).

If we're interested in the task of annotating a set of variants against a set of genome annotations (gene positions).

10.01.2025 15:50 - 👍 0    🔁 0    💬 1    📌 0
GitHub - dcjones/coitrees: A very fast interval tree data structure

Huge work! Drifting to one of my current interests: could this work be applied to searching in an interval tree? That is, finding genomic annotations that intersect a variant.

COITrees (github.com/dcjones/coit...) uses a cache-aware binary tree search, which I thought of when reading your post.

10.01.2025 14:23 - 👍 1    🔁 0    💬 1    📌 0
An excel table containing variants, the columns are:
chrom, pos, ref, alt, pid_crc, gt_0, gt_1, gt_2, FS, SOR, QD.

Some gt values appear to have been converted to January 1st dates.


It seems that one of my colleague's excel is still having a problem with the dates. This is the first time I've seen a genotype to date conversion.

03.01.2025 09:26 - 👍 2    🔁 0    💬 0    📌 0

I say almost the same thing each time I teach the Burrows–Wheeler Transform.

16.12.2024 13:54 - 👍 0    🔁 0    💬 0    📌 0

@pierre-marijon is following 19 prominent accounts