Igor Martayan

@imartayan.bsky.social

PhD student in algorithmic bioinformatics at @bonsaiseqbioinfo.bsky.social. Interested in randomized algorithms and space-efficient data structures https://igor.martayan.org

773 Followers  |  332 Following  |  66 Posts  |  Joined: 20.09.2023
Posts by Igor Martayan (@imartayan.bsky.social)

Deacon can now run in the browser using WebAssembly. Sequence data never leaves your machine. It currently supports FASTA/Q filtering using indexes up to 1GB in size.

Demo: bede.im/deacon

03.03.2026 14:30 | 👍 29    🔁 10    💬 1    📌 1
Presentation of scientific work on De Bruijn graphs applied to the processing of sequencing data in the context of biology. The picture was taken in the conference room of the University of Venice, where a screen displays a slide that introduces De Bruijn graphs, with the speaker standing in front of it. Behind the screen is a large Renaissance painting that spans from floor to ceiling.

I had the opportunity to present some nice results on the detection of biological events in De Bruijn graphs at #DSB2026, as part of my PhD work on #Vizitig!

Thanks to the organizers and colleagues for this amazing and super-inspiring event (and @camillemrcht.bsky.social for the picture).

20.02.2026 18:34 | 👍 16    🔁 7    💬 1    📌 0

kache-hash: A dynamic, concurrent, and cache-efficient hash table for streaming k-mer operations https://www.biorxiv.org/content/10.64898/2026.02.13.705625v1

17.02.2026 05:47 | 👍 10    🔁 7    💬 0    📌 0

🚨UPCOMING DEADLINES🚨

RECOMB-CG: 13 February
RECOMB-RSG: 15 February
RECOMB-Privacy: 9 March
RECOMB-Seq: 12 March (abstract registration)
RECOMB-Arch: 12 March (abstract registration)
RECOMB-Genetics: 13 March

#RECOMB2026 #deadlines

05.02.2026 20:41 | 👍 6    🔁 5    💬 0    📌 0

ZOR filters: fast and smaller than fuse filters
Probabilistic membership filters support fast approximate membership queries with a controlled false-positive probability $\varepsilon$ and are widely used across storage, analytics, networking, and b...

Preprint alert!
arxiv.org/abs/2602.03525
TLDR:
ZOR filters are STATIC filters with false positives.
-Almost memory optimal: <1% overhead over the theoretical lower bound (!!!)
-Fast queries: ~100 ns
-Construction cannot fail

A thread:

04.02.2026 12:28 | 👍 31    🔁 12    💬 1    📌 1

So anyway:
BiRank & QuadRank: single-cache-miss rank queries with double the throughput of other Rust crates, fully saturating the memory bandwidth.
Side effect: QuadFm is smaller and 2-4x faster than the next-best FM-index.

github.com/RagnarGrootK...

raw.githubusercontent.com/RagnarGrootK...

04.02.2026 01:24 | 👍 18    🔁 9    💬 2    📌 0

Cool paper on representing a collection of sets via a spanning tree of their differences. This builds upon work by Bookstein ('91)! as well as work we did using this representation to compress color sets in Mantis MST. I think this representation has many important applications! arxiv.org/pdf/2601.23240

02.02.2026 14:26 | 👍 8    🔁 3    💬 1    📌 0

Generating minimum-density minimizers https://www.biorxiv.org/content/10.64898/2026.01.25.701585v1

28.01.2026 10:46 | 👍 9    🔁 3    💬 0    📌 1

Excalidraw - Collaborative whiteboarding made easy
Excalidraw is a virtual collaborative whiteboard tool that lets you easily sketch diagrams that have a hand-drawn feel to them.

I like excalidraw: excalidraw.com
It can also be self-hosted if you want to

27.01.2026 17:17 | 👍 1    🔁 0    💬 0    📌 0

Very excited about this latest work led by @jermp.bsky.social! Since its initial release, SSHash has served as the basis for several other tools (Fulgor, piscem, etc.). It was already very fast. It is now *substantially* faster!

www.biorxiv.org/content/10.6...

22.01.2026 21:16 | 👍 24    🔁 10    💬 1    📌 1

Mirdita Lab - Laboratory for Computational Biology & Molecular Machine Learning
Mirdita Lab builds scalable bioinformatics methods.

My time in @martinsteinegger.bsky.social's group is ending, but I’m staying in Korea to build a lab at Sungkyunkwan University School of Medicine. If you or someone you know is interested in molecular machine learning and open-source bioinformatics, please reach out. I am hiring!
mirdita.org

20.01.2026 11:07 | 👍 105    🔁 55    💬 7    📌 1

ahh, CMake, the best of what 1973 had to offer I'm sure

16.01.2026 05:06 | 👍 88    🔁 3    💬 5    📌 0

Nice! What lib did you use to generate the interactive plots?

12.01.2026 15:32 | 👍 1    🔁 0    💬 1    📌 0

bsky.app/profile/curi...

09.01.2026 18:58 | 👍 3    🔁 0    💬 0    📌 0

The deadline has been extended to January 9th

30.12.2025 14:33 | 👍 1    🔁 0    💬 0    📌 0

Accelign: a GPU-based Library for Accelerating Pairwise Sequence Alignment https://www.biorxiv.org/content/10.64898/2025.12.17.694868v1

20.12.2025 03:50 | 👍 10    🔁 3    💬 0    📌 1

DSB 2026 Venice - February 18-19
Workshop Data Structures in Bioinformatics

The 12th edition of the two-day workshop “Data Structures in Bioinformatics” (DSB) will take place in Venice (Italy) on February 18-19, 2026: dsb-meeting.github.io/DSB2026/

10.12.2025 14:29 | 👍 10    🔁 9    💬 1    📌 1

GitHub - bede/deacon: Fast DNA search and [host] depletion using minimizers

github.com/bede/deacon
For anyone still using Bowtie2 for filtering or depletion of host sequences or specifics, I can recommend Deacon from @bedec.bsky.social . It is so much faster and easier than Bowtie2, and its performance is equal or better (tested with metagenomes and mitogenomes). 🧬 & 🖥️

03.12.2025 19:40 | 👍 25    🔁 15    💬 1    📌 1

Preprint alert!

We introduce new ideas to revisit the notion of sampling with window guarantees, also known as minimizers.

A thread:

02.12.2025 11:11 | 👍 15    🔁 7    💬 1    📌 2

yes, please!

28.11.2025 16:14 | 👍 2    🔁 0    💬 0    📌 0

Cleanifier: Contamination removal from microbial sequences using spaced seeds of a human pangenome index
Abstract. Motivation. The first step when working with DNA data of human-derived microbiomes is to remove human contamination for two reasons. First, many co

We are excited that our paper "Cleanifier: Contamination removal from microbial sequences using spaced seeds of a human pangenome index" is now published in Bioinformatics (doi.org/10.1093/bioi...).

You can find it on GitLab (gitlab.com/rahmannlab/c...) or install it via PyPI or Bioconda.

27.11.2025 11:27 | 👍 13    🔁 9    💬 1    📌 0

Okay, #SeqBim is over, let's get crackin' and speak about our recent preprint (joint work with @imartayan.bsky.social, Lucas Robidou, @camillemrcht.bsky.social and @npmalfoy.bsky.social)

1/

27.11.2025 10:18 | 👍 12    🔁 5    💬 1    📌 1

Handy
Handy is a cross-platform, open-source, speech-to-text application for your computer

Handy is also a great FOSS alternative supporting both Parakeet and Whisper (and written in Rust!)
handy.computer

20.11.2025 18:20 | 👍 12    🔁 2    💬 1    📌 0

GitHub - COMBINE-lab/mim: A small, auxiliary index to massively improve parallel fastq parsing

@wytamma.bsky.social : so, it took a little bit of extra time (not the flight back from the CZI meeting), but I decided to just f#&$ing do it, and the basic code to build and parse with the auxiliary fastq index is working (github.com/COMBINE-lab/...). 1/2

19.11.2025 03:01 | 👍 25    🔁 15    💬 3    📌 0

Amazing, congrats!!

15.11.2025 08:31 | 👍 1    🔁 0    💬 0    📌 0

Portable Genetic Sequencer Security Vulnerabilities Could Endanger Personal
Portable genetic sequencers, particularly those manufactured by Oxford Nanopore Technologies, have revolutionized the field of genomics, making DNA sequencing more accessible and practical across the

Bioinformatics x cybersecurity: Christina Boucher and her colleague Sara Rampazzi uncovered a basic yet critical vulnerability in MinIONs through the MinKNOW software bioengineer.org/portable-gen...

12.11.2025 07:44 | 👍 16    🔁 9    💬 2    📌 2

Beyond Smoothed Analysis: Analyzing the Simplex Method by the Book
Narrowing the gap between theory and practice is a longstanding goal of the algorithm analysis community. To further progress our understanding of how algorithms work in practice, we propose a new alg...

The simplex algorithm is super efficient. 80 years of experience says it runs in linear time. Nobody can explain _why_ it is so fast.

We invented a new algorithm analysis framework to find out.

27.10.2025 01:43 | 👍 212    🔁 49    💬 5    📌 13

Really exciting that the preprint on Barbell, a new demultiplexer, is finally out!
It's the first tool that builds on Sassy, the approximate-DNA-searching tool that @rickbitloo.bsky.social and I developed earlier this year, specifically with this application in mind.

23.10.2025 21:28 | 👍 20    🔁 15    💬 2    📌 0

GitHub - mohsenzakeri/Movi: Fast, Cache-Efficient, and Scalable Queries on Pangenomes

1/6 Movi 2 is here: faster and more space-efficient for pangenome queries. Its fastest mode uses half the memory of Movi 1 while running ~30% faster. github.com/mohsenzakeri...

21.10.2025 20:00 | 👍 44    🔁 24    💬 1    📌 2

Movi 2: Fast and Space-Efficient Queries on Pangenomes. #Pangenomes #SequenceQueries #Genomics #Bioinformatics @biorxiv-genomic.bsky.social 🧬 🖥️
www.biorxiv.org/content/10.1...

21.10.2025 13:49 | 👍 6    🔁 5    💬 0    📌 0