Thanks for your interest! Unfortunately, we don't have such an online class, and it'll actually be in Czech.
04.10.2025 08:25 β π 0 π 0 π¬ 0 π 0@pavelvesely.bsky.social
Computer scientist at Charles University, Prague π¨πΏ I like all kinds of efficient algorithms and data structures for large datasets || also β°οΈπΊπ¦https://iuuk.mff.cuni.cz/~vesely/
Thanks for your interest! Unfortunately, we don't have such an online class, and it'll actually be in Czech.
04.10.2025 08:25 β π 0 π 0 π¬ 0 π 0I've already given such a talk last year, starting with some motivation and then talking about cardinality estimation (hiding many details). About 50 students attending, some interacting at the beginning. Slides in Czech (hopefully easy to translate nowadays):
docs.google.com/presentation...
During the next two months, I will have two long talks about streaming algorithms / data sketching for high-school students. Did you give a similar talk? What was your experience?
02.10.2025 13:23 β π 1 π 0 π¬ 1 π 0Btw, Nick's profile: @nickmatsakis.bsky.social
16.09.2025 21:05 β π 0 π 0 π¬ 0 π 0The algorithm gives more than just an estimate for the diameter: Using the stored points, one can sqrt(2)+epsilon-approximate Furthest Neighbor queries or 1.22-approximate the Minimum Enclosing Ball. The approx. ratio for diameter is optimal but it's still open for MEB. Lots of nice open problems!
16.09.2025 21:04 β π 0 π 0 π¬ 0 π 0Specifically, for sqrt(2)+epsilon-approximation of diameter, the AS'10 algorithm stores O(1/epsilon^3 log(1/epsilon)) input points, and we managed to shave off one factor of 1/epsilon. Still, we can only prove a lower bound of 1/epsilon, and closing the gap is a nice open problem!
16.09.2025 21:04 β π 0 π 0 π¬ 1 π 0The AS'10 algorithm covers points by blurred balls, and this approach overall works. Adding a few ideas, we have circumvented the issue in AS'10, slightly simplified the algorithm and its analysis, and improved the space bounds.
16.09.2025 21:04 β π 0 π 0 π¬ 1 π 0We in fact simplify a nice paper by Agarwal and Sharathkumar from SODA'10 and Algorithmica '15. Yet, despite that it's published in a decent journal, there appears to be a subtle flaw in the argument, and fixing it probably requires using more space...
link.springer.com/article/10.1...
Tomorrow at ESA: my former postdoc Nick Matsakis will present our streaming algorithm for diameter in high-dimensional spaces. Very simple: just 4 lines of pseudocode, and yet, achieving optimal approximation. Joint work with MagnΓΊs M. HalldΓ³rsson. arxiv.org/abs/2505.16720
16.09.2025 20:44 β π 3 π 0 π¬ 2 π 0Pythagorean Triple Square Day, as one man affectionately calls 9/16/25, is a day like no other this century.
16.09.2025 11:50 β π 1203 π 640 π¬ 23 π 111Zstandard's --long range mode works wonders for assemblies, but needs uninterrupted single line sequences.
*AllTheBacteria 661k, multiline fasta*
gzip (pigz): 751GB
zstandard --long: 641GB (30% original size)
*Single line fasta*
gzip (pigz): 700GB
zstandard --long: 232GB (10% original size)
ππ©βπ¬ For 15+ years biology has accumulated petabytes (million gigabytes) ofπ§¬DNA sequencing data𧬠from the far reaches of our planet.π¦ ππ΅
Logan now democratizes efficient access to the worldβs most comprehensive genetics dataset. Free and open.
doi.org/10.1101/2024...
At scale, the way that we store (and process) data matters! Many may think that the way we keep data, the file formats we adopt, and the way that we compress data are unimportant details, but they are, in fact, critical considerations to allow science to move forward at scale!
26.08.2025 14:45 β π 23 π 8 π¬ 1 π 0Springer publishes a P β NP "proof" and Eric Allender has words to say.
blog.computationalco...
A monumental collaborative effort with many incredible people βΊοΈ Proud to be part of this!
10.06.2025 08:21 β π 9 π 6 π¬ 0 π 0Slides from my talk (with @kamilsjaron.bsky.social) on an history of k-mers in bioinformatics: rayan.chikhi.name/pdf/2025-kme...
03.06.2025 09:25 β π 44 π 24 π¬ 1 π 2Yes! Even a full hard drive with family photos is apparently a useful computational resource.
30.05.2025 11:42 β π 3 π 0 π¬ 0 π 0Nicely written blog post by David Eppstein on the BoyerβMoore (deterministic) streaming algorithm to find a majority element in a stream, and its extensions, first to the turnstile model, and then to frequency estimation (MisraβGries).
11011110.github.io/blog/2025/05... via @theory.report
Thanks a lot for organizing the workshop! I really enjoyed it!
26.04.2025 15:03 β π 0 π 0 π¬ 0 π 0We finally concluded the meeting. Thanks to all attendees for their scientific contributions and for traveling (near or far) to the meeting! Thanks to the local organizers for the infrastructure and catering, and thanks to the co-organizers @yaronorenstein.bsky.social @camillemrcht.bsky.social!
25.04.2025 08:18 β π 26 π 11 π¬ 1 π 0@pavelvesely.bsky.social (CSI) on the mother of spss: masked superstrings that help you representing k-mer sets in a very compact way. He actually takes a lot from @brinda.eu since he squeezed 3 papers in a 12 min talk π³
24.04.2025 05:04 β π 5 π 2 π¬ 1 π 0π Just 2 days to go!
Excitement is building for #RECOMBseq 2025 in Seoul π
Join leading researchers as we dive into cutting-edge computational genomics, from single-cell to long-read sequencing.
π April 24-25
π Seoul
π Program: recomb-seq.github.io/program/
#RECOMB2025 #Genomics #Bioinformatics
A decade ago, we had thousands of bacterial genomes. Now, we have millions. How to scale computational methods?
Our paper in @naturemethods.bsky.social answers this: use evolutionary history to guide compression and search.
rdcu.be/eg4OA
w/ @baym.lol, @zaminiqbal.bsky.social et al. π§΅1/
So glad this is finally out. The method has been instrumental in allowing us to compress the AllTheBacteria data - ~2 million bacterial genomes shrink from 3Terabytes (gzipped) to 100Gb using phylogenetic compression. Great work by @brinda.eu
09.04.2025 22:27 β π 126 π 51 π¬ 4 π 1European Sympsium on Algorithms 2025 will be held in Warsaw in September, as part of ALGO 2025. Do you have great work on design and analysis of algorithms? Submit it by April 23! algo-conference.org/2025/esa/
08.04.2025 14:45 β π 4 π 1 π¬ 0 π 0I was their "guinea pig" for registration, so I can testify it's a smooth process with O(1) steps which all work with a positive probability!
04.04.2025 07:00 β π 0 π 0 π¬ 0 π 0STOC in Prague! With FOCS deadline over, there's no excuse to postpone the registration.
acm-stoc.org/stoc2025/
Taking a break from the submission season? Swing by the Workshop on Algorithms for Large Data (Online), WALDO 2025 ποΈ April 14β16: waldo-workshop.github.io/2025.html
Registration is free! (but necessary by April 7)
Aleksander {\L}ukasiewicz, Jakub T\v{e}tek, Pavel Vesel\'y
SplineSketch: Even More Accurate Quantiles with Error Guarantees
https://arxiv.org/abs/2504.01206
The upcoming trade war with the penguins, Is a good excuse to mention the following fantastic book...
en.m.wikipedia.org/wiki/War_wit...