Rob Patro

@robp.bsky.social

Associate Professor of CS @ University of Maryland. Proud Rust advocate! I ♥ science & compiled, statically-typed programming languages! Views are my own. Tech stack: https://github.com/rob-p/tech-stack.

4,004 Followers  |  536 Following  |  1,937 Posts  |  Joined: 17.06.2023

Latest posts by robp.bsky.social on Bluesky

"That they see as authoritarian"? No, his actions are unequivocally authoritarian...

18.10.2025 16:38 — 👍 1    🔁 0    💬 0    📌 0

Ohh no; will they have to stop making constitutionally braindead decisions that defy logic and precedent?

search.app/jGRD3

18.10.2025 01:30 — 👍 1    🔁 0    💬 0    📌 0

If you were a foreign adversary nation directing the coordinated downfall of the US, this is basically exactly what it would look like.

17.10.2025 23:13 — 👍 3    🔁 0    💬 0    📌 0

👏

17.10.2025 21:10 — 👍 0    🔁 0    💬 0    📌 0
UVA rejects Trump administration’s ‘Compact for Academic Excellence’ Mahoney: "We will continue to work to strengthen free expression and free inquiry in an increasingly polarized world."

UVA rejects the compact. www.vpm.org/news/2025-10...

17.10.2025 21:09 — 👍 12    🔁 4    💬 1    📌 0

The details of the agreement were not made available to the plebe faculty, but I assume yes... "We are bullshit security thing certified and so you should waste an insane amount of money on our trash product" or something thereabouts.

17.10.2025 15:19 — 👍 0    🔁 0    💬 1    📌 0

F*ing SalesForce. They call it SalesForce because you are *forced* to use their software. We use this software for PhD admissions at UMD and it sucks! Waaay worse than the software cooked up by one of our own faculty (alone) like a dozen years ago (which the Uni forced us to drop).

17.10.2025 14:57 — 👍 3    🔁 0    💬 2    📌 0

We're looking for an instructor for the algorithms course in our Bioinformatics MS program. The course assets have already been made (by me) and used in several previous offerings, but we need an instructor! If you're in the DMV area, check it out: www.linkedin.com/posts/robert...

17.10.2025 14:55 — 👍 5    🔁 3    💬 0    📌 0
Theoretical Analysis of Sequencing Bioinformatics Algorithms and Beyond | Communications of the ACM A case study reveals the theoretical analysis of algorithms is not always as helpful as standard dogma might suggest.

Also, the point about the standard focus on worst-case complexity on arbitrary instances is a good one. It shares similarities with @pashadag.bsky.social's work in the context of edit distance algorithms: dl.acm.org/doi/10.1145/..., dl.acm.org/doi/10.1145/...

17.10.2025 14:11 — 👍 3    🔁 0    💬 0    📌 0

A follow-up question is, "is the easy clustering relevant to the question I'm asking?". Perhaps not as easy to formalize, but important nonetheless.

17.10.2025 14:09 — 👍 1    🔁 0    💬 1    📌 0

I don't think I'm arguing against that... at least that's not my intent. Rather, I am arguing that there _are_ meaningful notions of similarity and that, at least some of the time, when clustering fails, it does so because we have not put sufficient thought or effort into crafting the right notion.

17.10.2025 12:59 — 👍 1    🔁 0    💬 1    📌 0

I should also say, as your initial post seems to hint, ppl often want a "good" clustering under a "meaningful" similarity measure, but don't expend the effort to think what these terms mean. I think that is almost always a bad idea. In that case, spend more time thinking about the question!

16.10.2025 17:40 — 👍 2    🔁 0    💬 1    📌 0

Thus, we'd like the ability to assess the similarity of objects under different metrics. For the second one (i.e. "good") I mostly agree. While, perhaps, there are different notions that make sense, for the most part, the lack of a single quality metric being optimized is a bit concerning. 2/2

16.10.2025 17:38 — 👍 0    🔁 0    💬 1    📌 0

I think "different metrics" are essential. The example you gave initially is a great one for that. Am I trying to group animals by the safety precautions I need to take to keep them in a zoo, or by their genetic relatedness? Those are both valid in different contexts. 1/2

16.10.2025 17:38 — 👍 0    🔁 0    💬 1    📌 0
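To make the zoo example concrete, here is a minimal Python sketch, not anything from the thread itself. The animals, the risk labels, and the use of taxonomic family as a stand-in for genetic relatedness are all made-up assumptions, and "clustering" is reduced to its simplest case: two items are similar exactly when they agree on one chosen attribute.

```python
from collections import defaultdict

# Hypothetical toy data: (name, handling risk, taxonomic family).
# Family is only a crude proxy for genetic relatedness.
ANIMALS = [
    ("lion", "dangerous", "Felidae"),
    ("house cat", "safe", "Felidae"),
    ("wolf", "dangerous", "Canidae"),
    ("beagle", "safe", "Canidae"),
]


def cluster_by(items, key):
    """Group item names by the value `key` assigns to each item."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item[0])
    return dict(groups)


# Same objects, two notions of similarity, two different partitions.
by_risk = cluster_by(ANIMALS, key=lambda a: a[1])
by_family = cluster_by(ANIMALS, key=lambda a: a[2])

print(by_risk)
print(by_family)
```

Neither partition is "the" right one; which is useful depends on whether the question is about enclosure safety or about ancestry.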

there are different metrics of cluster quality (& procedures to optimize them) that define what constitutes "good" clusterings from a mathematical perspective. While they may not all be intuitive, some are, & they are at least well-defined. Often, though, people go with vibes for both choices 😥. 2/2

16.10.2025 16:25 — 👍 0    🔁 0    💬 1    📌 0

Right, so I agree there are 2 issues here. I view one as more "fundamental" and "problematic" than the other. If you don't know what similarity you want to measure, then I'd argue that coming up with a "good" clustering is hopeless / misguided. OTOH 1/2

16.10.2025 16:25 — 👍 3    🔁 0    💬 1    📌 0

Well, devil's advocate point. If you can define the meaningful distance, then clustering has a rather natural interpretation. Isn't this one of the key ideas of semantic embedding techniques? To me the key questions are "what similarity matters" & "do I have an accurate quantitative measure of it?"

16.10.2025 13:40 — 👍 1    🔁 0    💬 2    📌 0

Do people often perform clustering with you without well-defined cost functions 🤣? What type of similarity do you want to matter; isn't that always the first question?

16.10.2025 12:49 — 👍 1    🔁 0    💬 1    📌 0
Student Research Day 2025 - Biomedical Graduate Education Wednesday, October 15, 2025 Since 1985, Student Research Day has provided an on-campus forum for students to showcase their research pursuits. For students in School of Medicine Ph.D. programs and rel...

Excited to be speaking at the Georgetown Biomedical Student Research Day biomedicalprograms.georgetown.edu/students/stu...! It was particularly rewarding to be invited by a prev student from my bioinfo algo class. Looking forward to talking about the importance of free & open scientific software!

15.10.2025 14:12 — 👍 6    🔁 0    💬 0    📌 0

Drop down to inline asm? Anything else and I might be too afraid codegen would change over compiler versions.

15.10.2025 12:42 — 👍 1    🔁 0    💬 1    📌 0

Getting excited for #ASHG25. The must see talk is tomorrow at 1:30 by Qiuhui (Iris) Li presenting the first large-scale integration of long-read genome sequencing with electronic health record data across >1000 All of Us participants meetings.ashg.org/event/ASHG25...

14.10.2025 17:40 — 👍 43    🔁 9    💬 1    📌 0
Illustration of Burrows-Wheeler Transform and many auxiliary structures from the input string how$now$brown$cow$#

New tool "bwt-svg" for making illustrations of the BWT and the many auxiliary arrays and other structures related to it. Pyodide-based no-installation-necessary interface here: benlangmead.github.io/bwt-svg/. (H/t to @robert.bio for pointing me to pyodide!) Full repo: github.com/benlangmead/....

14.10.2025 20:48 — 👍 39    🔁 20    💬 4    📌 1
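For readers unfamiliar with what such a figure depicts, here is a minimal Python sketch of the textbook sorted-rotations construction of the BWT, plus a naive inverse. It is not taken from bwt-svg; the function names are made up for this example, and it assumes the terminator `#` occurs exactly once, at the end, as in the input string from the illustration.

```python
def bwt(text: str) -> str:
    """Return the last column of the sorted rotations of `text`."""
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last: str, terminator: str = "#") -> str:
    """Rebuild the original text from its BWT (naive, quadratic-space method).

    Repeatedly prepend the BWT column to the table and re-sort; after
    len(last) rounds the table holds every rotation in sorted order.
    Assumes `terminator` appears exactly once, as the final character.
    """
    table = [""] * len(last)
    for _ in range(len(last)):
        table = sorted(last[i] + table[i] for i in range(len(last)))
    return next(row for row in table if row.endswith(terminator))


text = "how$now$brown$cow$#"
transformed = bwt(text)
print(transformed)
print(inverse_bwt(transformed) == text)
```

Practical tools build the transform from a suffix array rather than materializing every rotation; the quadratic version is used here only because it is easy to read.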
P99 CONF Event 2025 – All Things Performance On-Demand P99 CONF is a cross-industry virtual event for _engineers_ and by engineers, centered around low-latency, high-performance design.

I'm giving a talk on "40x faster binary search" at #p99conf next Wednesday, Oct 22!

Register now :)

p99conf.io

14.10.2025 15:43 — 👍 9    🔁 1    💬 0    📌 0
GitHub - rchikhi/cuttlefish: Cuttlefish2 except that it also output approximate mean k+1-mer counts on the unitigs, and many other specific optimisations for my use case

Big scalability improvements, as well as support for colored-compacted dBGs from read sets (with an efficient color repr). The framework might be able to handle abundances too, but we've not focused on it. @rayanchikhi.bsky.social has a Cuttlefish2 fork with abundances github.com/rchikhi/cutt... !

14.10.2025 01:12 — 👍 1    🔁 0    💬 0    📌 0

Thanks, Titus! Can't wait for us to get Cuttlefish 3 published (we have nice improvements over even the preprint), but it's been an uphill battle.

14.10.2025 00:57 — 👍 1    🔁 0    💬 1    📌 0
Scalable, ultra-fast, and low-memory construction of compacted de Bruijn graphs with Cuttlefish 2 - Genome Biology The de Bruijn graph is a key data structure in modern computational genomics, and construction of its compacted variant resides upstream of many genomic analyses. As the quantity of genomic data grows rapidly, this often forms a computational bottleneck. We present Cuttlefish 2, significantly advancing the state-of-the-art for this problem. On a commodity server, it reduces the graph construction time for 661K bacterial genomes, of size 2.58Tbp, from 4.5 days to 17–23 h; and it constructs the graph for 1.52Tbp white spruce reads in approximately 10 h, while the closest competitor requires 54–58 h, using considerably more memory.

Reading the Cuttlefish 2 paper and I really admire the clarity and simplicity of the introduction. Makes me understand a few things in new, different, and productive ways. Lovely.

12.10.2025 14:15 — 👍 28    🔁 5    💬 1    📌 0

Not as long as Sunday was for Ravens fans... At least the Bills are actually good this year (so far).

13.10.2025 23:22 — 👍 1    🔁 0    💬 1    📌 0

Yea, that happens. But the planes almost always make it! Just a little bumpy (and no cabin service).

13.10.2025 21:45 — 👍 1    🔁 0    💬 1    📌 0

C'mon, you're just getting started enjoying seasons! Winter should be a good one.

13.10.2025 21:40 — 👍 1    🔁 0    💬 1    📌 0

As I've been saying from the start; literal monarchists.

13.10.2025 14:26 — 👍 4    🔁 0    💬 0    📌 0
