Alex Miller

@alexmillerdb.bsky.social

Follow for database internals content.

1,844 Followers  |  130 Following  |  271 Posts  |  Joined: 22.10.2024

Latest posts by alexmillerdb.bsky.social on Bluesky

Cloudflare Pages: Compression issues with custom hostnames (Cloudflare's Status Page)

If you have a blog hosted on Cloudflare Pages and any part of your CSS looks missing only on Safari, it's because of www.cloudflarestatus.com/incidents/ps..., and you have to purge the cache to get rid of the cached, wrongly compressed asset files.

04.08.2025 20:33  |  👍 1    🔁 0    💬 0    📌 0

Philz funding round vs post-money valuation history isn’t looking too great

03.08.2025 19:17  |  👍 3    🔁 0    💬 0    📌 0

Also note that TPS / 60 != tpmC, and it turns out that they're correct about that, and the TPC-C spec does define it as the number of completed (or rolled back) new order transactions only.

03.08.2025 05:55  |  👍 1    🔁 0    💬 0    📌 0
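
A minimal sketch of the distinction in the post above, assuming a hypothetical log of `(completion_time_seconds, transaction_type)` tuples. The log format and the function names are illustrative only and are not taken from any TPC-C kit. New-Order entries are counted whether they committed or rolled back, as the post describes.

```python
def tpmc(log, measurement_seconds):
    """New-Order transactions per minute over the measurement interval.

    `log` is a hypothetical list of (completion_time_seconds, kind) tuples.
    Only entries whose kind is "new_order" are counted.
    """
    new_orders = sum(1 for _, kind in log if kind == "new_order")
    return new_orders / (measurement_seconds / 60.0)


def total_tps(log, measurement_seconds):
    """All transactions of every type per second, for comparison."""
    return len(log) / measurement_seconds


def sample_log():
    """An illustrative 120-second run with 1,000 transactions in total."""
    counts = {
        "new_order": 450,
        "payment": 430,
        "order_status": 40,
        "delivery": 40,
        "stock_level": 40,
    }
    log = []
    for kind, count in counts.items():
        # Spread completion times evenly; only the counts matter here.
        log.extend((i * 120.0 / count, kind) for i in range(count))
    return log
```

With the sample log, the overall rate scaled to a minute is well above the New-Order-only rate, so converting a raw TPS figure into tpmC overstates the result.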

Not only did GaussDB run TPC-C wrong (without stating so!), they gave cockroach a capitalized R.

> We compared the end-to-end performance of GaussDB with System-X and CockRoachDB on TPC-C with 10,000 warehouses.

03.08.2025 05:48  |  👍 2    🔁 0    💬 1    📌 0

@graydon-pub.bsky.social, re: consensus graydon2.dreamwidth.org/319018.html, I think you'd also really enjoy ^ if you hadn't seen it before.

02.08.2025 19:06  |  👍 2    🔁 0    💬 1    📌 0
A Failed Experiment with Siso

I gave up, and finally pushed all my notes to transactional.blog/blog/2025-mo.... It has how to get the remote execution *to* buildbuddy working, but I think siso devs will need to add some better path munging support before it can actually work as a replacement for CMake's use of ninja.

01.08.2025 20:19  |  👍 0    🔁 0    💬 0    📌 0

Solution: JSONTiles!

Sneller did a 16-way bucketed partitioning of JSON in Amazon Ion format for their analytics, which I guess worked out okay, but JSONTiles is cooler.

31.07.2025 17:33  |  👍 0    🔁 0    💬 1    📌 0
Three fundamental flaws of SIMD ISA:s – Bits'n'Bites

A good read on other, nicer ways that ISAs can represent vectorized loops: www.bitsnbites.eu/three-fundam...

30.07.2025 18:26  |  👍 5    🔁 1    💬 0    📌 0

I would be ever grateful if someone would be willing to go build a wrapper for PKGBUILD that is detached from any particular distro so that we can do away with Vcpkg/Conan/WrapDB/etc. and just re-use the same effort that distros put into packaging C/C++ dependencies.

30.07.2025 18:13  |  👍 2    🔁 0    💬 0    📌 0

In line with previous research [16, 29], we set the think/keying time in TPC-C to zero.

30.07.2025 05:42  |  👍 3    🔁 0    💬 1    📌 0

I wish the bsky recommendations feed was this accurate for me. Thanks!

29.07.2025 19:57  |  👍 1    🔁 0    💬 1    📌 0

If you had the authors for a podcast, I’d be a lot more interested to hear about what evaluations didn’t make the paper, what pivots happened to the original design, what the author was surprised about in the behavior of their or the baseline system, etc. Everything one doesn’t include in a paper.

29.07.2025 00:18  |  👍 4    🔁 0    💬 1    📌 0
Bounding retries of write-write conflicts in MVCC+2PL | snowytrees.dev

I wrote up on bounding retries of write-write conflicts in MVCC+2PL. There is surprisingly little publicly written on how databases can retry, so doing my part to fill that gap in :) snowytrees.dev/blog/boundin...

25.07.2025 17:16  |  👍 7    🔁 3    💬 0    📌 0

Looking forward to your new series!

Criticizing papers in public without it possibly feeling like a non-consensual roasting to the author is a line that I don’t know how to navigate though…

25.07.2025 00:12  |  👍 2    🔁 0    💬 1    📌 0
South Bay Systems: ChatGPT Ain’t Got $%@& On Me! The Future of Automated Database Tuning · Luma We're excited to feature Andy Pavlo, illustrious database professor at CMU, to talk about database tuning. This meetup's venue, food and drinks, are generously…

Attention, South Bay folk! We have The Databaseologist, @andypavlo.bsky.social, giving a talk in the bay on August 6th. Come join us for a great time in hearing:

ChatGPT Ain’t Got $%@& On Me! The Future of Automated Database Tuning

Register now! https://lu.ma/ha0dc4nj

23.07.2025 21:26  |  👍 13    🔁 2    💬 0    📌 2

AFAICT it’s about double. XXH3 comes in at about 28GB/s in cache on my 5 year old laptop. I see benchmarks posted with 52GB/s for crc-nvme using VPCLMULQDQ. I’d still vote for XXH3 for RocksDB though as CRC guaranteeing single bitflip detection is less useful when your flash is already doing ECC.

23.07.2025 06:03  |  👍 3    🔁 0    💬 2    📌 0
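
The single-bitflip guarantee mentioned in the post above can be checked by brute force on a small buffer. This is only a sketch: it uses `zlib.crc32` from the Python standard library as a stand-in, which is a different CRC from the crc-nvme variant named in the post, and the helper names are hypothetical. It says nothing about throughput.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def undetected_single_bit_flips(data: bytes) -> int:
    """Count single-bit corruptions that leave the CRC-32 unchanged."""
    original = zlib.crc32(data)
    return sum(
        1
        for i in range(len(data) * 8)
        if zlib.crc32(flip_bit(data, i)) == original
    )
```

For a CRC the count is zero for any input, because a single-bit error can never be a multiple of the generator polynomial. A general-purpose hash such as XXH3 offers no such structural guarantee, only a very low collision probability, which is the trade-off the post weighs against flash that already performs ECC.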
Consensus algorithms at scale: Part 1 - Introduction | PlanetScale This is a multi-part blog series and will be updated with links to the corresponding posts.

I had missed @ssougou.bsky.social's blog post series on consensus when it was originally posted. I really like the perspective of breaking down Raft/Paxos/etc. into the individual actions that comprise consensus.
planetscale.com/blog...

20.07.2025 18:59  |  👍 35    🔁 5    💬 1    📌 0

If they're willing to also sponsor some food or drinks that's doubly amazing. We've had a few places already volunteer, but it'd be nice to not strain their generosity. :)

17.07.2025 02:29  |  👍 1    🔁 0    💬 0    📌 0

To South Bay folks:

South Bay Systems is currently limited by finding venues willing to host. If you know of somewhere that can host 60+ people and is happy getting a few minutes for a quick "here's who we are, what we do, and who to talk to for hiring" in return, please let me know!

17.07.2025 02:29  |  👍 1    🔁 0    💬 1    📌 1

For some reason that doesn’t seem to help? github.com/cmu-db/bench... has done a lot of work including newly proposed benchmarks, but I almost never see either the tool or its benchmarks used?

17.07.2025 00:27  |  👍 1    🔁 0    💬 1    📌 0

Generating and loading datasets is also a significant hurdle to solve, and even tools for that alone are of great help. See, for example, datafusion.apache.org/blog/2025/04... introducing github.com/clflushopt/t...

16.07.2025 22:49  |  👍 0    🔁 0    💬 0    📌 0

Making it easy to do the right thing means people will do the right thing, and I strongly suspect that if a tpc-rs project appeared in which the defaults all did the standards-compliant thing, it'd get a lot of users quickly.

16.07.2025 22:46  |  👍 2    🔁 0    💬 1    📌 0

The easiest things to use (sysbench-tpcc, go-tpc, pgbench) are all non-standards-conforming, but they're the easiest things to use! Credit to PingCAP that go-tpc is the closest (it has TPC-C keying time!), but keeping the required 10:1 terminal:warehouse ratio isn't inherent.

16.07.2025 22:45  |  👍 0    🔁 0    💬 1    📌 0

If you'd like to do something great for the database community, write an easy-to-use implementation of the TPC-* benchmarks. Bonus points if it comes with blog posts describing what it is in a database that each TPC-* benchmark is actually testing.

16.07.2025 22:45  |  👍 3    🔁 0    💬 2    📌 0

No worries! DDCG post was of great help already!

16.07.2025 05:48  |  👍 1    🔁 0    💬 0    📌 0

I've been looking at Copy-and-Patch, which suggests also having superstencils that cover combinations of opcodes, and incorporating those efficiently seems like the same problem as doing good instruction selection? I've seen iburg suggested, but I'm not a compiler person, so idk what I'm doin

15.07.2025 21:34  |  👍 1    🔁 0    💬 1    📌 0

@bernsteinbear.com Would you happen to have a suggestion for an easy instruction selection algorithm, sort of like the recommendation of DDCG for quick and easy register allocation?

15.07.2025 21:34  |  👍 2    🔁 0    💬 1    📌 0

Calvin, Fauna, and now Accord can all have an argument that they support interactivity because you can execute against a stale snapshot, and then submit a transaction which re-validates your read set OCC style. So you can build interactivity on them, but the deterministic db core can’t.

15.07.2025 17:59  |  👍 4    🔁 1    💬 0    📌 0
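
The pattern described in the post above can be sketched with a toy in-memory store. Everything here is hypothetical (the `Store` class, `snapshot_read`, `submit`, and per-key version counters) and does not reflect the actual API of Calvin, Fauna, or Accord. The point is only that the client runs its interactive logic against a possibly stale snapshot, then hands the core a one-shot transaction that re-validates the read set before applying writes.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy single-node stand-in for a deterministic database core."""

    data: dict = field(default_factory=dict)  # key -> (version, value)

    def snapshot_read(self, key):
        """Read a (version, value) pair; may be stale by submit time."""
        return self.data.get(key, (0, None))

    def submit(self, read_set, write_set):
        """One-shot transaction: validate the read set, then apply writes.

        `read_set` maps key -> version observed by the client.
        Returns False without writing if any observed version is stale.
        """
        for key, seen_version in read_set.items():
            if self.data.get(key, (0, None))[0] != seen_version:
                return False
        for key, value in write_set.items():
            version = self.data.get(key, (0, None))[0]
            self.data[key] = (version + 1, value)
        return True


def interactive_increment(store, key):
    """Client-side 'interactive' logic layered over one-shot submits."""
    while True:
        version, value = store.snapshot_read(key)
        new_value = (value or 0) + 1  # arbitrary client-side computation
        if store.submit({key: version}, {key: new_value}):
            return new_value
        # Validation failed: another writer got in first, so retry.
```

The retry loop lives entirely in the client. The core only ever sees transactions whose read and write sets are known up front, which matches the post's point that interactivity is built on top of the deterministic core rather than inside it.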

VoltDB strongly encourages you to act like it doesn’t have interactive transactions docs.voltdb.com/UsingVoltDB/...

Datomic is just unusual overall. jepsen.io/analyses/dat... 1.2 Transaction Model is the simplest overview I saw.

15.07.2025 17:59  |  👍 2    🔁 0    💬 1    📌 0

A distributed systems reliability glossary

jepsen x antithesis

antithesis.com/resources/re...

15.07.2025 15:57  |  👍 31    🔁 5    💬 1    📌 1

@alexmillerdb is following 20 prominent accounts