AFAIR circa 10 years ago ApacheCommons (or other popular library) had matcher using reflection to compare two objects. I think it even supported excluding fields, but not nested exclusion.
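A minimal sketch of the kind of matcher described above, assuming plain Python objects rather than any particular library's API. The names `reflection_equals`, `User`, and `Address` are hypothetical. Exclusion applies only to top-level fields, so a nested `updated_at` still takes part in the comparison.

```python
from dataclasses import dataclass


def reflection_equals(left, right, exclude_fields=()):
    """Compare two objects field by field, skipping top-level excluded names."""
    if type(left) is not type(right):
        return False
    excluded = set(exclude_fields)
    # vars() reads the instance attributes at runtime, a rough analogue of reflection.
    left_fields = {k: v for k, v in vars(left).items() if k not in excluded}
    right_fields = {k: v for k, v in vars(right).items() if k not in excluded}
    return left_fields == right_fields


@dataclass
class Address:
    city: str
    updated_at: int


@dataclass
class User:
    name: str
    address: Address
    updated_at: int
```

With this sketch, two `User` objects that differ only in the top-level `updated_at` compare equal once that field is excluded, while a difference in `address.updated_at` still makes them unequal.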
14.07.2025 11:51
@asatarin.bsky.social
Staff SRE at Google. Distributed systems / databases / reliability / correctness. Views my own. Repost / like is not an endorsement. http://asatarin.github.io
Calvin protocol has shards communicate read-write sets and values via messages.
10.07.2025 12:28
-------
"Jepsen 18: Serializable Mom", Kyle Kingsbury/@aphyr.com, #sd25
Engineers are tasked with building towers of abstraction, building everything higher and higher above the towering tire fire that is databases.
"I professionally set those tires on fire".
It's day two of Systems Distributed, hosted by @tigerbeetle.com! I'll be liveskeeting all of the talks, except mine (at 11 AM). Since the venue is a film museum, they're setting up special posters for each talk.
#sd25
AFAIU very few disaggregated / serverless databases scale to true zero. It's a pretty special feature.
20.06.2025 08:40
Greetings from #sd25! It's a two-day single-track conference hosted by @tigerbeetle.com. I'll be aiming to liveskeet as many of the talks as I can. Super excited to be here!
systemsdistributed.com
Are these different CPUs? E.g. oversubscribed or something? Should one compare to pure VM costs to see what the premium is for the database?
19.06.2025 08:29
A new #Jepsen report! We worked with TigerBeetle to find seven crashes, elevated latencies during single-node failures, and requests which were retried forever in version 0.16.11. We found only two safety issues: missing results for queries with multiple predicates, and incorrect timestamps in a […]
06.06.2025 10:53
Notepad++ is awesome.
A long time ago it was the only option (or one of the few) for opening 1 GB+ text files on Windows. Not sure if that is still the case.
If you are interested in more real world experience reports @cliffclick.bsky.social talked about it
youtu.be/GEkeOHw87Sg
A small issue in Amazon RDS for PostgreSQL: at the "Repeatable Read" isolation level, which in PostgreSQL normally means Snapshot Isolation, Amazon RDS for PostgreSQL clusters appear to exhibit Long Fork. We observed this behavior in healthy clusters, in versions ranging from 13.15 to 17.4 […]
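As a rough illustration of the anomaly named above: in a Long Fork, two writers update different keys and two readers observe those writes in incompatible orders, which Snapshot Isolation should not allow. The sketch below is a toy check over two hand-built read results, not the checker used in the report. The function name and the input shape are assumptions.

```python
def has_long_fork(read_a, read_b, key_x, key_y):
    """Detect the Long Fork pattern between two read-only observations.

    Each read maps a key to True if that reader observed the write to it.
    """
    a_sees_x_not_y = read_a[key_x] and not read_a[key_y]
    b_sees_y_not_x = read_b[key_y] and not read_b[key_x]
    a_sees_y_not_x = read_a[key_y] and not read_a[key_x]
    b_sees_x_not_y = read_b[key_x] and not read_b[key_y]
    # The readers disagree on which write happened first, in either direction.
    return (a_sees_x_not_y and b_sees_y_not_x) or (a_sees_y_not_x and b_sees_x_not_y)
```

If reader A sees the write to `x` but not `y`, and reader B sees `y` but not `x`, no single snapshot order explains both reads, and the function returns `True`.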
29.04.2025 14:29
All the videos are up at www.hytradboi.com/2025#program now.
01.03.2025 06:17
More on:
- reusing corpora of tests from existing systems
- metamorphic tests with SQLancer
- verifying the control plane with fault injection and fuzzing
Added to the list
https://asatarin.github.io/testing-distributed-systems/#feldera
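To make the metamorphic testing item concrete, here is a minimal sketch of one well-known SQLancer oracle, ternary logic partitioning: splitting a query by a predicate being true, false, or NULL must return the same rows as the unpartitioned query. It runs against SQLite via Python's `sqlite3` module. The helper `tlp_check` and the table `t` are hypothetical and are not taken from the Feldera post.

```python
import sqlite3


def tlp_check(conn, table, predicate):
    """Return True if partitioning by `predicate` preserves the row multiset."""
    base = conn.execute(f"SELECT * FROM {table}").fetchall()
    partitioned = []
    # Every row must satisfy exactly one of: p, NOT p, p IS NULL.
    for condition in (f"({predicate})", f"NOT ({predicate})", f"({predicate}) IS NULL"):
        partitioned += conn.execute(
            f"SELECT * FROM {table} WHERE {condition}"
        ).fetchall()
    # Sort by repr so rows containing NULL (None) can be ordered.
    return sorted(base, key=repr) == sorted(partitioned, key=repr)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)", [(1, 2), (3, None), (None, None), (5, 1)]
)
```

A real fuzzer would generate the table contents and predicates randomly; a `False` result points at a bug in the engine's predicate evaluation.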
"Correctness at Feldera" talks about various correctness techniques, including:
- machine proof of the underlying DBSP algorithm
- differential testing of the implementation
https://www.feldera.com/blog/correctness-at-feldera
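Differential testing, as listed above, can be sketched in a few lines: run the same random inputs through the implementation under test and a trusted reference, then report any disagreement. This is an illustrative toy, not Feldera's harness. The function under test is a hypothetical stand-in, and SQLite plays the role of the reference.

```python
import random
import sqlite3


def count_distinct_under_test(values):
    """Hypothetical implementation under test: COUNT(DISTINCT x), ignoring NULLs."""
    return len({v for v in values if v is not None})


def count_distinct_reference(values):
    """Reference result computed by SQLite on the same input."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    (count,) = conn.execute("SELECT COUNT(DISTINCT x) FROM t").fetchone()
    conn.close()
    return count


def differential_test(rounds=100, seed=0):
    """Return the inputs on which the two implementations disagree."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(rounds):
        values = [rng.choice([None, 0, 1, 2, 3]) for _ in range(rng.randint(0, 8))]
        if count_distinct_under_test(values) != count_distinct_reference(values):
            mismatches.append(values)
    return mismatches
```

An empty list from `differential_test()` means no disagreement was found on the sampled inputs; it does not prove the two implementations are equivalent.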
Buckle up because we're banging into the new year with my annual retrospective of the last year in databases! Highlights include license change blowback, Databricks vs. Snowflake gangwar, @duckdb.org's shotgun weddings, and buying a quarterback to impress your lover: www.cs.cmu.edu/~pavlo/blog/...
01.01.2025 14:02
Curated list of materials on testing SQL database engines is public now
github.com/asatarin/tes...
This is a sample from a list "testing-sql-databases" I have.
It's still a draft, most likely missing a ton from big tech, startups and academia alike, and it's not published.
Good old Microsoft published some work:
- "Deploying a Steered Query Optimizer in Production at Microsoft" dl.acm.org/doi/abs/10.1...
- This great talk "The Cascades Framework for Query Optimization at Microsoft" touches on correctness youtu.be/pQe1LQJiXN0
Work on correctness of optimizers from Greenplum
- "Automatic capture of minimal, portable, and executable bug repros using AMPERe" dl.acm.org/doi/10.1145/...
- "Testing the accuracy of query optimizers" dl.acm.org/doi/10.1145/...
Similar work from Databricks:
- "SparkFuzz: searching correctness regressions in modern query engines" dl.acm.org/doi/abs/10.1...
- "Correctness and Performance of Apache Spark SQL" youtu.be/fddBOZxdUKI
You already mentioned "Snowtrail: Testing with Production Queries on a Cloud Database" from Snowflake
dl.acm.org/doi/10.1145/...
More correctness work from MongoDB, not yet on my list
"Verifying Transactional Consistency of MongoDB"
arxiv.org/abs/2111.14946
MongoDB team did great work on performance:
- "Creating a Virtuous Cycle in Performance Testing at MongoDB" dl.acm.org/doi/10.1145/...
- "The Use of Change Point Detection ..." dl.acm.org/doi/abs/10.1...
- "Fair Benchmarking Considered Difficult" mytherin.github.io/papers/2018-...
To counterbalance your argument, here is some big tech work on the correctness of (SQL) databases, in the thread below.
Spanner has incredibly sophisticated randomly generated checks internally; this just scratches the surface:
- "Randomized Testing of Cloud Spanner"
medium.com/@jcorbett_26...
Get your point, but to be fair, this list is not targeting correctness of databases (SQL or otherwise).
It almost entirely excludes anything single node or targeting single threaded execution (like a fuzzer), with some late additions in
asatarin.github.io/testing-dist...
I'd be happy to learn that this is just a gap in my testing knowledge, or that there's a bunch of secret testing systems in big companies that I don't know about, but even just skimming the list of asatarin.github.io/testing-dist... kind of illustrates my point. (Thanks again @asatarin.bsky.social!)
12.12.2024 01:04
Deterministic simulation has been entirely pushed by startups. Jepsen is jepsen. All the SQL fuzzing stuff I know of comes from academia or database startups. Larger companies seem to only be winning in the application of formal methods (because they can afford to hire a team of ex-professors?)
12.12.2024 01:04
When I think of advancements in quality and correctness of databases, it feels most things have come from startups or individuals, and not large well-established companies. Which seems... backwards? We talk of startups hacking out code and megacorps crawling to keep a high quality bar.
12.12.2024 01:04