Andrey Satarin

@asatarin.bsky.social

Staff SRE at Google. Distributed systems / databases / reliability / correctness. Views my own. Repost / like is not an endorsement. http://asatarin.github.io

526 Followers  |  143 Following  |  37 Posts  |  Joined: 15.11.2024  |  2.3092

Latest posts by asatarin.bsky.social on Bluesky

AFAIR, circa 10 years ago Apache Commons (or another popular library) had a matcher that used reflection to compare two objects. I think it even supported excluding fields, but not nested exclusion.

14.07.2025 11:51 | 👍 1 | 🔁 0 | 💬 0 | 📌 0
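
To illustrate the idea in the post above, here is a minimal Python sketch of a reflection-style equality check with field exclusion. The helper `reflection_equals` and the `User` / `Address` classes are hypothetical names made up for this example; this is not the API of Apache Commons or any other library. It also shows the limitation the post mentions: exclusion applies only to top-level fields, so an excluded name inside a nested object is still compared.

```python
from dataclasses import dataclass


def reflection_equals(a, b, exclude=()):
    """Compare two objects field by field via introspection.

    Hypothetical helper: field names listed in `exclude` are skipped,
    but only at the top level of the two objects.
    """
    if type(a) is not type(b):
        return False
    excluded = set(exclude)
    fields_a = {k: v for k, v in vars(a).items() if k not in excluded}
    fields_b = {k: v for k, v in vars(b).items() if k not in excluded}
    return fields_a == fields_b


@dataclass
class Address:
    city: str
    updated_at: int


@dataclass
class User:
    name: str
    address: Address
    updated_at: int


u1 = User("ann", Address("Oslo", 1), 100)
u2 = User("ann", Address("Oslo", 1), 200)  # differs only in top-level updated_at
u3 = User("ann", Address("Oslo", 2), 100)  # differs only in nested updated_at

# Top-level exclusion hides the difference between u1 and u2.
same_top = reflection_equals(u1, u2, exclude=["updated_at"])
# The nested Address is compared with its own __eq__, so updated_at still counts.
same_nested = reflection_equals(u1, u3, exclude=["updated_at"])
```

Supporting nested exclusion would need a recursive walk that carries the exclusion list (or field paths) down into each nested object, which this sketch deliberately leaves out.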

Calvin protocol has shards communicate read-write sets and values via messages.

10.07.2025 12:28 | 👍 1 | 🔁 1 | 💬 0 | 📌 0

-------

"Jepsen 18: Serializable Mom", Kyle Kingsbury / @aphyr.com, #sd25

Engineers are tasked with building towers of abstraction, building everything higher and higher above the towering tire fire that is databases.

"I professionally set those tires on fire".

20.06.2025 13:22 | 👍 4 | 🔁 1 | 💬 1 | 📌 0

It's day two of Systems Distributed, hosted by @tigerbeetle.com! I'll be liveskeeting all of the talks, except mine (at 11 AM). Since the venue is a film museum, they're setting up special posters for each talk.

#sd25

20.06.2025 06:36 | 👍 36 | 🔁 4 | 💬 2 | 📌 1

AFAIU very few disaggregated / serverless databases scale to true zero. It's a pretty special feature.

20.06.2025 08:40 | 👍 1 | 🔁 0 | 💬 0 | 📌 0
Systems Distributed '25 A conference to teach systems programming and thinking, and how to apply these ideas. All the way across the stack. From systems languages and compilers, to databases and distributed systems.

Greetings from #sd25! It's a two-day single-track conference hosted by @tigerbeetle.com. I'll be aiming to liveskeet as many of the talks as I can. Super excited to be here!

systemsdistributed.com

19.06.2025 06:25 | 👍 65 | 🔁 13 | 💬 2 | 📌 1

Are these different CPUs? E.g. oversubscribed or something? Should one compare to pure VM costs to see what the premium is for the database?

19.06.2025 08:29 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
Original post on mastodon.jepsen.io

A new #Jepsen report! We worked with TigerBeetle to find seven crashes, elevated latencies during single-node failures, and requests which were retried forever in version 0.16.11. We found only two safety issues: missing results for queries with multiple predicates, and incorrect timestamps in a […]

06.06.2025 10:53 | 👍 42 | 🔁 12 | 💬 1 | 📌 2

Notepad++ is awesome.

A long time ago it was the only option (or one of the few) for opening 1 GB+ text files on Windows. Not sure if that is still the case.

29.05.2025 01:28 | 👍 0 | 🔁 0 | 💬 0 | 📌 0
Cliff Click - The Azul Hardware Transactional Memory experience
YouTube video by Hydra

If you are interested in more real-world experience reports, @cliffclick.bsky.social talked about it:

youtu.be/GEkeOHw87Sg

17.05.2025 22:45 | 👍 3 | 🔁 1 | 💬 0 | 📌 0
Original post on mastodon.jepsen.io

A small issue in Amazon RDS for PostgreSQL: at the "Repeatable Read" isolation level, which in PostgreSQL normally means Snapshot Isolation, Amazon RDS for PostgreSQL clusters appear to exhibit Long Fork. We observed this behavior in healthy clusters, in versions ranging from 13.15 to 17.4 […]

29.04.2025 14:29 | 👍 20 | 🔁 15 | 💬 1 | 📌 1
HYTRADBOI 2025 HYTRADBOI is a fun online conference about databases, programming languages, and everything in between.

All the videos are up at www.hytradboi.com/2025#program now.

01.03.2025 06:17 | 👍 39 | 🔁 17 | 💬 0 | 📌 0
Testing Distributed Systems Curated list of resources on testing distributed systems

More on:
- reusing corpora of tests from existing systems
- metamorphic tests with SQLancer
- verifying the control plane with fault injection and fuzzing

Added to the list
https://asatarin.github.io/testing-distributed-systems/#feldera

13.01.2025 16:49 | 👍 4 | 🔁 0 | 💬 1 | 📌 0
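
For readers unfamiliar with metamorphic testing of SQL engines, the sketch below shows one such check in the style of the ternary logic partitioning oracle that SQLancer implements: every row must satisfy exactly one of `p`, `NOT p`, or `p IS NULL`, so the three filtered queries together must return the same rows as the unfiltered query. This is an illustration only, run against an in-memory SQLite database; the table, the predicates, and the helper `tlp_holds` are assumptions made up for the example, not SQLancer's or Feldera's code.

```python
import sqlite3
from collections import Counter

# Small in-memory table with NULLs, so all three partitions are exercised.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 2), (3, None), (None, 5), (4, 4), (0, -1)],
)


def tlp_holds(conn, predicate):
    """Check that partitioning by `predicate` preserves the multiset of rows."""
    all_rows = Counter(conn.execute("SELECT a, b FROM t").fetchall())
    parts = Counter()
    for variant in (
        f"({predicate})",
        f"NOT ({predicate})",
        f"({predicate}) IS NULL",
    ):
        rows = conn.execute(f"SELECT a, b FROM t WHERE {variant}").fetchall()
        parts.update(rows)
    return all_rows == parts


# Hand-picked predicates; a real tool would generate these randomly.
results = {p: tlp_holds(conn, p) for p in ["a > b", "a = b", "a + b > 3"]}
```

A mismatch for any predicate would point at a logic bug in the engine, without needing a second implementation to compare against.
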
Correctness at Feldera In this blog post, we briefly describe our efforts and development processes that ensure Feldera's engine is correct.

"Correctness at Feldera" talks about various correctness techniques, including:
- machine proof of the underlying DBSP algorithm
- differential testing of the implementation

https://www.feldera.com/blog/correctness-at-feldera

13.01.2025 16:49 | 👍 4 | 🔁 1 | 💬 1 | 📌 0
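
As a rough illustration of differential testing, the sketch below feeds the same random inputs to a SQL engine and to a simple reference implementation, then compares the results. It uses SQLite as a stand-in engine and a hand-written Python aggregation as the reference; none of this reflects Feldera's actual test harness, and all function names are assumptions made up for the example.

```python
import random
import sqlite3


def reference_sum_by_key(rows):
    """Reference implementation of SUM(v) GROUP BY k, written in plain Python."""
    totals = {}
    for k, v in rows:
        totals[k] = totals.get(k, 0) + v
    return sorted(totals.items())


def engine_sum_by_key(rows):
    """Same computation, executed by the system under test (SQLite here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (k INTEGER, v INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT k, SUM(v) FROM t GROUP BY k ORDER BY k"
    ).fetchall()
    conn.close()
    return result


def run_differential_test(trials=100, seed=0):
    """Return every random input on which the two implementations disagree."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        size = rng.randint(0, 20)
        rows = [(rng.randint(0, 5), rng.randint(-100, 100)) for _ in range(size)]
        if engine_sum_by_key(rows) != reference_sum_by_key(rows):
            mismatches.append(rows)
    return mismatches


mismatches = run_differential_test()
```

Any input collected in `mismatches` is a ready-made reproduction case; the fixed seed keeps the run repeatable.
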
Databases in 2024: A Year in Review Andy rises from the ashes of his dead startup and discusses what happened in 2024 in the database game.

Buckle up because we're banging into the new year with my annual retrospective of the last year in databases! Highlights include license change blowback, Databricks vs. Snowflake gangwar, @duckdb.org's shotgun weddings, and buying a quarterback to impress your lover: www.cs.cmu.edu/~pavlo/blog/...

01.01.2025 14:02 | 👍 203 | 🔁 66 | 💬 11 | 📌 20
GitHub - asatarin/testing-sql-databases: Curated list of materials on testing SQL database engines

Curated list of materials on testing SQL database engines is public now

github.com/asatarin/tes...

12.12.2024 06:43 | 👍 14 | 🔁 5 | 💬 0 | 📌 0

This is a sample from my "testing-sql-databases" list.

It's a rougher draft, most likely missing a ton from big tech, startups, and academia alike, and it is not published.

12.12.2024 06:35 | 👍 3 | 🔁 1 | 💬 1 | 📌 1
Deploying a Steered Query Optimizer in Production at Microsoft | Proceedings of the 2022 International Conference on Management of Data

Good old Microsoft published some work:
- "Deploying a Steered Query Optimizer in Production at Microsoft" dl.acm.org/doi/abs/10.1...
- This great talk "The Cascades Framework for Query Optimization at Microsoft" touches on correctness youtu.be/pQe1LQJiXN0

12.12.2024 06:35 | 👍 2 | 🔁 1 | 💬 1 | 📌 0
Automatic capture of minimal, portable, and executable bug repros using AMPERe | Proceedings of the Fifth International Workshop on Testing Database Systems

Work on correctness of optimizers from Greenplum
- "Automatic capture of minimal, portable, and executable bug repros using AMPERe" dl.acm.org/doi/10.1145/...
- "Testing the accuracy of query optimizers" dl.acm.org/doi/10.1145/...

12.12.2024 06:35 | 👍 2 | 🔁 1 | 💬 1 | 📌 0
SparkFuzz | Proceedings of the workshop on Testing Database Systems

Similar work from Databricks:
- "SparkFuzz: searching correctness regressions in modern query engines" dl.acm.org/doi/abs/10.1...
- "Correctness and Performance of Apache Spark SQL" youtu.be/fddBOZxdUKI

12.12.2024 06:35 | 👍 3 | 🔁 1 | 💬 1 | 📌 0
Snowtrail | Proceedings of the Workshop on Testing Database Systems

You already mentioned "Snowtrail: Testing with Production Queries on a Cloud Database" from Snowflake
dl.acm.org/doi/10.1145/...

12.12.2024 06:35 | 👍 3 | 🔁 1 | 💬 1 | 📌 0
Verifying Transactional Consistency of MongoDB MongoDB is a popular general-purpose, document-oriented, distributed NoSQL database. It supports transactions in three different deployments: single-document transactions utilizing the WiredTiger stor...

More correctness work from MongoDB, not yet on my list
"Verifying Transactional Consistency of MongoDB"
arxiv.org/abs/2111.14946

12.12.2024 06:35 | 👍 3 | 🔁 1 | 💬 1 | 📌 0
Creating a Virtuous Cycle in Performance Testing at MongoDB | Proceedings of the ACM/SPEC International Conference on Performance Engineering

MongoDB team did great work on performance:
- "Creating a Virtuous Cycle in Performance Testing at MongoDB" dl.acm.org/doi/10.1145/...
- "The Use of Change Point Detection ..." dl.acm.org/doi/abs/10.1...
- "Fair Benchmarking Considered Difficult" mytherin.github.io/papers/2018-...

12.12.2024 06:35 | 👍 3 | 🔁 1 | 💬 1 | 📌 0
Randomized Testing of Cloud Spanner One of the secrets behind Cloud Spanner quality is randomized testing. SQL databases like Cloud Spanner have complex APIs. Complete unit…

To counterbalance your argument, here is some big tech work on the correctness of (SQL) databases, in the thread below.

Spanner has incredibly sophisticated randomly generated checks internally; this just scratches the surface:
- "Randomized Testing of Cloud Spanner"
medium.com/@jcorbett_26...

12.12.2024 06:35 | 👍 5 | 🔁 1 | 💬 1 | 📌 0
Testing Distributed Systems Curated list of resources on testing distributed systems

Get your point, but to be fair, this list is not targeting correctness of databases (SQL or otherwise).

It almost entirely excludes anything single node or targeting single threaded execution (like a fuzzer), with some late additions in
asatarin.github.io/testing-dist...

12.12.2024 06:35 | 👍 3 | 🔁 1 | 💬 1 | 📌 0
Testing Distributed Systems Curated list of resources on testing distributed systems

I'd be happy to learn that this is just a gap in my testing knowledge, or that there's a bunch of secret testing systems in big companies that I don't know about, but even just skimming the list of asatarin.github.io/testing-dist... kind of illustrates my point. (Thanks again @asatarin.bsky.social!)

12.12.2024 01:04 | 👍 5 | 🔁 1 | 💬 3 | 📌 0

Deterministic simulation has been entirely pushed by startups. Jepsen is Jepsen. All the SQL fuzzing stuff I know of comes from academia or database startups. Larger companies seem to only be winning in the application of formal methods (because they can afford to hire a team of ex-professors?)

12.12.2024 01:04 | 👍 6 | 🔁 1 | 💬 1 | 📌 0

When I think of advancements in quality and correctness of databases, it feels like most things have come from startups or individuals, and not large well-established companies. Which seems... backwards? We talk of startups hacking out code and megacorps crawling to keep a high quality bar.

12.12.2024 01:04 | 👍 10 | 🔁 2 | 💬 1 | 📌 0

@asatarin is following 20 prominent accounts