King to c7?
20.12.2024 16:22
@swe.dev.bsky.social
Co-founder/CTO @fennel.ai / Databases #DataBS / Distributed Systems / Infrastructure. @bothra90 on Twitter.
Caveat: Some of these could be unique to Fennel's architecture because of our reliance on Kafka for exactly-once semantics and recovery.
06.12.2024 05:43
Why use large batches at all? To amortize the cost of Kafka transactions, which we rely on for exactly-once semantics.
06.12.2024 05:43
The latter also keeps memory utilization proportional to mini-batch size.
06.12.2024 05:43
We got around that by internally sharding each batch of records and processing sub-shards in parallel.
We also break our batches down into mini-batches so that the output of the chain can be streamed to Kafka without waiting for the full batch to finish executing.
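A minimal sketch of the sub-sharding and mini-batching idea above, not Fennel's actual implementation. The names `run_batch`, `chunks`, and `double_evens` are hypothetical, the operator chain is modeled as a plain function, and records are split into contiguous slices (a real engine would more likely shard by key). Python threads are used only to show the structure; they do not give CPU parallelism for pure-Python work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence


def chunks(records: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def run_batch(
    batch: Sequence[int],
    chain: Callable[[Sequence[int]], List[int]],
    mini_batch_size: int,
    num_shards: int,
) -> Iterator[List[int]]:
    """Run `chain` over one batch, yielding output one mini-batch at a time."""
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        for mini in chunks(batch, mini_batch_size):
            # Split the mini-batch into at most `num_shards` sub-shards.
            shard_size = max(1, -(-len(mini) // num_shards))  # ceiling division
            shards = list(chunks(mini, shard_size))
            # pool.map runs the sub-shards concurrently and preserves input order.
            results = pool.map(chain, shards)
            # Yield as soon as this mini-batch is done, so a caller can stream
            # it downstream without waiting for the rest of the batch. Only one
            # mini-batch of output is held in memory at a time.
            yield [row for shard_out in results for row in shard_out]


def double_evens(rows: Sequence[int]) -> List[int]:
    """Hypothetical operator chain: filter even values, then double them."""
    return [r * 2 for r in rows if r % 2 == 0]


outputs = list(run_batch(list(range(10)), double_evens, mini_batch_size=4, num_shards=2))
print(outputs)  # [[0, 4], [8, 12], [16]]
```

Because `run_batch` is a generator, the caller decides when to pull the next mini-batch, which is what bounds memory by mini-batch size rather than batch size in this sketch.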
Cons: This architecture prevents concurrent, fully async operation of all operators, since each batch now has to be processed in full by the operator chain before the next batch can start. That, in turn, was preventing us from running at full throttle even when CPU capacity was available.
06.12.2024 05:43
Great thread from @micahw.com. Adding some of our own learnings from building this in Fennel.
An additional advantage for us was that it allowed us to keep data in columnar format for longer instead of converting back-and-forth between operators for serialization.
In hindsight, what would the right API for this look like?
27.11.2024 20:29
Yes, I think they do this so that the "a" region doesn't become a hotspot. Was definitely surprising when I found out, but ultimately made sense.
27.11.2024 19:44
Clusters are getting squeezed from above by smarter control planes, and from below by cheap and consistent object storage.
www.linkedin.com/pulse/contro...
It occupies a very interesting point in the design space of caches, but the fact that you can't immediately read your writes can be a problem that you still need to design for. I wonder if that is its undoing.
@jonhoo.eu might have more thoughts on this.
That was their implementation of Noria?
20.11.2024 08:10
We've built an IVM engine at Fennel that allows Python UDFs by leveraging a fleet of Python workers for execution while keeping the other operators in Rust. Hope to write a lot more about the technical details soon. One problem that we've had to solve is to provide IVM with time travel.
20.11.2024 07:59
TIL AWS un-launched S3 Select[1] as of July 25, 2024, presumably in favor of S3 Object Lambda[2]. RIP PushdownDB (arxiv.org/abs/2002.0...).
[1]: aws.amazon.com/blogs...
[2]: aws.amazon.com/s3/fe...