Love the idea. Could some of these eventually become subprojects, hosted in the SlateDB organization as separate repos? Filing the projects that have that potential as GitHub issues with a specific tag would make them easy to track.
07.07.2025 01:37
Insane amount of SlateDB work going on:
- snapshot reads
- split/merge DBs (zero copy)
- deterministic simulation testing
And someone just pushed Python bindings in a PR! 🤯
18.06.2025 14:48
YouTube video by Data Council
Internals of SlateDB: An Embedded Key Value Store Built On Object Storage
My Data Council talk on SlateDB.
youtu.be/gcTRXZeKbNg?...
30.05.2025 05:33
Got it. So, if I wanted a view to update incrementally, say once an hour, would I create an "hourly view" that uses now() and join against it?
27.05.2025 16:53
Clock tick as an input is indeed a way to model it! Would the clock tick table be joined in all views that need this property?
27.05.2025 14:45
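A minimal sketch of the clock-tick idea, assuming a toy in-memory view rather than any real IVM engine. `HourlyCountView`, `insert_event`, and `advance_tick` are hypothetical names made up for illustration. Events are buffered as they arrive, and the view only changes when the tick input advances, so it moves in predictable increments.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class HourlyCountView:
    """Toy view: per-key event counts, visible only up to the latest tick.

    Hypothetical illustration; not the API of any IVM system.
    """

    def __init__(self):
        self.tick = None                # latest clock-tick input seen
        self.pending = []               # events not yet folded into the view
        self.counts = defaultdict(int)  # materialized result

    def insert_event(self, ts, key):
        # Data input: buffer the event; the view does not change yet.
        self.pending.append((ts, key))

    def advance_tick(self, tick):
        # Clock input: fold in only the events with ts < tick (the delta),
        # leaving later events pending for a future tick.
        self.tick = tick
        still_pending = []
        for ts, key in self.pending:
            if ts < tick:
                self.counts[key] += 1
            else:
                still_pending.append((ts, key))
        self.pending = still_pending
        return dict(self.counts)
```

In this sketch the tick is just another input, so a view that needs the property takes it explicitly, which is roughly what joining against a clock-tick table would express in SQL.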
Finally got to read this.
One additional aspect of IVM is reasoning about the data in the computed view. For a lot of use cases, it is easy to think of a view/table as moving in predictable increments (day, hour, 15 minutes, etc.). This notion is not modeled as a first-class concept in many systems.
26.05.2025 23:28
SlateDB 0.6.0 is out!
github.com/slatedb/slat...
Highlights include a hybrid cache (using Foyer), a lot of internal cleanup, and more groundwork for transactions.
Oh, and put performance jumped ~80% for write-heavy workloads :)
slatedb.io/performance/...
24.04.2025 19:04
SlateDB - An embedded storage engine built on object storage | SlateDB
Today marks SlateDB's one year anniversary! It's been a lot of fun. Thanks to @rohanpd.bsky.social @flaneur2024.bsky.social @almog.ai @vigneshc.bsky.social @paulbutler.org Jason Gustafson, David Moravek, and many others for joining the project.
22.04.2025 21:55
Commonhaus Turns One - A Look Back, and the Road Ahead
Commonhaus Foundation celebrates its first anniversary and lays down expectations for its future
Commonhaus is 1!
14 projects, solid foundations, and more on the way.
If you believe in light governance, shared care, and thoughtful support for open source, come see what we're building.
www.commonhaus.org/activity/253...
10.04.2025 14:05
Yo SF Bay Area #databs crew, want to talk lakehouses at a real Lake House? :)
Next week after Data Council, join the founders of @clickhouse.com, @motherduck.com, @startreedata.bsky.social, and @tobikodata.com to talk real-time databases and next-generation ETL.
www.rilldata.com/events/data-...
15.04.2025 23:44
Release v0.5.0 · slatedb/slatedb
What's Changed
Refactor Block Tests to Use Table-Driven Test Cases by @samsond in #410
Update await calls in README.md by @criccomini in #425
chore: Apply table driven test for sst.rs by @jeffreyl...
SlateDB 0.5.0 is out!
Features:
- Checkpoints
- Clones
- Read only client
- Split/merge database foundation
- TTL filtering on reads
- Last version with breaking byte format changes
By the numbers:
- 62 commits
- 2 new contributors
- 10 total contributors
github.com/slatedb/slat...
17.03.2025 17:23
CALL FOR GRAND CHALLENGE SOLUTIONS
DEBS2025
The DEBS conference hosts a grand challenge every year. This year's challenge is detecting outliers in a stream of images from laser powder bed fusion.
The challenge involves submitting a Kubernetes app (constraint: 2 cores, 8 GB). Interesting to try if you have the time!
2025.debs.org/call-for-gra...
23.02.2025 18:40
Great episode!
Towards the end, @vanlightly.bsky.social mentions alloytools.org finding a data model bug.
Never thought of an intersection between data modeling and formal verification. Do you have more details on this?
15.02.2025 04:06
Python Folks - which data/workflow engine has the best developer experience for packaging code? We have looked into - Modal, Beam, Airflow, Flyte, AWS Lambda, Prefect, Dagster and Spark. Haven't seen any approach which is fast, reliable and intuitive.
17.12.2024 16:09
YouTube video by BDB
Big Data Bellevue: Apache Gluten: Accelerating SparkSQL with Spark on Velox
Great talk by Binwei Yang on Apache Gluten last week.
youtu.be/GWTj3INSzPg?...
Apache Gluten moves execution of Spark operators to a native backend like Velox, accelerating query performance.
It has basic Iceberg support too!
github.com/apache/incub...
19.01.2025 02:06
This book was on my list for the year, joining!
11.01.2025 19:20
SlateDB 0.4.0 is out!
Features:
- Range scans
- No DynamoDB needed for S3
- Nightly perf tests
- Merge operator groundwork
- GC improvements
By the numbers:
- 57 commits
- 5 new contributors
- 11 total contributors
github.com/slatedb/slat...
31.12.2024 18:50
Finally, the blog from 2023 says it isn't used in production yet. Any recent data points on production experience that can be shared now? :)
15.12.2024 21:09
What do you think about Flink materialized views or dynamic tables? Can the Hoptimator concept (i.e. provision Flink and the other internal + external connectors to do what the user asked) be part of Flink eventually?
15.12.2024 21:08
Declarative Data Pipelines with Hoptimator
I finally understood what it is after reading this blog :)
www.linkedin.com/blog/enginee...
Developer experience / testing is one of the hard aspects of declarative pipelines. Hand-built pipelines, while they take more steps, are more predictable. Is there a snappy preview built in, to mitigate some of this?
15.12.2024 21:06
Great writeup!
Did not realize Flink CDC is ultimately just another Flink job, and that it can use any of the Debezium source connectors!
On Iceberg, I do see a Flink connector in the Iceberg project. What needs to happen to make Flink Iceberg work with Flink CDC out of the box (on par with Kafka)?
12.12.2024 15:46
Great write up!
Looking at the APIs, they only deal with the catalog aspect. Writing the manifest files etc. is still the responsibility of the client. Are writes in systems like the Rust client and Materialize difficult because of the lack of these APIs, or is writing the metadata hard as well?
04.12.2024 13:01
I don't understand it either. The documentation has missing parts. The example shows a Spark package, specifically for S3 Tables. It might be an implementation of the catalog. The APIs that back this implementation are not in the documentation yet.
03.12.2024 20:09
Paimon is already the second version, the first one being Flink Table Store. The Paimon documentation explicitly says not to use it without a cluster framework like Flink, unlike Iceberg and Delta, which are building kernel libraries. There is no mention of Paimon in Fluss, so it is likely not an evolution.
29.11.2024 19:47
Vortex: A Stream-oriented Storage Engine For Big Data Analytics
I think it plays into a similar space as the following
research.google/pubs/vortex-... storage API and background optimizations to iceberg
www.confluent.io/blog/introdu... - likely the backend of this solves some of the same problems.
29.11.2024 19:45
Did the post-commit tests run sequentially, or did they combine multiple PRs together? Were there issues with having to revert multiple commits due to a problem with one?
26.11.2024 17:32
I recently started reading about Iceberg. Would using something like SlateDB with a custom schema/convention for storing the catalog info work?
26.11.2024 06:48
Mostly posts about PostgreSQL, Snowflake Postgres, and PostgreSQL extensions.
Formerly Crunchy Data, Microsoft, Citus Data, AWS, TCD, VU
http://github.com/frankmcsherry/blog
A programming language empowering everyone to build reliable and efficient software.
Website: https://rust-lang.org/
Blog: https://blog.rust-lang.org/
Mastodon: https://social.rust-lang.org/@rust
Co-founder arroyo.dev, building next-gen streaming systems. Prev Splunk, Lyft, Sift, Quantcast.
search every byte. {vector, full-text} search engine built from first principles on object storage. 10x cheaper, scales to 100B. powers Notion, Cursor, Linear
The global home for open source software, powering some of the world's most ubiquitous software projects in web, big data, Java, IoT, cloud computing, and more. Learn more at https://apache.org.
The unofficial Apache Kafka Streams account. Long live the otter.
Serverless, databases, and serverless databases at AWS. Views my own.
Check out my blog: https://brooker.co.za/blog/
Committer & PMC member @ Apache Kafka
Software developer @ Responsive
Convinced otter 🦦
The Proceedings of the VLDB Endowment (PVLDB)
https://vldb.org/pvldb/
RSS Feed: https://db.cs.cmu.edu/files/rss/pvldb-rss.xml
Automated by @andypavlo.bsky.social
Building distributed systems and data infra.
Previously co-creator of Apache Flink (https://flink.apache.org/),
now building Restate (https://restate.dev/) to make distributed apps more easily resilient and scalable.
Systems engineer @turbopuffer.bsky.social. Former CTO @materialize.com.
CEO @ feldera.com, the incremental compute engine for AI, ML and data teams.
Formerly a systems researcher in distributed systems, databases, cloud, OS, PL, and networking. Sci-fi and gaming nerd.
lalith.in/research
maintainer of SlateDB
loves Rust, Datasys, Cloud Infra, AI
https://flaneur2020.github.io
breaking databases @tur.so W1 '21 @recursecenter.bsky.social
excited about databases, storage engines and message queues
ceo & cofounder of turbopuffer.com. prev infra @Shopify 🇩🇰->🇨🇦
Write at lethain.com. Author of An Elegant Puzzle, Staff Engineer, and An Engineering Executive's Primer. Worked some places.