I wonder if anyone has done a cost analysis of Python code running in the wild compared to a language that actually is performant. I suspect companies are needlessly spending millions because of lazy developers.
17.10.2025 17:54
Not me, a different Bruce Ritchie
01.10.2025 17:45
F3: The Open-Source Data File Format for the Future
SIGMOD 2025
Our SIGMOD paper with our friends at Tsinghua + @wesmckinney.com + @pateljm.bsky.social on creating a next-generation open-source data file format is out. F3 is a future-proof file format that avoids the mistakes of Parquet.
Paper: db.cs.cmu.edu/papers/2025/...
Code: github.com/future-file-...
01.10.2025 13:49
ashtom.github.io/developers-r... ... so much absurdity in this it's crazy. Never trust a damn thing from someone whose job depends on selling you something.
06.08.2025 18:38
Apache DataFusion 49.0.0 Released - Apache DataFusion Blog
@apachedatafusion.bsky.social 49.0.0 released. Async UDFs, Parquet modular encryption, WITHIN GROUP support, dynamic filters, TopK pushdown, and much more ... datafusion.apache.org/blog/2025/07...
29.07.2025 21:04
Medium has turned into a wasteland of AI generated or AI augmented posts. I'd say less than 25% of the daily digest highlights are actual 'real' articles. Sad.
10.07.2025 14:01
A 200 OK response from S3 ... isn't always OK. Way to go AWS for making your service horrendous to support. repost.aws/knowledge-ce...
30.05.2025 19:56
I am unsure whether Google Summer of Code is a benefit or a hindrance to an open source project. Time will tell, I suppose, by the PRs submitted.
08.04.2025 16:48
It's been well over a year since I started the process of rewriting a large and very long-running job from Apache Spark/Scala to Apache DataFusion/Rust. We're now well into doing PoCs to rewrite a few other expensive jobs the same way. It's a very nice feeling.
03.04.2025 21:08
This one was going around the office today and made me chuckle :)
28.03.2025 13:30
Not by me, I'm not in Florida, nor in the US.
27.02.2025 15:53
Coworker in chat: "... cluster is rebalancing and I'm trying to get the jello to stop shaking". Best explanation of rebalancing I've heard in a long time 🤣
27.02.2025 14:43
Thank you Doug Ford for the $200 vote bribe. I'll use it to contribute to another party and vote to get your kind out of office.
09.02.2025 14:20
Had a good chuckle this morning. Gemini was enabled on company corporate accounts and lasted all of 2 days before it was disabled.
31.01.2025 13:58
The latest paper from the #1 CMU-DB PhD student @samarchdb.bsky.social is wild compilation magic! He automatically makes UDFs run 300x faster on SQL Server and 1.3x faster on DuckDB.
Code: github.com/SamArch27/PR...
Paper: www.vldb.org/pvldb/vol18/...
06.12.2024 14:56
Working in Rust for the last year has really made me aware of just how useful some features in other languages really are.
- Variadic functions
- Default values for arguments
- Named arguments
- Enum variants as types
Rust is getting if-let chains in the 2024 edition though, so that is something.
26.11.2024 15:46
Lately there are two things I've been wishing that #Rust had: variadic functions and enum variants as types. Using a builder or macro to work around the first is just that, a workaround. Having the second would make some things much nicer.
19.11.2024 20:32
64GB of RAM is not enough anymore.
17.11.2024 17:45
DataFusion v43 has seen a lot of performance work, especially around reading Parquet, and the numbers are very nice! From the ClickBench benchmark on the same hardware type:
15.11.2024 16:17
Software Engineering nerd. UMich PhD Physicist. US Navy veteran. Former Director, Camp Quest Michigan.
Chaotic neutral in thoughts, lawful good in actions.
Will run for pizza.
Unofficial bot posting what's new on AWS, from https://aws.amazon.com/new.
Co-Founder @ versd.co / Software Engineer @ Polygon.io
Apache {DataFusion PMC}, Database Internals
This page is for educational purposes only. DM for credit removal.
Kunta, Geordi, Reading Rainbow guy. Flies twice as high… #bydhttmwfi
Cofounder & CTO InfluxData, makers of InfluxDB, the open source time series database. Founder of NYC Machine Learning Meetup. Former Ruby on Rails developer and enthusiast (still a fan).
Writer of books (http://themissingreadme.com), code (http://slatedb.io), checks (http://materializedview.capital), and newsletters (http://materializedview.io)
Apache Arrow & DataFusion PMC Member. Original creator of Apache DataFusion.
software engineer at ClickHouse; prev: Figma, WePay
long-form: https://expertofobsolescence.substack.com
Principal Engineer, Founder, Angel, Advisor, OSS.
LFAI&data: OpenLineage, Marquez, ASF: Parquet, Arrow, Iceberg
he/him.
Me: https://julien.ledem.net/
Blog: https://sympathetic.ink
Founder of SaaS Developer Community and Nile Database.
Making streams serverless @s2.dev
Forward Deployed Angel Investor
https://github.com/jwills
Software engineer. Data infra guy. Apache Committer/PMC on various projects. WebAssembly enthusiast. Also cinema, street photography and sarcasm...
Associate Professor at @cst.cam.ac.uk, researching decentralised systems and security protocols. Advisor to the Bluesky team. Wrote “Designing Data-Intensive Applications” (O’Reilly). he/him
Co-founder and CTO of Oxide Computer Company. According to Field of Schemes, "tech exec and Oakland A's fan" -- but more of an Oakland Ballers fan now.
I build jamsocket.com and write digest.browsertech.com
I like databases and boats. Co-creator of @duckdb.org, Co-Founder and CEO DuckDB Labs. Professor of Data Engineering at Radboud Universiteit.
Code | Test | Deploy | Get paged
Data streaming, Machine Learning and AI app intensive Systems