VLDB 2026 | Sponsorship
The VLDB 2026 conference will take place in Boston, MA, United States, from Aug 31st to Sep 4th, 2026, and will feature research talks, tutorials, demonstrations, and workshops...
In 2026, VLDB is returning to the Boston area 5 decades after it was born here (the first VLDB was in Framingham). A good opportunity to get your company's name on the program (and earn the everlasting gratitude of the organizing committee): vldb.org/2026/sponsor...
06.03.2026 11:03
Yeah, shredding is a very clever optimization
27.02.2026 14:57
Here is a new blog post about Parquet Variant, including use cases and shredding examples
parquet.apache.org/blog/2026/02...
27.02.2026 14:21
A question came up on the Parquet sync today: does anyone have practical experience comparing FastLanes encoding vs "classic" bit packing (without the unified shuffled layouts)? If you do, I would love to hear about your experience.
25.02.2026 19:17
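For readers unfamiliar with the "classic" bit packing mentioned in the post above, here is a minimal Python sketch. It assumes a single fixed bit width and least-significant-bit-first packing; the function names are hypothetical, and it does not reproduce the exact Parquet encoder or the FastLanes layout.

```python
def pack_bits(values, bit_width):
    """Pack non-negative integers into bytes, bit_width bits each, LSB first."""
    buffer = 0          # pending bits not yet flushed to the output
    bits_in_buffer = 0  # number of valid bits currently in buffer
    out = bytearray()
    for v in values:
        if v < 0 or v >= (1 << bit_width):
            raise ValueError(f"{v} does not fit in {bit_width} bits")
        buffer |= v << bits_in_buffer
        bits_in_buffer += bit_width
        # Flush every complete byte.
        while bits_in_buffer >= 8:
            out.append(buffer & 0xFF)
            buffer >>= 8
            bits_in_buffer -= 8
    if bits_in_buffer:
        out.append(buffer & 0xFF)  # final partial byte, zero padded
    return bytes(out)


def unpack_bits(data, bit_width, count):
    """Inverse of pack_bits: read count values of bit_width bits each."""
    buffer = 0
    bits_in_buffer = 0
    mask = (1 << bit_width) - 1
    source = iter(data)
    out = []
    for _ in range(count):
        # Pull in bytes until a whole value is available.
        while bits_in_buffer < bit_width:
            buffer |= next(source) << bits_in_buffer
            bits_in_buffer += 8
        out.append(buffer & mask)
        buffer >>= bit_width
        bits_in_buffer -= bit_width
    return out
```

Eight 3-bit values occupy 24 bits, so `pack_bits([1, 2, 3, 4, 5, 6, 7, 0], 3)` yields 3 bytes instead of the 8 or more that byte-aligned storage would need.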
I suggest getting comfortable with rm -rf every few days -- it works wonders for me :)
25.02.2026 19:17
parquet-linter: A better Parquet is Parquet itself - Xiangpeng's blog
Unleash the performance potential of your Parquet files
Simply applying basic linting rules (like don't compress pages where it doesn't help) reduces Parquet file sizes by 5% and decreases decode time by 20%.
@xiangpeng.systems shows how in his latest blog
blog.xiangpeng.systems/posts/parque...
23.02.2026 15:26
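The "don't compress pages where it doesn't help" rule from the post above can be illustrated with a small Python sketch. This is only an assumption about the general idea, not the actual parquet-linter rule: the function name, the use of zlib, and the 10% savings threshold are all hypothetical.

```python
import zlib


def maybe_compress(page: bytes, min_savings: float = 0.10):
    """Keep the compressed form only if it saves at least min_savings.

    Returns a (codec_name, payload) pair. The threshold is a hypothetical
    choice for illustration, not a value taken from any real tool.
    """
    compressed = zlib.compress(page)
    if len(compressed) <= len(page) * (1.0 - min_savings):
        return "zlib", compressed
    # Compression did not pay for itself: store the page as-is so readers
    # skip the decompression step entirely.
    return "uncompressed", page
```

Highly repetitive pages come back compressed, while already high-entropy pages (for example, random bytes) come back unchanged, which avoids paying decode cost for no size benefit.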
Native Geospatial Types in Apache Parquet
Great inaugural post about the geospatial types on the Parquet blog.
Thank you Jia Yu, Dewey Dunnington, Kristin Cowalcijk, Feng Zhang.
More posts coming!
parquet.apache.org/blog/2026/02...
14.02.2026 00:36
Apache Parquet recently added native support for Geospatial. This post explains what that means and why it is important: parquet.apache.org/blog/2026/02...
13.02.2026 13:56
You can use Apache Parquet for Vector Search with embedded indexes:
> We don't change the file format; we just tune it.
Xiangpeng Hao explains how in blog.xiangpeng.systems/posts/vector...
10.02.2026 12:17
The Quest for One Million IOPS: Benchmarking Storage at LanceDB
Learn how LanceDB benchmarks storage and how we achieved one million disk reads per second.
Different techniques are needed to max out modern NVMe SSDs.
The LanceDB blog from @westonpace.bsky.social is so good if you want the industrial version: lancedb.com/blog/one-mil...
Viktor Leis's LeanStore paper is great if you want the academic version: vldb.org/pvldb/vol16/...
07.02.2026 11:59
A somewhat academic talk about the AI use cases driving changes in Apache Parquet and new formats: "Column Storage for the AI Era"
Recording: youtu.be/k9uhw7yqPsQ
Slides: docs.google.com/presentation...
03.02.2026 19:35
What I really need is to focus more on reviews / getting stuff merged as now the coding is even easier
02.02.2026 14:12
Optimized implementation of SQL CASE expressions in column stores requires careful engineering. The latest Apache DataFusion blog from Pepijn Van Eeckhoudt and Raz Luvaton explains how it works
datafusion.apache.org/blog/2026/02...
02.02.2026 14:08
One downside of tools like Codex is that they enable even more "side quests" -- I was already pretty bad at focusing, and now the ability to write the equivalent of a ticket and have some code to review in 10 minutes makes the problem far worse.
30.01.2026 11:29
DataFusion 52 Release Blog is Published datafusion.apache.org/blog/2026/01...
28.01.2026 20:17
I love it when I see a whole pile of commits I didn't review go to DataFusion main
github.com/apache/dataf...
27.01.2026 21:55
Designing a Table Format for ML Workloads
Explore designing a table format for ML workloads with practical insights and expert guidance from the LanceDB team.
I have been working on a talk about the future of table formats, specifically what is needed for AI workloads, and I found Weston's blogs on LanceDB well written and super helpful: lancedb.com/blog/designi...
26.01.2026 11:51
Meet the Speakers - TokioConf 2026
Discover the speakers behind TokioConf 2026. From core maintainers to community leaders, our lineup shares real-world experience and insights.
We're excited to share the complete list of speakers joining us at TokioConf 2026 covering performance tricks, architecture patterns, and more.
See all our speakers: www.tokioconf.com/speakers
(Schedule coming soon)
Tickets are on sale: www.eventbrite.com/e/tokioconf-...
09.01.2026 22:11
DataFusion blog from Geoffrey Claude explains how to extend DataFusion to support:
-- Postgres style operators
SELECT payload->'user'->>'id'
FROM logs;
-- Statistical sampling
SELECT * FROM sensor_data
TABLESAMPLE BERNOULLI(10 PERCENT);
datafusion.apache.org/blog/2026/01...
14.01.2026 21:04
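As a rough illustration of what `TABLESAMPLE BERNOULLI(10 PERCENT)` in the post above means semantically, here is a minimal Python sketch. It assumes the usual Bernoulli sampling definition (each row is kept independently with the given probability); the function name and `seed` parameter are hypothetical, and this is not how DataFusion or the linked blog implements it.

```python
import random


def bernoulli_sample(rows, percent, seed=None):
    """Keep each row independently with probability percent / 100."""
    rng = random.Random(seed)  # seed is only here to make runs repeatable
    probability = percent / 100.0
    # rng.random() is in [0.0, 1.0), so 100 percent keeps every row
    # and 0 percent keeps none.
    return [row for row in rows if rng.random() < probability]
```

Because every row is an independent coin flip, the result size varies from run to run: sampling 10 percent of 1,000 rows returns roughly, not exactly, 100 rows.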
Are You Sure You Want to Use MMAP in Your Database Management System?
MMAP Databases = 💩
Since you don't seem to want to cite your own work, I will do it for you: db.cs.cmu.edu/mmap-cidr2022/
12.01.2026 13:29
Stoked to be attending North East Database Day on Friday. It is a great mini conference and highlights some of the great research work going on in this area nedbday.github.io/2026/
12.01.2026 13:28
Great paper about pruning from Snowflake: arxiv.org/pdf/2504.11540
The LIMIT pruning they describe is 🤯 (so clever once you get it)
We have implemented almost all of the techniques in Apache DataFusion, FWIW
08.01.2026 16:17
Come meet fellow Apache DataFusion users and committers at the Stockholm meetup on March 5: luma.com/ctqtiqap
07.01.2026 21:05
Latest Apache DataFusion blog: more efficient plans and how to efficiently contribute: datafusion.apache.org/blog/output/...
20.12.2025 12:37
Qiwei Huang explains how we use Late Materialization (LM) in the Apache Rust Parquet reader to accelerate filtering. LM can describe several techniques, but this is a core one (also applies to joins, Top-K, etc)
arrow.apache.org/blog/2025/12...
12.12.2025 11:40
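The general idea behind late materialization for filtering, as mentioned in the post above, can be sketched in a few lines of Python. This is a toy model under stated assumptions: columns are plain in-memory lists and the function name is hypothetical, whereas the real Parquet reader works on encoded pages and row selections.

```python
def scan_with_late_materialization(table, filter_col, predicate, output_cols):
    """Evaluate the predicate first, then materialize only matching rows.

    table maps column name -> list of values (a stand-in for encoded
    column data; this layout is an assumption for illustration).
    """
    # Step 1: touch only the filter column and record matching positions.
    selection = [i for i, value in enumerate(table[filter_col]) if predicate(value)]
    # Step 2: materialize the output columns only at the selected positions,
    # so values in rows that fail the filter are never produced.
    return {name: [table[name][i] for i in selection] for name in output_cols}
```

For example, with `table = {"id": [1, 2, 3, 4], "payload": ["a", "b", "c", "d"]}` and a predicate keeping even ids, only the `payload` values at positions 1 and 3 are ever read.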