@xevix.bsky.social

Software Developer interested in data, web, languages. Silicon Valley/Tokyo. https://medium.com/@xevix https://github.com/xevix

74 Followers  |  33 Following  |  122 Posts  |  Joined: 02.01.2024

Latest posts by xevix.bsky.social on Bluesky

How Nvidia GPUs Compare To Google’s And Amazon’s AI Chips
YouTube video by CNBC

Good non-technical summary of some of the various chips out there. www.youtube.com/watch?v=RBmO...

08.02.2026 18:11 — 👍 0    🔁 0    💬 0    📌 0

Yeah, that or the opposite, where it's biased toward a tool because of training data. I think you can suggest a tool in the global CLAUDE.md file, but it would be more seamless from a cowork plugin.

07.02.2026 19:44 — 👍 0    🔁 0    💬 0    📌 0

Claude cowork w/ Opus 4.6 is definitely smart, but got stuck on a data task, I stopped it, pointed it to DuckDB, done instantly. LLMs still have much to learn 🤔

07.02.2026 09:38 — 👍 0    🔁 0    💬 1    📌 0
The $200M Machine that Prints Microchips: The EUV Photolithography System
YouTube video by Branch Education

Great overview of chip manufacturing with EUV. We really do have supercomputers in our pockets, on our desk and on our wrists. www.youtube.com/watch?v=B248...

07.02.2026 01:42 — 👍 0    🔁 0    💬 0    📌 0

🎞️ The slide decks and talk recordings of last Friday's developer meeting are out! duckdb.org/events/2026/...

02.02.2026 16:08 — 👍 23    🔁 6    💬 0    📌 0

Great talks at South Bay Systems, hosted at Databricks, on xNVMe, fast SSD query processing, and using NPUs for DB work. Much of the work uses DuckDB extensions. The need for async I/O as a bottleneck was a common topic, mainly at larger scale. luma.com/8a54z94d?tk=...

23.01.2026 00:36 — 👍 1    🔁 0    💬 0    📌 0
NVIDIA CUDA-X Powers the New Sirius GPU Engine for DuckDB, Setting ClickBench Records | NVIDIA Technical Blog
Sirius, an open-source GPU native SQL engine, achieved a new performance record on Clickbench—a widely used analytics benchmark. Developed by University of Wisconsin-Madison with support from NVIDIA…

GPU-powered analytical query engines going mainstream? Needs an NVIDIA GPU and limited to memory for now, but neat use of Substrait+Arrow for interop. DuckDB is still easier to run anywhere, but this is useful for acceleration if needed.
developer.nvidia.com/blog/nvidia-...

31.12.2025 19:29 — 👍 2    🔁 0    💬 0    📌 1

How are the compile times?

18.11.2025 07:39 — 👍 0    🔁 0    💬 1    📌 0
KEYNOTE: Hannes Mühleisen - Data Architecture Turned Upside Down | PyData Amsterdam 2025
YouTube video by PyData

The PyData Amsterdam 2025 keynote “Minus Three Tier: Data Architecture Turned Upside Down” by @hannes.muehleisen.org is out now.

www.youtube.com/watch?v=DxwD...

31.10.2025 14:05 — 👍 25    🔁 4    💬 1    📌 1
SQL Arena Planner Ranking (November 2025)

New database leaderboard from Yellowbrick ranks the quality of DBMS optimizer estimates and plans. They only evaluate TPC-H for now and report results for Postgres + DuckDB + MSSQL: sql-arena.com/components/p...
Repo: github.com/sql-arena/db...
LinkedIn Group: www.linkedin.com/groups/15775...

03.11.2025 17:06 — 👍 14    🔁 3    💬 1    📌 0
[Future Data] Where We're Going, We Don't Need Rows: Columnar Data Connectivity with ADBC - Carnegie Mellon Database Group
ADBC (Arrow Database Connectivity) is Apache Arrow’s answer to ODBC and JDBC:...

Today's Future Data Systems Seminar Speaker: Ian Cook (@ian.columnar.tech) will present @columnar.tech's work on Apache Arrow's database connectivity API (ADBC). ADBC is available in modern DBMSs. Zoom talk open to public at 4:30pm ET. YouTube video available after: db.cs.cmu.edu/events/futur...

20.10.2025 11:38 — 👍 15    🔁 8    💬 0    📌 1
[Future Data] Vortex: LLVM for File Formats - Carnegie Mellon Database Group
Apache Parquet revolutionized columnar storage after its initial release in 2013, but...

Today's Future Data Systems Seminar Speaker: Will Manning (@willmanning.com) will present @spiraldb.com's Vortex file format. Vortex is now a @linuxfoundation.org project. Zoom talk open to public at 4:30pm ET. YouTube video available after: db.cs.cmu.edu/events/futur...

13.10.2025 11:10 — 👍 4    🔁 4    💬 0    📌 0
Benchmark Results for DuckDB v1.4 LTS
DuckDB v1.4 LTS is both fast and scalable. In in-memory mode, it is the fastest system on ClickBench. In disk-based mode, it can run complex analytical queries on a dataset equivalent to 100 TB CSV fi...

Processing 100 TB of CSV files on a single machine is insane, a little over 1 hr per query, even if on a powerful AWS instance. Question heavily the need for complex systems when this is what’s possible now. Can’t wait for the full write-up. Incredible work.

duckdb.org/2025/10/09/b...

10.10.2025 14:12 — 👍 8    🔁 1    💬 0    📌 0

The tradeoffs are interesting if the main goal is no operating cost and decent startup time. Definitely painful to develop on regularly, but for a one-and-done this makes a lot of sense. I wonder if Rust compile times will come down further one day.

04.10.2025 22:21 — 👍 1    🔁 0    💬 0    📌 0

Taking the DuckDB hoodie on a trip. Not exactly Amsterdam but I’ve heard they like columnar databases here too.

04.10.2025 12:06 — 👍 3    🔁 0    💬 0    📌 0
Push Hive filtering into Glob() by xevix · Pull Request #18518 · duckdb/duckdb
Summary Addresses part of #7620 for local filesystem. Part 1 of the work split off from the original PR #18430. The next part will handle fallback to eager loading in case of Hive issues. Push down...

I didn’t quite make it in time with the lazy-list Hive filtering, which speeds up filtering Hive folders with many partitions, but will pick it up again before the next release w/ luck 🙇‍♂️ github.com/duckdb/duckd...

16.09.2025 16:25 — 👍 2    🔁 0    💬 0    📌 0

Congrats to DuckDB team on LTS release w/ many great improvements! Hidden among them you can now use Hive filtering with read_blob, and SHOW TABLES FROM specific db w/o USE.

16.09.2025 16:25 — 👍 2    🔁 0    💬 1    📌 0

📈 DuckDB 1.4.0 is out! This is our first LTS release which comes with *one year of community support*. It also supports database encryption, the MERGE SQL statement and Iceberg writes.

For more details, read the announcement blog post at
duckdb.org/2025/09/16/a...

16.09.2025 11:55 — 👍 52    🔁 22    💬 0    📌 3
eBird in DuckDB
I saw this post by the Clickhouse team which was doing a cool test of the eBird dataset from Cornell University, and wondered how DuckDB…

I tried loading eBird data (1.5B rows CSV ZIP) using DuckDB for fun, inspired by a ClickHouse blog post and a bit of curiosity. Both did well: DuckDB was slightly faster at querying and Parquet ingest, while ClickHouse has native zip support and is optimized for ingest and multitenancy. xevix.medium.com/ebird-in-duc...

02.09.2025 01:15 — 👍 3    🔁 0    💬 0    📌 0
Thumbnail: Saving Private Hash Join

Vol:18 No:8 → Saving Private Hash Join
👥 Authors: Laurens Kuiper, Paul Gross, Peter Boncz, Hannes Mühleisen
📄 PDF: https://www.vldb.org/pvldb/vol18/p2748-kuiper.pdf

03.08.2025 06:00 — 👍 14    🔁 4    💬 0    📌 1
Data Tool Component Sharing
There are many partly overlapping tools in the data world, which is what inspired things like Calcite to have modular components for…

Is there too much duplicated effort in data tools? I sometimes wonder about this.

xevix.medium.com/data-tool-co...

29.08.2025 20:08 — 👍 0    🔁 0    💬 0    📌 0

Yeah, I don’t think MS is interested in 3rd party devs so much.

26.08.2025 16:45 — 👍 0    🔁 0    💬 0    📌 0

Unfortunately yes, I was already going to get one for something else and this put me over the edge. Maybe I'll also build a gaming rig one day in the distant future haha.

25.08.2025 22:12 — 👍 0    🔁 0    💬 1    📌 0

Compiling DuckDB on Windows 11 (ARM) using a UTM VM on macOS to debug Windows compile issues. It's a shame MSVC doesn't exist outside of Windows; MinGW/Clang don't work the same and cross-compiling is tricky. Compiling takes 5-10 mins (instead of 1-2 mins native), but it works 🎉!

25.08.2025 21:30 — 👍 3    🔁 0    💬 1    📌 0

Just the 1 day of data above is ~125GiB compressed, ~585GiB uncompressed. One month is about 3.75TiB compressed, or 17.5TiB uncompressed. It makes sense this dataset is so popular for testing and analysis, wow.

15.08.2025 03:19 — 👍 6    🔁 0    💬 0    📌 0

Stretching DuckDB w/ Common Crawl, ~1.7B rows, ~300 Parquet files. ~2-3s for single-column aggregations, ~2-3 mins to SUMMARIZE the data, peaking at ~12-14GB memory usage. Not exactly real-time, but the fact you can do this on a laptop with no server setups or Spark pipelines is still amazing.

15.08.2025 03:10 — 👍 44    🔁 9    💬 1    📌 1

Uploaded a simplified query. Had to delete and repost since no edit button on the snippets site, sorry for the spam haha.

13.08.2025 06:08 — 👍 1    🔁 0    💬 0    📌 0

Haha already posted in case someone benefits there too.

13.08.2025 05:13 — 👍 1    🔁 0    💬 1    📌 0

Neat little hack to get Hive partition list in DuckDB, useful for an overview. Might be neat to have built-in. gist.github.com/xevix/04f33d...

12.08.2025 20:14 — 👍 4    🔁 0    💬 1    📌 0

Automator using simple shell script to call sqlfluff. Added keyboard shortcut for the service too. Easier than making browser extensions for each browser, although unfortunately not cross-platform.

29.06.2025 19:46 — 👍 0    🔁 0    💬 0    📌 0