@lidavidm.bsky.social
PMC member for Apache Arrow.
So I wonder if the term was already floating around the database community, and Julien (or someone else) (unintentionally?) swapped "striping" for "shredding" in the Parquet docs, and then the term took hold as Parquet became popular.
18.05.2025 15:26
The docs for SQL Server 2000 here talk about shredding:
> OPENXML calls can be used to provide rowset view...and process them, for example, inserting them into different tables (this process is also referred to as "Shredding XML into tables")
www.microsoft.com/en-nz/downlo...
This page, supposedly from 2003, talks about SQL Server 2000 adding a function to "shred" XML
> Microsoft SQL Server 2000 also provides the OPENXML function to shred an XML document and provide a rowset representation of the XML data.
web.archive.org/web/20120115...
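For anyone unfamiliar with the term, here is a rough sketch of what "shredding" XML into rowsets means. This is not OPENXML or anything from SQL Server, just an illustration in Python using the standard library; the sample document, the element and attribute names, and the `shred_orders` helper are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample document; names are illustrative only.
XML_DOC = """
<orders>
  <order id="1" customer="alice">
    <item sku="A-1" qty="2"/>
    <item sku="B-7" qty="1"/>
  </order>
  <order id="2" customer="bob">
    <item sku="A-1" qty="5"/>
  </order>
</orders>
"""


def shred_orders(xml_text):
    """Flatten nested <order>/<item> elements into two lists of row tuples."""
    root = ET.fromstring(xml_text.strip())
    order_rows = []  # one row per <order>: (order_id, customer)
    item_rows = []   # one row per <item>: (order_id, sku, qty)
    for order in root.findall("order"):
        order_id = int(order.get("id"))
        order_rows.append((order_id, order.get("customer")))
        for item in order.findall("item"):
            # The parent's id is copied onto each child row, like a foreign key.
            item_rows.append((order_id, item.get("sku"), int(item.get("qty"))))
    return order_rows, item_rows


order_rows, item_rows = shred_orders(XML_DOC)
print(order_rows)
print(item_rows)
```

The nested document ends up as two flat "tables" linked by the order id, which is roughly the sense in which the SQL Server docs use the word.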
It seems the first mention in the Parquet repos is from 2013, though. There Julien Le Dem links to a page about "striping" (as used in the Dremel paper) but calls it "shredding". So maybe you should ask him directly :)
github.com/apache/parqu...
I went down a rabbit hole on "record shredding"... Here's something interesting: there's an SO question from 2008 asking about "shredding XML data into relational tables". Maybe the term sort of already existed? stackoverflow.com/questions/61...
18.05.2025 15:18
I will not trust macOS with external drives ever again. "First Aid" seems to have deleted directories from my local backup SSD... thankfully I have another backup in Backblaze but it's a bit older :/
Good (and painful) reminder to back up regularly!
Dear @yatosaking.bsky.social and @galetteweb.bsky.social,
Pardon the intrusion.
Thank you so much for the shikishi; I am truly grateful.
(I especially love how sparkly it is.)
I will keep cheering for sensei and Galette!
Japan, huh... even when I bring my own bag to the supermarket, the clerk still carefully puts each purchase in a plastic bag
19.04.2025 08:46
There was a TimeLeft app meetup today. My conversation skills(?) aren't great, but I think it was fun.
I need to keep working at it.
I was planning to compare against Postgres for a PR review, but... oof, this is rough 😱
15.04.2025 05:53
Baby blue eyes (nemophila) flowers in Showa Kinen Park
The favorite flower of Himmel the Hero.
Showa Kinen Park, Tachikawa, Tokyo
Olympus E-M10 Mk2/TTArtisans 35mm f/1.4
Spotted in Discord (and paraphrased to protect the innocent):
"I don't particularly like actually doing the job, but thinking about it? Hoo boy"
(Someday I'll make that teaching implementation of Arrow...)
The sakura are not _quite_ there, but almost!
26.03.2025 05:17
Kou has a blog post too: x.com/ktou/status/...
21.03.2025 06:41
April is a great time to visit :)
21.03.2025 06:36
Anyone want to book a last-minute trip to Japan? Kou, Rok, and I will be there :)
red-data-tools.connpass.com/event/349680/
I'm biased but maybe things like Apache Parquet, Apache Arrow? They have multiple implementations across different languages and Arrow gets used as a means of interchange between different data vendors (Spark, BigQuery, ClickHouse <-> Pandas, polars, etc.)
19.03.2025 00:00
Check out what is new in the Apache Arrow ADBC 17 libraries release: arrow.apache.org/blog/2025/03...
07.03.2025 11:12
Data wants to be free: comparing and explaining how Arrow's data serialization can be better than what's in protocols like PostgreSQL's
arrow.apache.org/blog/2025/02...
#apachearrow #arrow
2025 is shaping up to be a breakout year for fast query result transfer with Apache Arrow. But what exactly makes it so fast? David Li, Matt Topol, and I break it down in this new blog post: arrow.apache.org/blog/2025/01...
13.01.2025 16:25