Sebastian Galkin

@functionth.bsky.social

20 Followers  |  34 Following  |  6 Posts  |  Joined: 23.11.2024  |  1.7038

Latest posts by functionth.bsky.social on Bluesky

So happy with this milestone. Lots of work went into this one!

10.07.2025 19:18 - 👍 1    🔁 0    💬 0    📌 0
Fundamentals: What Is Zarr? A Cloud-Native Format for Tensor Data - Earthmover What Zarr is, and how it enables fast, scalable access to multidimensional array data in the cloud.

Our latest fundamentals blog post provides an overview of @zarr.dev and its open-source ecosystem. Read more: earthmover.io/blog/what-is...

20.05.2025 15:06 - 👍 10    🔁 3    💬 0    📌 0
Icechunk: Efficient storage of versioned array data - Earthmover We recently got an interesting question in Icechunk's community Slack channel (thank you Iury Simoes-Sousa for motivating this post): I'm new to Icechunk. How is the storage managed for redundant info...

How does Icechunk avoid redundant storage between data versions?

Icechunk stores only new or changed chunks for each version, with no redundant copies or rewrites. You get instant time travel, branching, and efficient updates, all with negligible storage overhead.

More: bit.ly/3F1XFST

14.05.2025 16:09 - 👍 3    🔁 4    💬 0    📌 0
TensorOps: Scientific Data Doesn't Have to Hurt - Earthmover Curious how your team scores on the "Data Pain Survey"? Wondering why your teams are building Rube Goldberg machines just to put some data on a map? Or just want to see our plan to bring order to your...

Our latest blog post dives into the chaos of the status quo, where every tweak means regenerating the whole dataset, and collaboration and experimentation are often stifled by silos and secret knowledge. Check out the full post: earthmover.io/blog/tensoro...

12.05.2025 13:09 - 👍 3    🔁 3    💬 0    📌 1

After months of Rust, I wrote some Python this weekend. I immediately got burned by global mutable state

05.05.2025 03:39 - 👍 7    🔁 0    💬 0    📌 0
CISL Seminar: Deepak Cherian (Earthmover)
YouTube video by NCAR Computational and Information Systems Laboratory (CISL)

Last week @deepakcherian.bsky.social gave a fascinating talk at NCAR on data sharing and open data: the historical perspective, the achievements and failures past and present, and how to learn and move forward to fulfill the promises. Remarkable and illuminating. www.youtube.com/watch?v=JZT3...

29.04.2025 20:01 - 👍 1    🔁 0    💬 0    📌 0

Had the idea of using Icechunk (a multi-dimensional array database) for something I would never use Icechunk for

23.04.2025 15:37 - 👍 0    🔁 0    💬 0    📌 0
Fundamentals: What is Cloud-Optimized Scientific Data? What cloud-optimized data really means, and how Zarr and Icechunk enable fast access to massive scientific datasets in cloud object storage.

1/ 💡 Our latest blog post in the fundamentals series, written by @tegnicholas.bsky.social, demystifies cloud-optimized scientific data formats!

Read more: earthmover.io/blog/fundame...

17.04.2025 17:21 - 👍 16    🔁 9    💬 2    📌 3
0$ Data Distribution Ju Data Engineering Weekly - Ep 78

You could also do this for arbitrarily large scientific array datasets using Xarray + Icechunk + R2/Tigris

juhache.substack.com/p/0-data-dis...

10.04.2025 20:14 - 👍 0    🔁 1    💬 0    📌 0

230k reads/sec or much more. The S3ky is the limit!

09.04.2025 15:35 - 👍 1    🔁 0    💬 0    📌 0
Exploring Icechunk scalability: untangling S3's prefix story | Earthmover We show Icechunk can scale to extremely high concurrency levels, and explain how it achieves this in modern object stores.

📣 Blog post alert! Exploring Icechunk scalability: untangling S3's prefix story. This technical post by @functionth.bsky.social dives deep into the internals of how S3 shards data, showing that distributed Icechunk can easily perform 230,000 object reads/sec and beyond. earthmover.io/blog/explori...

09.04.2025 15:27 - 👍 5    🔁 4    💬 2    📌 3

We often see folks try to convince tabular data tools to perform well with multi-dimensional array data. This post by @rabernat.bsky.social explains, from first principles, why this rarely works. It's a good one! 👇👇👇

03.04.2025 21:10 - 👍 3    🔁 1    💬 1    📌 0

I've worked on Icechunk almost exclusively for the last six months. I'm very proud of the result; you should check it out.

28.03.2025 14:19 - 👍 3    🔁 0    💬 0    📌 0
Accelerating Xarray with Zarr-Python 3 | Earthmover We have recently dramatically improved the performance of Xarray's Zarr backend. This post explores how we've improved the "time to first byte" metric, building on Zarr-Python's new asyncio internals.

1/ Check out our latest blog post earthmover.io/blog/xarray-... to learn about the dramatic improvement in the performance of Xarray's Zarr backend. We improved the "time to first byte" metric, building on Zarr-Python's new asyncio internals.

20.02.2025 15:33 - 👍 4    🔁 3    💬 1    📌 4

@functionth is following 20 prominent accounts