TEGNicholas.bsky.social

@tegnicholas.bsky.social

Open-Source Software for science at Earthmover.io, built on Pangeo.io. One of many xarray.dev core devs. https://tom-nicholas.com/ Previously dabbled in oceanography at [C]Worthy and Columbia Uni., originally did fusion plasma physics.

332 Followers  |  486 Following  |  72 Posts  |  Joined: 19.01.2024  |  2.5019

Latest posts by tegnicholas.bsky.social on Bluesky

Matt Yglesias Is Confidently Wrong About Everything
The Biden administration's favorite centrist pundit produces smug pseudo-analysis that cannot be considered serious thought. He ought to be permanently disregarded.

"[Matt Yglesias] can write whole essays claiming that fracking is good and we need fossil fuel friendly energy policies, dismissing progressives as childish, while never engaging with the scientific literature on the consequences of climate change"

03.09.2025 21:32 - 👍 363    🔁 69    💬 8    📌 7

This reads like the writings of climate denialists - constant strawmanning, shifting of goalposts, selection bias in research quoted, and conspiracy theorizing.

27.08.2025 00:09 - 👍 2    🔁 0    💬 1    📌 0
We Needed Better Cloud Storage for Python so We Built Obstore - Development Seed
Obstore solves the friction we kept hitting in cloud-native workflows.

New blog post on Obstore, fast, multi-provider cloud storage access for Python:

developmentseed.org/blog/2025-08...

04.08.2025 16:11 - 👍 3    🔁 2    💬 0    📌 0

Open source, open science for earth, climate and geospatial science? Coming to #AGU25? Build tools in #Python @jupyter.org?

Submit an abstract for this session and come meet us and like-minded scientists!

24.07.2025 21:54 - 👍 9    🔁 4    💬 0    📌 0
He helped Microsoft build AI to help the climate. Then Microsoft sold it to Big Oil.
A former Microsoft project manager reveals how the tech giant is using AI to help Big Oil drill, and how he and his partner are now pushing for change.

Will Alpine co-wrote Microsoft's manifesto on how AI will be a powerful force for good for climate change

In a new interview, Alpine disavows the manifesto, saying he believes Microsoft used his work to distract from the much larger climate harms the company enables through contracts with Big Oil

21.07.2025 17:24 - 👍 95    🔁 55    💬 1    📌 4

This is fucking insane. Closing these NOAA labs would obliterate our ability to observe, understand, and forecast the Earth System, from weather systems tomorrow to sea levels 50 years from now.

30.06.2025 16:59 - 👍 140    🔁 75    💬 0    📌 4

NASA is being told to cancel 19 *active* missions to save $6B, which looks to be less than the ICE *hiring/retention* budget going forward.

I need people to let that sentence sink into their bones for a minute.

29.06.2025 16:17 - 👍 4221    🔁 2128    💬 73    📌 90

I've been adding new accounts to the Open Source Geospatial starter pack. Who else wants on or off?
#gischat #geosky

go.bsky.app/PGYLmPG

28.10.2024 19:57 - 👍 70    🔁 22    💬 35    📌 5

Oh @jsignell.github.io too!

22.05.2025 08:55 - 👍 0    🔁 0    💬 1    📌 0

I would have suggested Max Jones, Aimee Barciauskas, or Lindsey Nield, but they don't seem to be on Bluesky, so you could instead add @jhamman.bsky.social, @rabernat.bsky.social, or myself.

22.05.2025 08:55 - 👍 0    🔁 0    💬 1    📌 0
Zarr takes Cloud-Native Geospatial by storm - Earthmover
Our takeaways from the Cloud-Native Geospatial conference on Zarr's surging adoption and its impact on the future of Earth Observation data. Our team just returned from an action-packed week at the Cl...

There should be some @zarr.dev geospatial representation on here.

Evidence:

earthmover.io/blog/zarr-ta...

22.05.2025 08:55 - 👍 1    🔁 1    💬 1    📌 0

It's outrageous that NASA GISS, one of the best earth & space science labs in the world, is being kicked out of its Columbia home. The outstanding scientists who work there can't say that publicly, but I can. And so can you --- call your reps, esp. (but not only) if you live in NYC or NY state.

21.05.2025 16:45 - 👍 62    🔁 36    💬 0    📌 0
Icechunk: Efficient storage of versioned array data - Earthmover
We recently got an interesting question in Icechunk's community Slack channel (thank you Iury Simoes-Sousa for motivating this post): I'm new to Icechunk. How is the storage managed for redundant info...

How does Icechunk avoid redundant storage between data versions?

Icechunk stores only new or changed chunks for each version, with no redundant copies or rewrites. You get instant time travel, branching, and efficient updates, all with negligible storage overhead.

More: bit.ly/3F1XFST

14.05.2025 16:09 - 👍 3    🔁 4    💬 0    📌 0

Excellent post by Brian Davis laying out why doing "Open Science" for data-driven workflows is almost impossible in practice, at least without much better data pipeline tools.

12.05.2025 14:10 - 👍 6    🔁 0    💬 0    📌 0

nice analogy 😉

26.04.2025 22:52 - 👍 1    🔁 0    💬 1    📌 0
White House Proposal Could Gut Climate Modeling the World Depends On
Potential funding cuts for NOAA and its research partners threaten irreparable harm not only to climate research but to American safety, competitiveness, and national security.

The proposed cuts to NOAA could have profound consequences not just for climate change, but for our national security and the entire economy. Here's what I learned: www.propublica.org/article/trum...

24.04.2025 18:06 - 👍 582    🔁 206    💬 7    📌 5

It's fun to work with real hardcore software engineers like @functionth.bsky.social who can teach you about database consistency and transactions and all that

Scientific data infrastructure should be built on solid foundations like this instead of on piles of janky code written by postdocs...

23.04.2025 16:29 - 👍 3    🔁 0    💬 0    📌 0

the fact that I've never once thought about making a range request, and yet make them constantly for extremely targeted data pulls, is absolutely an invisible technical miracle

19.04.2025 14:31 - 👍 4    🔁 1    💬 1    📌 0

Of course - I really wrote this article for my past self! I wish someone had explained this cloud science stuff to me earlier.

17.04.2025 18:50 - 👍 2    🔁 0    💬 0    📌 0

It's also important for understanding what problem VirtualiZarr solves.

I've given this explanation to many people in the past (including at @cworthy.bsky.social), so I hope that this article can serve as a useful reference the next time someone wonders what @zarr.dev actually is.

17.04.2025 17:50 - 👍 2    🔁 0    💬 0    📌 0

I wrote the article I wish I could have read when I first heard of Zarr and cloud-native science back in 2018.

This explains how object storage and conventional filesystems are different, and the key properties that make @zarr.dev work so well in cloud object storage.

17.04.2025 17:50 - 👍 4    🔁 0    💬 1    📌 0

5/ Almost all organisations working with scientific array data have this kind of data delivery issue, even if it's just internally.

Whilst the Flux integrations today are established geospatial standards, you also see similar patterns in other fields such as Neuroscience.

16.04.2025 13:41 - 👍 0    🔁 0    💬 0    📌 0

4/ Flux's architecture is auto-scaling, so once turned on there is no need to worry about how many users are hitting the data.

As it's not a stateful server like THREDDS, it won't catch fire under pressure.

This is what "Cloud-Native" architectures for scientific data look like.

16.04.2025 13:41 - 👍 0    🔁 0    💬 1    📌 0

3/ Your downstream scientists, GIS users, analysts, and external users can all now forget about file formats!

They just keep using the same GUI or tool or script that they prefer, and don't need any other services or copies of the data made bespoke for them - Flux does that on-demand!

16.04.2025 13:41 - 👍 0    🔁 0    💬 1    📌 0

2/ Flux bridges this chasm.

It sits in between your data and the consumers, springing up at a moment's notice to provide subsets of data however your users prefer it.

16.04.2025 13:41 - 👍 0    🔁 0    💬 1    📌 0

1/ Flux solves the impedance mismatch between geospatial data providers and consumers.

Providers want to manage data lakes stored in cloud-optimized formats like Zarr, but consumers want their applications to keep being fed data in ways they already understand.

16.04.2025 13:41 - 👍 3    🔁 0    💬 1    📌 0
White House outlines plan to gut NOAA, smother climate research
The agency's Office of Oceanic and Atmospheric Research would be "eliminated as a line office," according to a memo from the Office of Management and Budget.

Hard to overstate this plan's reach, which touches nearly every aspect of NOAA's work - dissolving its research arm, gutting climate science, diminishing sat observations, boosting fossil fuels. With amazing colleagues Daniel Cusick and @scottpwaldman.bsky.social:
www.politico.com/news/2025/04...

11.04.2025 18:38 - 👍 59    🔁 39    💬 1    📌 4
0$ Data Distribution
Ju Data Engineering Weekly - Ep 78

You could also do this for arbitrarily large scientific array datasets using Xarray + Icechunk + R2/Tigris

juhache.substack.com/p/0-data-dis...

10.04.2025 20:14 - 👍 0    🔁 1    💬 0    📌 0
Exploring Icechunk scalability: untangling S3's prefix story | Earthmover
We show Icechunk can scale to extremely high concurrency levels, and explain how it achieves this in modern object stores.

📣 Blog post alert! Exploring Icechunk scalability: untangling S3's prefix story. This technical post by @functionth.bsky.social dives deep into the internals of how S3 shards data, showing that distributed Icechunk can easily perform 230,000 object reads/sec and beyond. earthmover.io/blog/explori...

09.04.2025 15:27 - 👍 5    🔁 4    💬 2    📌 3

Several times some database comp sci nerd has suggested to me that you could just do everything in array land using tabular database tools. Whilst they are technically correct that you _could_, this article convincingly shows why you _should not_ - that would be horribly inefficient.

03.04.2025 17:48 - 👍 1    🔁 0    💬 1    📌 0