
Dmitriy Ryaboy

@squarecog.bsky.social

Works with data, runs with swords.

2,806 Followers  |  230 Following  |  172 Posts  |  Joined: 26.10.2024  |  1.8909

Latest posts by squarecog.bsky.social on Bluesky

Right, it's all about the ecosystem. Writers are always going to be more conservative than readers, rightfully so. This f3 idea is essentially about letting writers adopt new stuff without worrying too much about older gen readers (once older gen can read this sort of thing, another decade later).

08.10.2025 21:42 | 👍 2 | 🔁 0 | 💬 0 | 📌 0

bsky.app/profile/andy...

08.10.2025 16:15 | 👍 0 | 🔁 1 | 💬 1 | 📌 0

But anyway the point is not whether rle is useful, but if there is a world where parquet format improvements introduced since like 2018 get adopted, and more useful encodings can be propagated.

08.10.2025 15:42 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

RLE+delta allows filter pushdowns to work without decompressing. If you have repeated strings and sort, dict encode, and RLE+delta, even regex searches become blazing fast. Parquet enables this, but who implements it?

08.10.2025 15:40 | 👍 0 | 🔁 0 | 💬 1 | 📌 0
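
To make the pushdown point concrete, here is a rough Python sketch of a sorted, dictionary-encoded, run-length-encoded string column, with a regex filter evaluated once per distinct value instead of once per row. The helper names (`dict_rle_encode`, `count_matches`) and the sample column are made up for illustration, delta encoding is left out, and this is not Parquet's actual on-disk layout.

```python
import re
from itertools import groupby


def dict_rle_encode(values):
    """Dictionary-encode a column, then run-length encode the codes."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    # One (code, run_length) pair per run of identical adjacent values.
    runs = [(code_of[value], len(list(group))) for value, group in groupby(values)]
    return dictionary, runs


def count_matches(dictionary, runs, pattern):
    """Count matching rows without expanding the runs back into rows."""
    regex = re.compile(pattern)
    # The regex runs once per distinct value, not once per row.
    hits = {code for code, value in enumerate(dictionary) if regex.search(value)}
    return sum(length for code, length in runs if code in hits)


# Hypothetical column; sorting first makes the runs as long as possible.
column = sorted(["error: disk", "ok", "ok", "error: net", "ok", "error: disk"])
dictionary, runs = dict_rle_encode(column)
matching_rows = count_matches(dictionary, runs, r"^error")
```

With six rows and three distinct values, the regex is applied three times and the row count comes from summing run lengths.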

To be fair, you would not require it. An implementor would only do this if they want to future proof, and are ok with the whole executable data file thing. Otherwise, same as now: implement the reader for every encoding.
It's painful how little even basic RLE is being used in the wild :(

07.10.2025 20:32 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

I had the same 2 thoughts in the same sequence :)

01.10.2025 23:52 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

Do you think there's anything blocking parquet from adopting the same wasm reader approach to unlock new encodings and other schemes?

01.10.2025 23:24 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

This is a pretty intriguing idea for future proofing file formats.
It does assume wasm is future proof, of course, but that feels like a safer bet than "assume readers are updated"

01.10.2025 15:23 | 👍 5 | 🔁 1 | 💬 0 | 📌 0

If you love this sort of thing, read up on C-store, which introduced this idea in 2005 and commercialized it in Vertica. Stonebraker, Sam Madden, Daniel Abadi.
Parquet was also partially inspired by Vertica (and Google's Dremel, and PaX by Natassa Ailamaki et al) :-).

30.09.2025 15:04 | 👍 4 | 🔁 1 | 💬 0 | 📌 0

The original Yo app.

26.09.2025 11:01 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

ML is just applied stats.
Stats is just applied algebra.
LLM is just ML backward and with an extra L.

22.09.2025 22:50 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

The obvious reaction here is to shift at least some of the hiring out of the country to get access to the talent. The obvious counter reaction is to tax payments and wages to foreign employees and contractors. Which will also provoke a reaction. And none of this makes the US stronger or smarter.

20.09.2025 21:17 | 👍 4 | 🔁 0 | 💬 0 | 📌 0

About a decade late with this, but:
Someone should have started a social media ad agency called Twaddle.

20.09.2025 18:25 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Ask Ketan, I've been trying to find a good excuse to get my teams to use Flyte for half a decade now 😆

20.09.2025 18:24 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

It's tempting to take shortcuts that give you speed today by mortgaging speed tomorrow.

Trouble is, today is yesterday's tomorrow.

18.09.2025 23:07 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

Thanks for the reference, hadn't seen that!
Are these all one-shotting or doing an agentic workflow to explore before formulating final answer?

17.09.2025 15:07 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

I tried 2 different English-to-insights SQL LLM agents from reputable vendors in the past week. Data analyst jobs remain safe.
Firmly in the toy category for now.

17.09.2025 00:25 | 👍 13 | 🔁 1 | 💬 1 | 📌 0

Happened to be by the Cloudera building in the south bay earlier. Checked LinkedIn and discovered I have literally 0 1st degree connections who work there now. Not unexpected, I guess, but, man... between hnwx and cldr I used to know like 100s of folks there

16.09.2025 05:14 | 👍 5 | 🔁 0 | 💬 1 | 📌 0
OpenConnect: LinkedIn's Next-Generation AI Pipeline Ecosystem

Heck of a vote of confidence from LinkedIn for Flyte: www.linkedin.com/blog/enginee...

15.09.2025 03:02 | 👍 4 | 🔁 1 | 💬 1 | 📌 0

Also, everyone trips at least once on average of ratios vs ratio of sums (which becomes obvious once you describe them as unweighted vs weighted means).

11.09.2025 20:52 | 👍 1 | 🔁 0 | 💬 0 | 📌 0
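
A small Python sketch of the average-of-ratios vs ratio-of-sums trap, using made-up (conversions, visits) pairs. The function names and numbers are hypothetical; the point is only that the ratio of sums equals the mean of per-segment ratios weighted by each segment's denominator.

```python
def average_of_ratios(pairs):
    """Unweighted mean: every segment counts the same, whatever its size."""
    return sum(n / d for n, d in pairs) / len(pairs)


def ratio_of_sums(pairs):
    """Total numerator over total denominator."""
    return sum(n for n, _ in pairs) / sum(d for _, d in pairs)


def weighted_mean_of_ratios(pairs):
    """Mean of per-segment ratios, weighted by each segment's denominator."""
    total = sum(d for _, d in pairs)
    return sum((n / d) * (d / total) for n, d in pairs)


# Hypothetical segments as (conversions, visits): one tiny, one large.
segments = [(1, 2), (10, 100)]

unweighted = average_of_ratios(segments)      # (0.5 + 0.1) / 2 = 0.3
pooled = ratio_of_sums(segments)              # 11 / 102, roughly 0.108
weighted = weighted_mean_of_ratios(segments)  # same as pooled, up to rounding
```

The tiny segment drags the unweighted mean to 0.3, while the pooled number stays near 0.108 because the large segment dominates the weights.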

It shows up in different ways in different places. The most basic being, you don't know if the ratio moved cause numerator went up or denominator went down. Correct course of action is often different depending on which!

11.09.2025 20:47 | 👍 1 | 🔁 0 | 💬 1 | 📌 0
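
As a quick illustration with invented numbers, the Python sketch below shows two scenarios that produce the same ratio from the same baseline, one by raising the numerator and one by shrinking the denominator. `describe_move` is a hypothetical helper, not part of any library.

```python
def describe_move(before, after):
    """Report the ratio before and after, plus the raw component changes."""
    (n0, d0), (n1, d1) = before, after
    return {
        "ratio_before": n0 / d0,
        "ratio_after": n1 / d1,
        "numerator_change": n1 - n0,
        "denominator_change": d1 - d0,
    }


# Hypothetical (conversions, visits) snapshots.
baseline = (40, 1000)
more_conversions = (50, 1000)  # numerator went up
fewer_visits = (40, 800)       # denominator went down

move_a = describe_move(baseline, more_conversions)
move_b = describe_move(baseline, fewer_visits)
```

Both moves take the ratio from 0.04 to 0.05, so the ratio alone cannot tell them apart; only the component changes show that one is growth and the other is lost traffic.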

About once every two years I have cause to re-learn a very important data lesson: never, ever, trust analysis based on ratio metrics.

10.09.2025 21:09 | 👍 2 | 🔁 0 | 💬 1 | 📌 0

Trying and failing to make the page edits look right? Is offline access lackluster? Tired of AI upsell as a replacement for poor search quality?

You might be suffering from Notion sickness.

30.08.2025 04:13 | 👍 5 | 🔁 0 | 💬 0 | 📌 0

Phone book, noun: an ebook you read on your phone.

26.08.2025 01:11 | 👍 2 | 🔁 0 | 💬 0 | 📌 0

Looking up Latin phrases on Google results in an AI response in French a good % of the time. Fortunately, my French is slightly better than my Latin.

14.08.2025 15:15 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

The scariest thing about dinosaurs is that they were huge, absolutely dominant, *and humans had nothing to do with them dying out*

13.08.2025 01:03 | 👍 2 | 🔁 0 | 💬 0 | 📌 0

"That means employees who bought Philz stock will have their stock canceled, rendering it valueless"

Not even getting strike price back. And it's not like they are getting tech salaries. Rough.

03.08.2025 22:10 | 👍 3 | 🔁 0 | 💬 0 | 📌 0

Andor is a very boolean show.

19.07.2025 05:12 | 👍 8 | 🔁 0 | 💬 3 | 📌 0
[GIF: scene from Serenity, "two by two, hands of blue"]
10.07.2025 04:06 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Just 45 minutes north of SF, and AI means something completely different on a dairy farm in Sonoma, in the context of breeding livestock. I seriously wasn't tracking for a few minutes there.

10.07.2025 00:29 | 👍 0 | 🔁 0 | 💬 1 | 📌 0

@squarecog is following 20 prominent accounts