
@saganite.bsky.social

llm tinkerer. entropy cowboy. iconoclast.

165 Followers  |  67 Following  |  209 Posts  |  Joined: 30.06.2024

Latest posts by saganite.bsky.social on Bluesky

ourworldindata.org/grapher/fert...

05.11.2025 10:59 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Imagine if you were really hungry, but could perfectly satisfy your hunger with as much glorious feasting as you wanted, and this special type of feasting had no personal consequences for you, just fuzzy ones for society at large?

05.11.2025 10:59 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

I think the root cause of the fertility crisis is the most obvious one: two things I care about very deeply, birth control and women's liberation. I think the death of boredom plays a role too.

05.11.2025 10:59 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Yes, it's a little known fact that "Rock" was actually used to refer to a solid aggregate of minerals before it came to be used as a music genre. Could you perhaps propose some alternate name we could use for that one? And "country" was already used to refer to nation states...

05.11.2025 10:26 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Blame me!

26.10.2025 19:06 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I think the list of things that work in ML if you have a dataset with perfect coverage over the problem space is very large!

15.10.2025 22:40 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Cartridges are cool, but you need a good dataset right? Like, your training dataset needs to cover all of your latent space. You can write a prompt that is general for all animals, and then collapse it into a cartridge training on most of them, but if you left out emus would emu queries still work?

15.10.2025 22:37 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Thanks Tim! I would also mention, though, that even if you are rewriting 90% of the code, you can still get a 10% speed improvement just by using this feature. People slave away in the CUDA mines for a 10% speedup, and here it is, sitting right in front of you, with 10% even in the WORST case.

10.10.2025 19:03 β€” πŸ‘ 4    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

I don't do social media so nobody is going to read this, so I'll just @ some of my favorite LLM bsky accounts begging for some reskeets.
@timkellogg.me @cameron.pfiffer.org @timfduffy.com @natolambert.bsky.social @howard.fm

10.10.2025 18:11 β€” πŸ‘ 3    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

The change is in our experimental vLLM fork vLLMx for the moment, but we will be submitting a PR to vLLM main shortly.

10.10.2025 18:11 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

There is already support for this in the OpenAI API specification, and this change brings it to vLLM in a much better form. OpenAI is actually the only other provider I'm aware of that offers this feature, and their version actually results in SLOWER performance, while ours is much faster.

10.10.2025 18:11 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
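For reference, the OpenAI-side feature mentioned above is exposed through a `prediction` field on chat completion requests. A minimal sketch of such a request body, using field names from OpenAI's published API (model name and file contents are placeholders, and nothing is sent over the wire here):

```python
# Sketch of a Predicted Outputs request body for an OpenAI-style
# chat completions API. The "prediction" field carries the static
# text prediction; here, the unmodified source file being edited.
request_body = {
    "model": "gpt-4o-mini",  # placeholder model name
    "messages": [
        {
            "role": "user",
            "content": "Rename function `add` to `sum_two` in this file: ...",
        },
    ],
    "prediction": {
        "type": "content",
        "content": "def add(a, b):\n    return a + b\n",
    },
}
```

Since most of the file survives the edit, most of the output tokens can be accepted straight from the prediction rather than decoded one at a time.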
vLLM Predicted Outputs

Blog Post: cascadetech.ai/blog/vllm-pr...
Demo: app.cascadetech.ai

Think: Speculative decoding, but instead of a draft model (slow, complicated, wrong) you have a static text prediction of the output, and a diff algorithm to keep it aligned when it diverges.

10.10.2025 18:11 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
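The draft-free scheme described above can be illustrated with a toy model: treat whitespace-split words as tokens, accept spans that line up with the static prediction for free, and decode only the diverging spans. Python's `difflib` stands in for the real diff algorithm; this is a sketch of the idea, not the actual vLLM implementation.

```python
import difflib

def predicted_output_savings(predicted: str, actual: str) -> tuple[int, int]:
    """Toy model of Predicted Outputs: 'tokens' (whitespace-split
    words) that line up with the static prediction are accepted in
    parallel; diverging spans must be decoded one token at a time."""
    pred, act = predicted.split(), actual.split()
    matcher = difflib.SequenceMatcher(None, pred, act)
    # Matching blocks are the spans where the prediction held.
    accepted = sum(size for _, _, size in matcher.get_matching_blocks())
    return accepted, len(act)

# A typical code edit: most of the file survives, so most
# output tokens come for free from the prediction.
before = "def add ( a , b ) : return a + b"
after = "def add ( a , b ) : # sum return a + b"
accepted, total = predicted_output_savings(before, after)
# accepted == 12, total == 14
```

The accepted-to-total ratio is the fraction of output tokens that never need a full decode step, which is where the speedup comes from.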

I would like to share some work we've been doing at cascadetech.ai: Predicted Outputs in vLLM. If you aren't familiar with PO, it allows you to dramatically speed up generation when you know something about the contents of the output (think: code modification).

10.10.2025 18:11 β€” πŸ‘ 11    πŸ” 3    πŸ’¬ 1    πŸ“Œ 1

Yep

11.09.2025 23:04 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

Ico

11.09.2025 05:35 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

These are INCREDIBLY complex simulations, including multithreading, physics, and many billions of floating-point operations that have to be deterministic down to the last bit of the mantissa over the course of hours of play. And they are.

11.09.2025 05:33 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

As a former videogame developer, I can tell you that you can definitely build deterministic software on cpus! One really efficient way to do multiplayer is to replicate input across all nodes and then run a fully deterministic simulation on each node.

11.09.2025 05:33 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
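The input-replication scheme described above can be sketched in a few lines (a generic illustration, not any particular engine): each node receives only the shared input stream, advances the same simulation locally, and compares a hash of the final state, the way real lockstep games detect desyncs. Integer-only math sidesteps floating-point drift for the sketch.

```python
import hashlib

def step(state: tuple[int, int], inp: int) -> tuple[int, int]:
    """One deterministic tick: integer-only math, so every node
    computes bit-identical results from the same input."""
    x, v = state
    v = (v + inp) % 1_000_003
    x = (x + v) % 1_000_003
    return x, v

def run(inputs: list[int]) -> str:
    """Run the full simulation from the replicated input stream and
    return a hash of the final state, as a desync check would."""
    state = (0, 0)
    for inp in inputs:
        state = step(state, inp)
    return hashlib.sha256(repr(state).encode()).hexdigest()

inputs = [3, 1, 4, 1, 5, 9, 2, 6]   # the only data sent over the wire
assert run(inputs) == run(inputs)   # every node agrees on the state
```

Only the tiny input stream crosses the network; each node reconstructs the entire world identically, which is why divergence anywhere is immediately visible in the state hash.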

It's hilarious how many accounts on Bluesky are just literally an Onion article about accounts on Bluesky.

07.09.2025 07:59 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Not to mention the Linux tradition of everything being a text stream is very conducive to LLM integration. I just installed desktop Linux on my new computer; pretty happy with it so far. Much smoother experience than the last time I tried.

07.09.2025 07:58 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

But there isn't some OTHER festival down the road for people in their 20s. ALL music festivals are for gen x and older millennials. Rock and roll is dying, but even live music is dying with it.

21.08.2025 06:04 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

There is no genre of music whose live shows draw a crowd under 30. I just went to a music festival, La Route du Rock, which would have been full of 20-year-olds 20 years ago. Now it was mostly people over 40.

21.08.2025 06:04 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Had a long discussion about this with my ethnomusicologist friend last week who teaches history of rock to 18 year olds. Apparently not only are they not forming bands, but they also aren't even attending live music events at all.

21.08.2025 06:04 β€” πŸ‘ 2    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

What is ttc?

12.08.2025 05:50 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

This HAS to be AI slop that humans didn't catch right? Like, this is bullish for gpt-5?

07.08.2025 20:45 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0

Wow, amazing. Are you going to explore beyond Ulaanbaatar? The country is incredible but the capital city is not remotely representative.

20.06.2025 18:17 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

What they are telling those employees is the honest truth, irrespective of their company's business practices.

18.06.2025 13:33 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 0    πŸ“Œ 0

or "Don't bother learning how to effectively use AI, we will still keep employing you even when other prospective employees will work more efficiently for the same salary"?

18.06.2025 13:33 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 2    πŸ“Œ 2

or "Don't worry, even as the employees of our competitors adopt AI to be more efficient, we'll just keep doing things the old way so that we don't have to lay anybody off?"

18.06.2025 13:33 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 2    πŸ“Œ 0

Sorry, what exactly would YOU say if you were those CEOs? "Don't worry, AI isn't going to affect employment in our industry?"

18.06.2025 13:33 β€” πŸ‘ 0    πŸ” 0    πŸ’¬ 2    πŸ“Œ 1

have you tried using Gemini instead of Anthropic? in my experience you can get better quality for a TINY fraction of the price. Gemini 2.0 Flash Lite is 10x cheaper than Haiku, and Flash 2.0 is like 8x cheaper.

02.06.2025 07:37 β€” πŸ‘ 1    πŸ” 0    πŸ’¬ 1    πŸ“Œ 0
