aria 🍊😶‍🌫️👁️

@aurelium.me.bsky.social

she/her I have been to the future, and I don't want to scare you...

322 Followers  |  321 Following  |  283 Posts  |  Joined: 07.10.2023  |  2.3098

Latest posts by aurelium.me on Bluesky

it hasn't made it into any public models yet. i think it's one of my favorite things by DeepSeek, because it's a perfect microcosm of their whole philosophy

it is simultaneously a cool sparse attention scheme and also a highly-efficient attention kernel with proper memory/compute balancing

08.08.2025 02:11 - 👍 2    🔁 0    💬 0    📌 0

in theory you could just do NSA on a sliding window or something but DeepSeek correctly point out that this would murder inference throughput, so a very clever trick would be needed to make this happen

08.08.2025 01:42 - 👍 1    🔁 0    💬 1    📌 0

i think the most likely "solution" to tokenizers being shit is going to be something like DeepSeek's Native Sparse Attention (which allows for compression of adjacent embeddings within a fixed block pattern) but more dynamic

such that the model can just cope with byte-level separation natively

08.08.2025 01:39 - 👍 1    🔁 0    💬 1    📌 0
Post image

new OpenAI chart crime just dropped

07.08.2025 17:17 - 👍 16    🔁 0    💬 1    📌 1

personally I am not confident that GPT-5 actually fixes this

it works by routing requests automatically to one of three models, which will apparently also be selectable individually. unless the router is ~perfect users will see essentially random variation in latency and intelligence. sucks

06.08.2025 02:26 - 👍 1    🔁 0    💬 0    📌 0

what do you mean

you can use 4o or if you want extended thinking you can use o3 or o4-mini (o4-mini is worse than o3, naturally). there's also GPT-4.1, which is better than GPT-4.5

you can also activate various "tools" which actually swap to various unnamed internal models or multi-model systems

06.08.2025 02:23 - 👍 2    🔁 0    💬 1    📌 0

bafflingly, when logged out of ChatGPT or using it on mobile, the UX does not expose the existence of other models to you or tell you which one you're using

this has convinced millions of people that AI has stagnated for the past year (because it's been the same model for the last year)

06.08.2025 02:11 - 👍 1    🔁 0    💬 1    📌 0

this sort of problem is apparently a very high priority for the UX surrounding GPT-5

an OpenAI guy on the Other Site has referred to their new unified model router as the "stop doctors from using 4o-mini" update

06.08.2025 02:09 - 👍 3    🔁 0    💬 1    📌 0

at minimum I am going to be really annoyed if OpenAI calls their open model "GPT-OSS" and it's licensed restrictively

02.08.2025 17:43 - 👍 1    🔁 0    💬 0    📌 0

as we all know, doing a decent job predicting the contents of literally any possible text document is a task that requires no intelligence or understanding of the world

30.07.2025 21:39 - 👍 3    🔁 0    💬 0    📌 0

the inhumanity of a bureaucracy is most visible when it is fucking you over, but in other ways it is helpful to have a system that cares very little about people's sensibilities

most people's sensibilities kind of suck

26.07.2025 17:25 - 👍 23    🔁 0    💬 1    📌 0

they also sometimes remember ethereum exists, which they're pretty sure is the one that eats gaming GPUs

25.07.2025 08:41 - 👍 1    🔁 0    💬 1    📌 0

overall I find my productivity more augmented by the autocomplete than the big LLM, which is a big change from last year when I found the autocomplete grating and nearly always wrong or overzealous

24.07.2025 20:04 - 👍 1    🔁 0    💬 0    📌 0

stuff like Cursor's custom autocomplete model has been getting more advanced on this front but the big LLMs have generally not

at best they get access to linter and compiler errors after writing the buggy code

24.07.2025 20:02 - 👍 1    🔁 0    💬 1    📌 0

one day I really want to make a barebones instruct model purely with base model prompting and RL over rubrics

explicit "no sycophancy" and "you are not ChatGPT" rules to suppress slop

21.07.2025 15:26 - 👍 3    🔁 0    💬 0    📌 0

Somehow despite coming up with a robust rubric evaluation RL system for their safety tuning, Anthropic is still using human gig workers and RLHF for personality. Makes no sense to me, personally.

21.07.2025 15:24 - 👍 2    🔁 0    💬 1    📌 0

a substantial number of people have convinced themselves that the primary driver of price is that every company is making a high-double-digit profit margin on everything, and therefore material or labor inputs represent a basically negligible component compared to "greed"

15.07.2025 17:22 - 👍 2    🔁 0    💬 0    📌 0

it's like butterflies taking detours in their migration to avoid a long-eroded mountain range, pure muscle memory

14.07.2025 20:06 - 👍 3    🔁 0    💬 0    📌 0

we have invented a way to print panels that generate electricity with no fuel and require only dusting and the occasional replacement

and everyone is still acting like the path of least resistance is rationing. this sucks

14.07.2025 20:02 - 👍 4    🔁 0    💬 1    📌 0

if the fired project leaders had their way, the game would've come out next month with two biomes (one of which is just the edge of the map), 7 POIs, 12 fish, and "8-10 hours of story". their original early access target was 2023 with way more content

14.07.2025 06:57 - 👍 2    🔁 0    💬 0    📌 0

in the most "we did it reddit" moment in a while, the subnautica community noticed that SN2 was delayed and the leaders fired and immediately began a campaign to boycott it

only to find out 3 days later that they were fired for cutting most of the content to rush it out after 2 years of delays

14.07.2025 06:56 - 👍 2    🔁 0    💬 1    📌 0

i have come to the opinion that there are a number of people who, consciously or not, believe that the purpose of a protest is to be gunned down and spark civil unrest

13.07.2025 00:17 - 👍 4    🔁 0    💬 0    📌 0

nah, this is "Grok 4", which they seemingly RL'd to act like this

09.07.2025 02:17 - 👍 0    🔁 0    💬 0    📌 0

I never understand why they all have legs

they probably spend as much money on the legs as they do the arms, when the arms are clearly the value-producing part of the machine

07.07.2025 04:52 - 👍 0    🔁 0    💬 0    📌 0

if the answer is "burn into camera module" this will:
1. not work, people will figure it out anyway
2. limit adoption so heavily that it will be like 10 years before "I shot this on an old phone without a verified camera module" is not a valid excuse

06.07.2025 06:09 - 👍 0    🔁 0    💬 0    📌 0

i have never understood what is meant to stop an extracted key from defeating the whole system

06.07.2025 06:03 - 👍 0    🔁 0    💬 1    📌 0

the people who react with "nose-wrinkling contempt" to someone saying they don't use LLMs are dwarfed in scale by the people who invent a new synonym for "degenerate" whenever someone says they used ChatGPT to write an email

05.07.2025 00:37 - 👍 6    🔁 0    💬 0    📌 0

The modal person running a blocklist is more unpleasant than the modal person on one

03.07.2025 15:52 - 👍 91    🔁 7    💬 10    📌 1

i wonder how much of Lifestyle Creep is attributable to "kids have basically no preferences about their parents' lifestyle so everyone just assumes they are maximalists"

02.07.2025 01:16 - 👍 0    🔁 0    💬 0    📌 0

one imagines it will be completely offset by the predictable economic devastation of this bill anyway

01.07.2025 23:37 - 👍 0    🔁 0    💬 0    📌 0
