just had one of those days, where every time I started to get into something I realized I was late to something else
26.02.2026 03:18
the question when planning an AI event is not whether to invite a Jason or even which Jason to invite, it's how many Jasons to invite
25.02.2026 17:32

For instance, try freezing everything except component X, finetune across different Xs (early MLP, late MLP, attention O matrix, etc.), then compare the residual stream, e.g. via PCA right before prediction. H/t to Zihao and Victor arxiv.org/html/2502.11...
24.02.2026 21:51

Turns out tiny, arbitrary-seeming parameter subsets can learn tasks. Has anyone compared what the same task looks like when learned through different components? Can we map how LMs encode information by seeing what stays the same in task representations over different components?
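a minimal numpy sketch of the freeze-one-component experiment above, with a toy two-layer MLP standing in for the transformer (the task, the sizes, and the PCA comparison are all made up for illustration, not from any real setup):

```python
import numpy as np

def finetune_component(tuned, steps=400, lr=0.05, seed=0):
    """Freeze everything except `tuned` ("W1" early, "W2" late) on a fixed toy task."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(256, 8))
    y = np.sin(X @ rng.normal(size=8))            # same task every run (fixed seed)
    W1 = 0.3 * rng.normal(size=(8, 16))           # "early MLP" stand-in
    W2 = 0.3 * rng.normal(size=(16, 1))           # "late MLP" stand-in

    def mse():
        return float(np.mean(((np.tanh(X @ W1) @ W2).ravel() - y) ** 2))

    before = mse()
    for _ in range(steps):
        H = np.tanh(X @ W1)                       # toy "residual stream"
        err = ((H @ W2).ravel() - y)[:, None]
        if tuned == "W2":
            W2 -= lr * H.T @ err / len(X)         # update only the late component
        else:
            W1 -= lr * X.T @ ((err @ W2.T) * (1 - H**2)) / len(X)  # only the early one
    H = np.tanh(X @ W1)
    Hc = H - H.mean(axis=0)
    top_pc = np.linalg.svd(Hc, full_matrices=False)[2][0]  # top PCA direction of hidden states
    return before, mse(), top_pc

b1, a1, pc1 = finetune_component("W1")
b2, a2, pc2 = finetune_component("W2")
print(a1 < b1, a2 < b2, abs(float(pc1 @ pc2)))    # does the task look similar either way?
```

with a real LM you'd freeze via requires_grad and read the residual stream at a fixed layer; `top_pc` here is just the stand-in for that comparison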
24.02.2026 21:51

idea: a tea bag that screams when it's done steeping
21.02.2026 22:50

If a company decides to build a better Siri-like app than Siri, I will donate footage of myself trying to use Siri for simple tasks and becoming increasingly annoyed, for use in your advertisements
21.02.2026 18:45

I would pay so much money for a non-chummy LLM
21.02.2026 18:41
free idea for @perplexity_ai, @Google, etc.:
1. ask an LLM to "describe this page" for each page in your index, teacher force "it's giving", and store whatever it generates next
2. embed the outputs
3. make an interface that lets me browse pages by 'close vibe but far by clicks'
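a toy sketch of steps 1-3 above; `vibes` stands in for the step-2 embeddings (hand-made vectors here, not real LLM outputs) and `links` is a made-up click graph:

```python
from collections import deque
import numpy as np

# Hypothetical data: per-page "vibe" vectors (step 2) and the click graph (step 3).
vibes = {
    "a": np.array([1.0, 0.0]),
    "b": np.array([0.9, 0.1]),
    "c": np.array([0.0, 1.0]),
    "d": np.array([0.95, 0.05]),
}
links = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

def click_distance(src):
    """BFS over the link graph: how many clicks away is every page?"""
    dist, q = {src: 0}, deque([src])
    while q:
        u = q.popleft()
        for v in links[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def close_vibe_far_clicks(page, min_clicks=2):
    """Among pages at least `min_clicks` away, return the one with the closest vibe."""
    d = click_distance(page)
    v = vibes[page] / np.linalg.norm(vibes[page])
    scores = {
        p: float(v @ (u / np.linalg.norm(u)))     # cosine similarity of vibes
        for p, u in vibes.items()
        if p != page and d.get(p, 10**9) >= min_clicks
    }
    return max(scores, key=scores.get)

print(close_vibe_far_clicks("a"))  # → "d": same vibe as "a" but 3 clicks away
```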
when I asked Claude for candidates, it suggested "vehement" and its suggested mispronunciation IS THE WAY I PRONOUNCE IT
13.02.2026 23:54
some words I was confused by as a kid because I never heard someone say them and only saw them in books:
- colonel
- awry
- buoy
- facade
- hors d'oeuvres
- indict
- genuine
- genre
- yacht
- plaid
- San Jose
🧪 The Story is Not the Science.
Code is submitted but rarely executed during peer review, an issue likely to worsen with research agents. 🧑‍🔬
We introduce ๐๐๐๐ก๐๐ฏ๐๐ฅ๐๐ ๐๐ง๐ญ, an execution-grounded evaluation of narrative + execution. Verify the science, not just the story.
1/n
Thinking about this more: I feel like 'the machine that infodumps' is now 'the machine that describes everything in terms of high-level, opaque tradeoffs'
08.02.2026 16:31

why is the 'aisles' feature on Amazon Fresh so bad? is this just a genuinely hard HCI problem or a lack of investment? (sure, it's both, but if you reply 'both' without adding anything you lose)
08.02.2026 02:38

Everyone's complaining about slop...and listen...I'm The Enemy: I want to make AI that will tell story after story that breaks your heart. But where is the celebration of the human mind as a wellspring of delights, given that no AI has produced good media on its own yet?
07.02.2026 22:04

I find it really sad that there was this golden window where talking to ChatGPT while I did the dishes was useful and I could mull over concepts and chew on ideas in dialogue, but the overwhelmingly annoying and hedging persona has made it too vapid for this to be useful
06.02.2026 23:43

New paper on Why Slop Matters w/ a great group of co-authors (@hoytlong.bsky.social @eduede.bsky.social @ari-holtzman.bsky.social + others not on Bluesky) from ACM AI Letters. We try to move the debate re: AI Slop past normative, neg claims & towards parsing its social uses. dl.acm.org/doi/10.1145/...
04.02.2026 18:03

open.substack.com/pub/theholtz...
31.01.2026 17:00

Yesterday, I started to read Factory Girls by Leslie T. Chang. I had to stop for the day after a couple pages.
31.01.2026 17:00

If you scramble token-surface mappings and feed the result to an LM, does it eventually crack the code? If so, does the residual stream look normal (mod the shuffle)? Or is it permanently broken because the model can never fully remap embeddings?
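the scramble itself is information-preserving, which a toy bigram model makes concrete (nothing here involves a real LM; it just shows the statistics survive the relabeling, so the open question is whether gradient descent can recover them):

```python
import math, random
from collections import Counter

def bigram_entropy(tokens):
    """Average next-token conditional entropy (bits) of an empirical bigram model."""
    pairs = Counter(zip(tokens, tokens[1:]))
    ctx = Counter(tokens[:-1])
    return -sum(n * math.log2(n / ctx[a]) for (a, b), n in pairs.items()) / (len(tokens) - 1)

random.seed(0)
vocab = list(range(50))
tokens = [random.choice(vocab) for _ in range(5000)]   # toy corpus

perm = vocab[:]
random.shuffle(perm)                                   # a fixed, secret relabeling
scrambled = [perm[t] for t in tokens]

# Identical entropy: the code is crackable in principle; only the model's
# frozen embedding priors stand in the way.
print(bigram_entropy(tokens), bigram_entropy(scrambled))
```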
30.01.2026 23:44
it's interesting that because of modern media consumption habits, I don't hear about movies that 'got good in the second half' (people stopped watching) but I hear about a lot of sports games that have this structure.
I wonder how much 'starts slow' media I'm missing out on
Confession: I've been in denial for years about how powerful a technique nostalgebraist's Logit Lens is. That said, it's clear there's information it misses. Can we 'delete' the information LL captures from a hidden state/the residual stream and see what's left?
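one cheap version of the 'delete' above, sketched with random matrices: project the hidden state off the unembedding directions of its top-k predicted tokens (W_U, the sizes, and k are all illustrative stand-ins, not from a real model):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab, k = 64, 1000, 5          # illustrative sizes
W_U = rng.normal(size=(vocab, d_model))  # stand-in unembedding matrix
h = rng.normal(size=d_model)             # a residual-stream vector

logits = W_U @ h
topk = np.argsort(logits)[-k:]           # what the Logit Lens "sees" here
B = W_U[topk]                            # unembedding directions of those tokens
P = np.linalg.pinv(B) @ B                # projector onto the span of those directions
h_clean = h - P @ h                      # delete the lens-visible component

print(np.abs(W_U[topk] @ h_clean).max())  # ~0: those logits are wiped out
```

then the question is what the rest of the network can still do with `h_clean`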
29.01.2026 19:48

open LLM releases give us a genuine generalization test, since checking whether models can predict post-release information accurately is more or less the only eval that's difficult to contaminate
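the eval itself is basically a date filter, something like this sketch (every item, date, and the cutoff are invented for illustration):

```python
from datetime import date

RELEASE = date(2026, 1, 1)   # hypothetical model release / training cutoff

# Hypothetical eval items: each question is about a fact with a known date,
# and `correct` records whether the model answered it right.
eval_items = [
    {"q": "Who won X?", "fact_date": date(2025, 6, 1), "correct": True},
    {"q": "Who won Y?", "fact_date": date(2026, 2, 1), "correct": True},
    {"q": "Who won Z?", "fact_date": date(2026, 3, 1), "correct": False},
]

# Only facts dated after release can't have leaked into training data.
post_release = [it for it in eval_items if it["fact_date"] > RELEASE]
accuracy = sum(it["correct"] for it in post_release) / len(post_release)
print(f"{len(post_release)} uncontaminated items, accuracy {accuracy:.2f}")
```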
28.01.2026 16:46

obsessed with this news site that claims to be up to date and cut through the noise but has no articles when you click and is completely empty in every way
28.01.2026 03:45

yeah, I thought this paper was cool! but I think it's not obvious how good the ability to predict the future here is. you'd need way more granular study to see
28.01.2026 03:17
How conditional is concept space in LLMs?
When an LLM wants to emit 'washing machine' with high probability, is there a direction/encoding for that in the residual stream, or is 'washing' promoted first and 'machine' then made likely by the conditional information?
LLMs are bad at 'automate X'. But if I had grown up with LLMs, I think my ability to navigate information and figure out where and how to learn what I wanted would have been easily an order of magnitude more expansive. That's something.
24.01.2026 01:15
we immediately got hatemail about how "we're trapped [sic] and turning to entertainment...makes things far worse."
I genuinely feel sad if you don't see how many folks are building challenging entertainment that expands the human spirit. We should use AI to make new kinds of 'thick entertainment'!
I love this new preprint from Cody Kommers + @ari-holtzman.bsky.social so much. arxiv.org/abs/2601.08768
Super contrarian & generative argument that we need to start better evaluating AI systems for their capacity to delight/entertain, not just perform intelligence/cognition - as cultural machines.
@richardjeanso.bsky.social fight me
16.01.2026 03:35

if you don't let yourself be a little cliché alone at home, just for the joy of it, you may have forgotten how to love life in an honest and unmediated way
16.01.2026 03:35