Stuart Gray

@sgray.bsky.social

He/Him. AI Wrangler. Web Geek. F1 Fan. All views my own. 🤖 AI, LLMs, GenAI, NLP 🐍 Python Dev 🚀 Indie Hacker 🎮 Game Dev, ProcGen, Unity, C# 🏎️ F1 Fan 🇬🇧 UK Based 🦣 mastodonapp.uk/@StuartGray ✖️ x.com/StuartGray (inactive)

561 Followers  |  1,369 Following  |  990 Posts  |  Joined: 06.02.2024  |  1.7761

Latest posts by sgray.bsky.social on Bluesky

Harness engineering: leveraging Codex in an agent-first world By Ryan Lopopolo, Member of the Technical Staff

openai.com/index/harnes...

i have been mildly obsessed with the ideas in here for the past month or so.

incredibly validating, total victory: my suspicions are correct, actually. "This functions like garbage collection" is a sentence I have said out loud. craziness

13.02.2026 16:17 - 👍 114    🔁 14    💬 12    📌 3

Sure, I didn’t mean on a technical level, more that none of the frontier model providers would go near it (except maybe grok), because it would kill off any alignment & safety features.

The models that have done this tend to be smaller, for local use.

13.02.2026 20:34 - 👍 1    🔁 0    💬 0    📌 0

Artificial Humanities is a bestseller on Amazon's kindle! Snatch it while it's free :)

11.02.2026 19:32 - 👍 5    🔁 2    💬 0    📌 0

LLMs can use similes and make allusions; they can be vivid and concrete, &c.

But they cannot spend 100 pages making you think Wickham is the charming love interest while inserting deniable clues that will - only in retrospect! - reveal you should have known he’s a cad.

They’re not trained to mislead.+

13.02.2026 12:02 - 👍 47    🔁 7    💬 6    📌 1

Not only that, it’s hard to see how you could train an LLM to embody these qualities *and* not fall foul of alignment issues.

I mean, if you encouraged all of these things, an LLM or user could use them to freely work around any restrictions or safety guards without any complex jailbreaks.

13.02.2026 14:47 - 👍 3    🔁 0    💬 2    📌 0

In some respects this should be entirely unsurprising.

All of the qualities that combine to make an interesting story, like intrigue, mystery, deception, misdirection, etc., would be absolutely *terrible* qualities in a helpful personal assistant!

LLMs are actively trained against them.

13.02.2026 14:44 - 👍 5    🔁 1    💬 1    📌 0

Having seen what agentic frameworks can do for coding, I suspect that a specialised, fiction-focused writing equivalent, combined with the right set of agentic skill definitions, *might* just be able to pull it off.

13.02.2026 14:11 - 👍 2    🔁 0    💬 1    📌 0

that roleplay tools generally struggle with - developing and advancing a plot, drawing a scene to a close, etc…

My biggest takeaway so far has been that you likely need a combination of modes: planning for plot & constraints, short roleplays for character work, but something’s still missing.

13.02.2026 14:09 - 👍 1    🔁 0    💬 1    📌 0

Characters definition in the prompt - bad for token caching, but better for believability as they don’t get exposed to “private” character backstory.

Once you start to see things like this, you can see varying degrees of it everywhere, not just chars.

And that’s before you get to main weaknesses

13.02.2026 14:09 - 👍 1    🔁 0    💬 1    📌 0

If you haven’t already, you should probably try role playing a bit first.

It’ll give you a good insight into some of the problems LLMs have with story writing and ways to try and work around them with varying degrees of success.

E.g. SillyTavern’s group chat defaults to only including the current

13.02.2026 14:09 - 👍 1    🔁 0    💬 1    📌 0
Meta Plans to Add Facial Recognition Technology to Its Smart Glasses

Facebook plans to put facial recognition in its glasses and they think we’re too stupid to fight back.

Their internal memo: “We will launch during a dynamic political environment where many civil society groups that we would expect to attack us would have their resources focused on other concerns.”

13.02.2026 12:16 - 👍 1613    🔁 921    💬 91    📌 294
AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise ArXiv link for AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise

AUTODISCOVERY leverages Bayesian surprise for hypothesis generation, achieving 5-29% more unexpected findings than previous methods, promising to accelerate scientific exploration and transform AI-driven research. https://arxiv.org/abs/2507.00310

13.02.2026 10:11 - 👍 1    🔁 1    💬 0    📌 1

i am linking this to the next person who tells me that llms do not make money remotely comparable to their investment. anthropic's annual revenue is 1/2 the size of their funding round

also lol a billion dollars of that is claude code subs they've sold in the last month

13.02.2026 06:28 - 👍 149    🔁 16    💬 13    📌 2
UK ban on Palestine Action unlawful, high court judges rule Protest group’s co-founder wins legal challenge against decision to proscribe it under anti-terrorism laws

Wow!

Palestine Action win judicial review - Guardian report.

This is a *big* legal win.

www.theguardian.com/uk-news/2026...

13.02.2026 10:11 - 👍 1777    🔁 754    💬 24    📌 0

Introducing FragCoord: My ultimate shader editing tool!

13.02.2026 02:20 - 👍 235    🔁 69    💬 10    📌 6
A new AI-powered weather model could be key to the future of your forecast – but there’s a catch | CNN Accurately predicting the weather is hard - really hard, but a new AI-powered forecast model just hit a milestone that has experts saying your forecast could soon get more accurate, and further out, t...

Significantly more accurate up to 15 days ahead, there’s plenty of coverage available including a paper in Nature:

edition.cnn.com/2024/12/13/w...

12.02.2026 23:21 - 👍 0    🔁 0    💬 1    📌 0
WeatherNext 2 WeatherNext 2 is our most accurate AI weather forecasting technology.

That’s not true in the slightest. There’s a wide range of AI systems, some of which are specifically designed for predicting things and can be quite good at it.

Take Google’s recent AI weather forecaster: 8x faster than conventional models, 5-10 times longer range

deepmind.google/science/weat...

12.02.2026 23:01 - 👍 1    🔁 0    💬 2    📌 0
The $285 Billion 'SaaSpocalypse' Is the Wrong Panic AI labs aren’t winning by intelligence alone, SaaS isn’t dying, and the real battle is over who becomes the system of action in the agentic enterprise. That battle is a symetric one.

tl;dr - be a system of record

www.decodingdiscontinuity.com/p/285-billio...

12.02.2026 16:15 - 👍 5    🔁 1    💬 0    📌 0

There isn’t one and they know it, which is why they’re trying to move up the stack and integrate with apps, services, and enterprise suites before they run out of runway.

12.02.2026 20:10 - 👍 0    🔁 0    💬 0    📌 0

Why the fuck do companies pull this shit?

It’s already hard enough to pick a frontier AI provider based on their principles & values, and Anthropic was about the only reasonable contender for those who cared.

Time to re-evaluate Mistral & the Chinese I guess 🤬

12.02.2026 16:03 - 👍 5    🔁 2    💬 0    📌 0

I love this.

But also, the crazy small size makes it seem like one of those old school “code golf” challenges to see how small it can be made & still function 😂

12.02.2026 14:10 - 👍 0    🔁 0    💬 0    📌 0
microgpt microgpt. GitHub Gist: instantly share code, notes, and snippets.

@karpathy.bsky.social 's microgpt.py

Train and inference GPT in 243 lines of pure, dependency-free Python.

gist.github.com/karpathy/862...

11.02.2026 23:56 - 👍 83    🔁 14    💬 1    📌 3

Rates dogs guy is all out of fucks

12.02.2026 12:28 - 👍 7    🔁 3    💬 0    📌 0
How I used Claude Code in a real data journalism project This morning three colleagues and I published a story outlining how the federal government is using AI. Here’s how I used Claude Code to help. Agencies are required to publish a spreadsheet of AI use ...

New blog post: A case study on using Claude Code in data journalism ->
kschaul.com/post/2026/02...

09.02.2026 21:36 - 👍 17    🔁 5    💬 2    📌 0

You can turn on “require alt text” for images in the Bluesky accessibility settings to prevent that

12.02.2026 09:52 - 👍 2    🔁 0    💬 1    📌 0

It’s also very weirdly blinkered in ignoring Anthropic’s own well regarded research into how its own models work?

It’s a bizarre level of corporate self-ignorance.

I get, and applaud to a degree, Anthropic’s free rein, but surely she should be a *bit* curious about what her colleagues *do* know?

11.02.2026 16:52 - 👍 2    🔁 0    💬 0    📌 0
Embedding Links in Bluesky Posts Learn how to embed website links in your Bluesky posts with rich previews while saving character space.

E.g. www.bskyinfo.com/guides/getti...

11.02.2026 16:46 - 👍 1    🔁 0    💬 0    📌 0

Actually, you might be able to squeeze a little more data out by using a custom link embed?

Link to a service that generates an embed with the extra data, before deleting the URL from the post.

I’m not sure if/how Bluesky persists embed info though? I presume it does because the URL gets deleted?

11.02.2026 16:45 - 👍 2    🔁 0    💬 1    📌 0

Ahh, for a minute I thought they were getting all fancy and suggesting that maybe a data URL would take advantage of some hidden Bluesky rules to save precious chars.

But yeah, offsite is kinda cheating. You could abuse image alt text too, but that’s another bad smell :/

11.02.2026 16:35 - 👍 2    🔁 0    💬 1    📌 0

I know you’re joking, but now you’ve got me wondering what kind of post size a custom client with compression support could reach & still stick within the underlying ATProto constraints to remain compatible with Bluesky 🤔

11.02.2026 14:17 - 👍 2    🔁 1    💬 2    📌 0

@sgray is following 20 prominent accounts