
James MacGlashan

@jmac-ai.bsky.social

Ask me about Reinforcement Learning Research @ Sony AI. AI should learn from its experiences, not copy your data. My website for answering RL questions: https://www.decisionsanddragons.com/ Views and posts are my own.

2,395 Followers  |  1,425 Following  |  845 Posts  |  Joined: 28.09.2024

Posts by James MacGlashan (@jmac-ai.bsky.social)

[Link preview]
Florence ICE detainee dead after untreated tooth infection, official says An ICE detainee, who had been at a Florence detention center for four months, died Monday following an untreated tooth infection.

An ICE detainee in Arizona has died of a TOOTH INFECTION after it went untreated for weeks, a local official says. He was a Haitian asylum seeker imprisoned in Florence, Arizona. @emilybregel.bsky.social reports.
tucson.com/news/local/b...

04.03.2026 16:24 — 👍 2604    🔁 1578    💬 153    📌 310
[Video]

[TW: graphic fracture, sound of breaking bone]

Sen. Tim Sheehy (R-Montana) badly breaking the arm of a Marine veteran protesting the war with Iran.

04.03.2026 22:05 — 👍 7228    🔁 3568    💬 818    📌 1281

Absolutely grotesque.

04.03.2026 22:57 — 👍 2    🔁 0    💬 0    📌 0
[Video]

Jayapal to Noem: "I want to introduce you to just four of the US citizens unlawfully detained by ICE ... they're in this room with us."

04.03.2026 17:26 — 👍 3806    🔁 1110    💬 69    📌 54

Man, "rewrite it in Rust" is being made pretty easy with LLMs.

(Ideally there'd be more significant design changes for a difference in language, but still...)

04.03.2026 20:03 — 👍 10    🔁 0    💬 0    📌 0

What a fucking psychopath.

04.03.2026 15:59 — 👍 3    🔁 0    💬 0    📌 0

Indeed! I have a preview of it!

04.03.2026 15:04 — 👍 2    🔁 0    💬 1    📌 0

TIL Yann and I have similar ideas about how to change publishing.

The additions I'd make are
- Make conferences smaller w/ presentations by invite (similar to @togelius.bsky.social suggestion)
- Build social media tooling for sharing/discussion, and paper metadata to support it.

04.03.2026 14:45 — 👍 6    🔁 1    💬 1    📌 0
[Image]

Once again, a long post with strong opinions. It's probably twice as long as it should be, it's also repetitive and written in affect. And you probably disagree with my argument. So maybe you shouldn't read it. On the other hand, most things worth reading are written in affect.

02.03.2026 01:48 — 👍 33    🔁 7    💬 3    📌 4

The loss of "agent" is personally painful. I used to use that term to differentiate the kind of AI research I do from genAI. "I work on intelligent agents."

Now people assume that means "I wire up APIs to LLMs."

04.03.2026 00:10 — 👍 5    🔁 0    💬 0    📌 0
[Video]

A clip from the moment Hillary Clinton found out Lauren Boebert took an unauthorized photo of her deposition: 'I'm done!'

02.03.2026 21:44 — 👍 14634    🔁 3886    💬 856    📌 621

I cannot sufficiently convey how horrifying this graph is.

02.03.2026 15:30 — 👍 16    🔁 0    💬 1    📌 0

I'd be happy to have it with you should the opportunity ever arise! :)

02.03.2026 06:34 — 👍 1    🔁 0    💬 0    📌 0

I opined because decision-making agents are literally my area of expertise. It is of deep importance to me to build machines with human-like cognitive abilities & I am painfully aware of how our current methods are lacking.

I don't need to be a philosopher to have a relevant view on it.

02.03.2026 06:30 — 👍 1    🔁 0    💬 1    📌 0

If you really want to insist on embracing vague and useless definitions, have fun I guess.

It doesn't change the fact that there are meaningful distinctions you can draw for properties people possess. I guess you'll have to come up with a new name for these important differences.

02.03.2026 06:22 — 👍 0    🔁 0    💬 0    📌 0

I'm not concerned with philosophers bc they often insist on vague definitions as a matter of course.

Computer scientists and mathematicians have been able to separate agency/decision making straightforwardly for decades and made progress on them as a result.

The definition is clean and useful.

02.03.2026 06:18 — 👍 0    🔁 0    💬 2    📌 0

A running average is not sufficient for the class of problem I described. It lacks a goal for the environment and an optimization process for that goal with the model, similar to how I pointed out elsewhere that deploying a frozen policy from RL loses the agency. Same with a classifier.

02.03.2026 06:18 — 👍 0    🔁 0    💬 2    📌 0
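The contrast above (a passively tracked statistic vs. estimates that drive actions toward an environment-defined goal) can be sketched in a few lines. This is my own hypothetical illustration, not code from the thread; `running_average` and `bandit_agent` are made-up names:

```python
import random

# A running average passively summarizes a stream of observations.
# Nothing it computes feeds back into action selection, and there is
# no environment objective being optimized.
def running_average(stream):
    mean, n = 0.0, 0
    for x in stream:
        n += 1
        mean += (x - mean) / n
    return mean

# Contrast: an epsilon-greedy bandit agent. It also keeps running
# averages (one per arm), but those estimates govern actions chosen to
# maximize an environment-defined goal (expected reward), closing the
# agent/environment loop.
def bandit_agent(arm_means, steps=5000, eps=0.1, seed=0):
    rng = random.Random(seed)
    k = len(arm_means)
    est, counts = [0.0] * k, [0] * k
    total = 0.0
    for _ in range(steps):
        # Mostly exploit the current estimates; sometimes explore.
        if rng.random() < eps:
            a = rng.randrange(k)
        else:
            a = max(range(k), key=est.__getitem__)
        r = arm_means[a] + rng.gauss(0, 0.1)  # noisy reward from the environment
        counts[a] += 1
        est[a] += (r - est[a]) / counts[a]    # same update rule, now serving a goal
        total += r
    return total / steps
```

The agent's average reward approaches the best arm's mean rather than the overall mean of the arms, which is the behavioral signature the bare statistic lacks.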
[Link preview: Rule 110 - Wikipedia]

Likewise, that the conditions for some form of agency are simple does not mean everything possesses agency.

(And if you're not familiar with my analogy to rule 110, see here. It's a super cool result.)
en.wikipedia.org/wiki/Rule_110

01.03.2026 23:35 — 👍 1    🔁 0    💬 0    📌 0

Yes, it is easy to define programs with limited agency. RL research started with toy problems that were exactly that! Nevertheless, it remains distinct.

Analogously, rule 110 is crazy simple and Turing complete. Despite that modest requirement, most CAs (and most other systems) are still not Turing complete.

01.03.2026 23:35 — 👍 0    🔁 0    💬 1    📌 0
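To show just how modest rule 110's machinery is, here's a generic sketch of it (my own illustration, not code from the thread; the encoding is the standard Wolfram numbering, where bit k of 110 gives the new cell state for neighborhood value k):

```python
# Rule 110: each cell's next state depends only on its (left, self, right)
# neighborhood. 110 = 0b01101110, read over neighborhoods 111..000.

def rule110_step(cells):
    """Advance one generation of a 1-D binary CA under rule 110 (wraparound)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        neighborhood = (left << 2) | (center << 1) | right  # value in 0..7
        out.append((110 >> neighborhood) & 1)  # bit k of 110 is the new state
    return out

# Evolve a single live cell for a few generations and print the rows.
row = [0] * 15 + [1] + [0] * 15
for _ in range(8):
    print("".join("#" if c else "." for c in row))
    row = rule110_step(row)
```

That one update table is the entire system, and it is nonetheless Turing complete, which is why it makes a good analogy for "simple conditions don't imply the property is everywhere."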
Sutton & Barto Book: Reinforcement Learning: An Introduction

I would start w/ the Sutton, Barto RL book. The second edition is free online here:
incompleteideas.net/book/the-boo...

Video interviews with Rich Sutton might be a helpful, gentler intro too.

This post on my RL website here might also be helpful:
www.decisionsanddragons.com/posts/should...

01.03.2026 23:26 — 👍 1    🔁 0    💬 0    📌 0

That kind of system is where people might argue LLMs capture some form of agency implicitly. But it's weak, and the lack of a clear objective during inference really hinders it and is why it's susceptible to prompt injection attacks.

01.03.2026 23:21 — 👍 2    🔁 0    💬 1    📌 0
[Link preview]
Human-Timescale Adaptation in an Open-Ended Task Space Foundation models have shown impressive adaptation and scalability in supervised and self-supervised learning problems, but so far these successes have not fully translated to reinforcement learning (...

Meta-RL, where you deploy a model that has internalized an RL optimization process with environment interaction for a well-defined and regularly measured objective, can be said to retain agency. For example, see the AdA work (arxiv.org/abs/2301.07608)

01.03.2026 23:21 — 👍 1    🔁 0    💬 1    📌 0

Contrast with classic classification and a deployed model which does not possess that. There is no internal model being computed and optimized in the deployed classifier.

Similarly, if you deploy a frozen policy learned from RL, you've lost the agent part. Still useful, but not an agent anymore.

01.03.2026 23:21 — 👍 1    🔁 0    💬 1    📌 0

Agency is best described as behavior governed by optimizations of models of an environment in which the system is embedded. This is principally what classic RL focuses on.

01.03.2026 23:21 — 👍 1    🔁 0    💬 2    📌 0
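A minimal instance of that definition — behavior derived by optimizing a model of the environment against a goal — can be sketched with value iteration on a toy MDP. The MDP numbers and names here are entirely my own hypothetical example, not anything from the thread:

```python
# Toy deterministic MDP model: model[state][action] = (next_state, reward).
# The agent's behavior falls out of optimizing this model for discounted
# cumulative reward, rather than being a fixed input-output mapping.
model = {
    0: {"stay": (0, 0.0), "go": (1, 1.0)},
    1: {"stay": (1, 2.0), "go": (0, 0.0)},
}
gamma = 0.9  # discount factor: the goal is discounted cumulative reward

def value_iteration(model, gamma, sweeps=200):
    """Optimize state values against the model via repeated Bellman backups."""
    V = {s: 0.0 for s in model}
    for _ in range(sweeps):
        V = {s: max(r + gamma * V[s2] for (s2, r) in model[s].values())
             for s in model}
    return V

def greedy_policy(model, V, gamma):
    """Behavior governed by the optimized model: pick the action whose
    modeled outcome has the highest value."""
    return {s: max(model[s],
                   key=lambda a: model[s][a][1] + gamma * V[model[s][a][0]])
            for s in model}

V = value_iteration(model, gamma)
pi = greedy_policy(model, V, gamma)
```

Freezing `pi` and shipping it alone would discard exactly the part this definition cares about: the ongoing optimization of a model of the environment.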

People and animals certainly. Probably insects in limited ways. I'd be surprised to find much if any in plants/fungus but the biological world is often more complex than appears and I'm not a biologist so I won't comment on that.

01.03.2026 23:21 — 👍 1    🔁 0    💬 1    📌 0

Really? Do you think RNA computes models of its environment, and solutions that maximize objectives of that environment which govern its behavior?

01.03.2026 23:10 — 👍 0    🔁 0    💬 1    📌 0

So even if we granted some kind of limited agency to LLMs through in-context processing, its limitations make for a stark contrast with human and even animal agency.

01.03.2026 23:06 — 👍 2    🔁 0    💬 0    📌 0

The best argument for LLMs having agency is a hope that it emerges as a function of the context (and trained as a process w/ e.g., RL). But

1. engineering by hope is terrible; and
2. we can poke enough holes to show this isn't strong if it exists at all. E.g., consider prompt injection attacks.

01.03.2026 23:06 — 👍 1    🔁 0    💬 1    📌 0

From that, we can show that LLMs don't have agency in the way they appear to. You can gesture at RL post training, but people interact with a frozen model, not the RL system.

01.03.2026 23:06 — 👍 1    🔁 0    💬 1    📌 0

This is too reductive. Agency isn't hard to define as a mathematical problem statement, and it's trivial to observe that people and animals have it whereas other systems do not.

One of the early distinguishing factors of RL research vs other AI work was tackling the agency problem.

01.03.2026 23:06 — 👍 1    🔁 0    💬 4    📌 0