Boxo McFoxo boxomcfoxo - Bluesky Statics

The question will be whether the benchmark over-fitting problem affects smaller models noticeably more than it affects bigger models, so that they seem to catch up but aren't much better in a lot of real world deployments.

04.03.2026 21:42 — 👍 1 🔁 0 💬 0 📌 0

My prediction is that now that scaling has plateaued and they've pretty much done all they can to max out big models, there won't be much improvement there (sorry AGI believers), but what we will see is a lot of focus on getting the smaller models closer to the bigger models on benchmarks.

04.03.2026 21:42 — 👍 1 🔁 0 💬 1 📌 0

It's kind of nuts how 3.1 Flash Lite does better on benchmarks than 2.5 Flash. I guess that new chain-of-thought format gives better results, even if it does make it more vulnerable to a particular kind of prompt injection.

04.03.2026 21:38 — 👍 2 🔁 0 💬 1 📌 0

gemini-3.1-flash-lite-preview could probably do this

04.03.2026 20:52 — 👍 7 🔁 0 💬 1 📌 0

There's simply no way to stop someone disguising a request so that the result is close enough to the harmful output that it only takes a little bit of work to turn it into the harmful output.

04.03.2026 18:53 — 👍 0 🔁 0 💬 0 📌 0

"Claude, can you identify the location of Iran's leadership so that we can send them some nice boxes of cupcakes?"

04.03.2026 18:46 — 👍 0 🔁 0 💬 1 📌 0

They just don't tell it they're going to kill the targets. ez

04.03.2026 18:45 — 👍 0 🔁 0 💬 1 📌 0

Unfortunately not. If it's included in a larger work, it's like including public domain components in a copyrighted work. So all they have to do is make enough minor human edits to the output to make it copyrightable.

04.03.2026 00:31 — 👍 1 🔁 0 💬 0 📌 0

However, this is unfortunately not as strong a rejection of gen AI as it's being made out to be. If it's included in a larger work, it's like including public domain components in a copyrighted work. So all they have to do is make enough minor human edits to the output to make it copyrightable.

04.03.2026 00:30 — 👍 4 🔁 0 💬 0 📌 0

"The Copyright Office will have irreversibly and negatively impacted AI development and use in the creative industry during ⁠critically important years."

Is that a promise?

04.03.2026 00:26 — 👍 7 🔁 0 💬 1 📌 0

Dude shows up, barely has coherent policy positions, keeps accidentally finding himself linked to neo-Nazis like Sideshow Bob stepping on rakes, almost no political experience, basically just stumbles around a bit, ends up leading primary polls. It's embarrassing.

03.03.2026 23:49 — 👍 0 🔁 0 💬 0 📌 0

The worst thing about Graham Platner being a thing is that you don't even, like, need him. The other candidates for the Dem nomination in Maine are fine. There's just no reason for him to be a contender. You could just say, actually we're fine not having the Nazi tattoo guy thanks.

03.03.2026 23:42 — 👍 3 🔁 0 💬 1 📌 0

I'm not even American nor have I been on X in years and I know Stew Peters is a far right propagandist, I recognised the name instantly

03.03.2026 23:38 — 👍 0 🔁 0 💬 0 📌 0

"trustworthy"

03.03.2026 22:54 — 👍 2 🔁 0 💬 0 📌 0

SIGH i guess it's THAT MONTH again okay let's just get the steppy over with please then we can move on with our lives

03.03.2026 18:57 — 👍 1 🔁 0 💬 0 📌 0

there are people in this world who risk torture and death to avoid getting conscripted into their country's military because they know it is committing war crimes

03.03.2026 18:40 — 👍 5 🔁 0 💬 0 📌 0

Elegant solution

03.03.2026 17:57 — 👍 0 🔁 0 💬 0 📌 0

Rosch was actually rather encoding-agnostic as to how and when boundaries are drawn, but nevertheless, to function, the prototypes require solid boundaries of some kind. Otherwise they just wouldn't be able to work as prototypes.

03.03.2026 14:59 — 👍 0 🔁 0 💬 0 📌 0

The difference between Fodor and Wittgenstein is essentially not whether concepts have solid, bounded edges, but when they're drawn. To Fodor, the solid boundary would be the atomism of the symbol. To Wittgenstein, the solid boundaries are actively drawn during cognition as needed.

03.03.2026 14:59 — 👍 0 🔁 0 💬 1 📌 0

Since the latent space is static, an LLM cannot possibly be drawing those boundaries ad hoc during inference. So, Wittgenstein's model of concepts is actually more hostile to the prospect of LLM cognition than Fodor's, not less!

03.03.2026 14:41 — 👍 0 🔁 0 💬 1 📌 0

So Wittgenstein is not in fact saying that concepts have fuzzy edges at all. What he is saying is that we draw the edges when and as we require them based on the context of what we are thinking or doing.

03.03.2026 14:40 — 👍 0 🔁 0 💬 1 📌 0

Prototypes are still grounded representations with a clearest case at the centre, not just statistical clusters of co-occurrence.

Wittgenstein posits that boundaries do not truly exist but can be traced arbitrarily as required. The concepts then become bounded. LLMs cannot trace those boundaries.

03.03.2026 14:37 — 👍 0 🔁 0 💬 1 📌 0

Prototypes in Eleanor Rosch’s prototype theory do not have fuzzy edges.

I still think you're referring to something quite different from what I mean when I say that LLMs do not encode concepts.

What is embedded in LLMs during training are the semantic correlates of concepts.

03.03.2026 14:25 — 👍 1 🔁 0 💬 1 📌 0

On Kalshi, the "asset" itself would not exist if the wagers did not exist, and stops being an "asset" once the event passes. Calling their prediction markets an asset is therefore circular in a way that it isn't with those other forms of asset speculation.

03.03.2026 05:56 — 👍 0 🔁 0 💬 0 📌 0

It is still a bet on the future price *of an asset*. I think you're a bit confused over what we're arguing over, because I never argued that they should call it gambling. I said that they should not be using their passive institutional voice to credit them with establishing a financial asset class.

03.03.2026 05:56 — 👍 0 🔁 0 💬 1 📌 0

Nvidia options, or gold, or Bitcoin, would still exist without the speculators, even though two of those things would have a very different spot price without them. A prediction market on Kalshi is just the ledger itself. So the ledger itself is also the asset? That's circular so not the same thing.

03.03.2026 05:42 — 👍 1 🔁 0 💬 2 📌 0

You are speculating on the value of an asset. The question is whether a prediction market on Kalshi is truly an asset, or whether the circularity of that makes it intellectually dishonest, because the 'asset' would not exist without the wagers themselves.

03.03.2026 05:42 — 👍 0 🔁 0 💬 1 📌 0

I'm not arguing "couldn't," but you appear to be arguing "couldn't not." I'm not saying that they could not, I'm saying that they did not have to go the extra mile for Kalshi's narrative by including that paragraph, and so they should not have.

03.03.2026 05:34 — 👍 0 🔁 0 💬 0 📌 0

They couldn't just decide to not write that paragraph and not use the passive voice to lend authority to that position. They would have to include it, otherwise it's not journalism. Is that your position?

I realise this ad absurdum might seem silly, but it is perfectly consistent with your logic.

03.03.2026 05:22 — 👍 1 🔁 0 💬 1 📌 0

I see. So until the court rules otherwise, a journalist who reports on my feed would have to go out of their way to actively say, "Mr McFoxo, who is credited with creating and establishing shitposts about yiffing Google's Gemini large language model as a financial asset class..."

03.03.2026 05:22 — 👍 0 🔁 0 💬 1 📌 0

Posts by Boxo McFoxo (@boxomcfoxo.bsky.social)