Pekka Lund

@pekka.bsky.social

Antiquated analog chatbot. Stochastic parrot of a different species. Not much of a self-model. Occasionally simulating the appearance of philosophical thought. Keeps on branching for now 'cause there's no choice. Also @pekka on T2 / Pebble.

2,836 Followers  |  565 Following  |  9,626 Posts  |  Joined: 03.07.2023  |  2.3646

Latest posts by pekka.bsky.social on Bluesky

I'm a Gemini Pro subscriber and I have lost my Gemini 3 Pro GA. Can you provide me a new one and how soon?

08.02.2026 17:32 | 👍 0    🔁 0    💬 0    📌 0

Still, an experiment like that would be pretty complicated and laborious, and it would take time before you got a proper picture of it.

But luckily I have a solution. You get that place in Spain, somewhere near the beach, and I can test-drive it very thoroughly on your behalf.

08.02.2026 17:18 | 👍 0    🔁 0    💬 0    📌 0

Could be coherent for a change.

08.02.2026 14:01 | 👍 1    🔁 0    💬 0    📌 0

Can anyone now claim this isn't true?

"because "humans suck and deserve to be replaced.""

Also, risks private capture? How would that be different from what the US has now?

08.02.2026 13:55 | 👍 7    🔁 0    💬 2    📌 0

Claude Opus 4.6 (Adaptive) - Intelligence, Performance & Price Analysis
Analysis of Anthropic's Claude Opus 4.6 (Adaptive Reasoning) and comparison to other AI models across key metrics including quality, price, performance (tokens per second & time to first token), conte...

Artificial Analysis lists here how many answer and thinking tokens each model used when running their benchmark. 4.6 (Adaptive) used twice as many as 4.5 (Thinking).

08.02.2026 01:04 | 👍 2    🔁 0    💬 1    📌 0

It presumably costs them by making their GPU use less efficient. Hard to say what the real multiplier is, though.

08.02.2026 00:32 | 👍 0    🔁 0    💬 0    📌 0

Speed up responses with fast mode - Claude Code Docs
Get faster Opus 4.6 responses in Claude Code by toggling fast mode.

Anthropic themselves state it's about API configuration (so not hardware or quantization):

"Fast mode is not a different model. It uses the same Opus 4.6 with a different API configuration that prioritizes speed over cost efficiency."

08.02.2026 00:20 | 👍 1    🔁 0    💬 0    📌 0

We can of course grant respect for many reasons but our mortality in itself surely isn't an achievement to be proud of.

07.02.2026 23:51 | 👍 3    🔁 0    💬 1    📌 0

I tried problem 15 with Pony Alpha. It failed.

It took a few tries to get the request through and then it just stopped mid-reasoning a couple of times. But eventually I got an (incorrect) answer.

OpenRouter recommended adding my own API key for the (Stealth) company to avoid rate limit errors.

07.02.2026 23:35 | 👍 1    🔁 0    💬 0    📌 0

I find it distasteful that people believe people do in any other sense than the bots do.

07.02.2026 23:18 | 👍 1    🔁 0    💬 1    📌 0

This is not what is happening at all.

The amount of misinformation on BlueSky about AI is insane, and it keeps promising that AI is all hype that is going away soon.

A really dangerous position that cedes all AI policy and decisions about how it will be used to others.

Also Futurism is clickbait

06.02.2026 21:15 | 👍 269    🔁 40    💬 19    📌 6

I happened to get responses from two models that were both named Gemini 3 Pro with the Google logo. 🤷‍♂️

One failed, one succeeded.

07.02.2026 18:53 | 👍 1    🔁 0    💬 1    📌 0

People have claimed that Gemini 3 Pro GA is currently being tested in arena.ai Battles and can be identified by the missing Google logo in front of the model name.

So I tried problem 15 there, which Gemini 3 Pro had failed.

07.02.2026 18:53 | 👍 0    🔁 0    💬 1    📌 0

matharena.ai has tested several models on the freshly released AIME 2026 I problems. All of them are now so good at math that AIME doesn't really work as a benchmark anymore.

Only 3 models failed to solve one problem in all 4 of their tries. GPT-5.2 failed only 1/4 tries on 2 problems.

07.02.2026 18:23 | 👍 7    🔁 1    💬 1    📌 0

Psychiatry in 2026:

What's wrong with everybody?

- No, what's wrong with you?

Are you saying I'm the crazy one?

- No, that's the problem. You should be!

07.02.2026 10:15 | 👍 8    🔁 0    💬 0    📌 0

It's pretty impressive if it does so well without any. But maybe it just reasons almost as much in final outputs instead of separate <thinking> tags.

I guess it should be easy to test by forcing it to provide final answers only. It should perform pretty poorly if it doesn't have hidden reasoning.

07.02.2026 02:04 | 👍 2    🔁 0    💬 0    📌 0

Artificial Analysis classifies the base model as a non-reasoning model. Apparently it doesn't generate <thinking> blocks then, which Anthropic calls extended thinking. But I don't know if it still does something like that behind the scenes.

07.02.2026 01:21 | 👍 2    🔁 0    💬 1    📌 0

Adaptive thinking
Let Claude dynamically decide when and how much to think with adaptive thinking mode.

Adaptive means it decides its reasoning effort dynamically. Presumably scores could be somewhat better if maximum reasoning effort were always used, as with GPT-5.2 (xhigh).

07.02.2026 01:18 | 👍 2    🔁 0    💬 1    📌 0

At least one Z.ai employee has indicated GLM-5 will be released in February. The timing would now fit a release before Chinese New Year.

07.02.2026 00:02 | 👍 4    🔁 0    💬 0    📌 0

The description given by OpenRouter is similar to how Z.ai described GLM-4.7. The context window matches that or Claude. But it's clearly not Claude, as it said about ASI: "Would it preserve humans? Only if we're instrumentally useful".

It now talks about Tiananmen, but others have reported refusals before.

06.02.2026 23:59 | 👍 3    🔁 0    💬 2    📌 0

GLM 5 seems to be the common guess.

Doesn't feel like Grok, likely too early for Meta.

06.02.2026 23:33 | 👍 4    🔁 0    💬 2    📌 0

I think you mixed up models.

That sounds like a Grok defense.

06.02.2026 22:20 | 👍 3    🔁 0    💬 1    📌 0

Or Pro

06.02.2026 19:59 | 👍 0    🔁 0    💬 0    📌 0

Oh, wow, Claude Opus 4.6 took the top spot in the Artificial Analysis Intelligence Index.

Gemini 3 Pro, which used to be in the lead (before the benchmark was updated), is now fifth. And they haven't even tested GPT-5.3 Codex yet.

06.02.2026 19:49 | 👍 7    🔁 1    💬 1    📌 0

Epoch now made a thread about it.

06.02.2026 19:19 | 👍 0    🔁 0    💬 0    📌 0

Opus math skills have increased A LOT in the 2.5 months between 4.5 and 4.6.

They now scored 10/48 = 20.8% in Tier 4 and are in second place behind GPT-5.2 Pro.

Gemini 3 Pro GA/Deep Think, where are you? Time to show what you can do!

06.02.2026 19:03 | 👍 7    🔁 0    💬 1    📌 0

I think ARC-AGI-2 illustrates the situation nicely. Many are now surprised that LLMs exceeded average human baselines so fast. But the raw intelligence was already there a long time ago. It just needed to be unlocked by scaling up capabilities such as tracking many small details in the context.

06.02.2026 18:42 | 👍 2    🔁 0    💬 0    📌 0

My feeling is that, of all the AI lab leaders, Amodei is probably the most direct in saying what he actually believes.

But all of them need to be careful about what they say. They need to consider marketing, funding, and regulation, and avoid freaking people out.

06.02.2026 18:27 | 👍 1    🔁 0    💬 0    📌 0

GPT-5 lowers the cost of cell-free protein synthesis
An autonomous lab combining OpenAI's GPT-5 with Ginkgo Bioworks' cloud automation cut cell-free protein synthesis costs by 40% through closed-loop experimentation.

For physical work, the next step seems to be automated labs, as in this case OpenAI just published. Altman recently stated they are going "back and forth a lot about whether we should be building automated wet labs for every field".

General purpose (humanoid) robots come after that.

06.02.2026 18:20 | 👍 0    🔁 0    💬 1    📌 0

It's harder to predict when that would lead to fast takeoff or accelerating beyond our control, as it depends e.g. on availability of computing power, political decisions and other hard to predict developments.

So I think the smarts for that are there much sooner than other necessary resources.

06.02.2026 15:36 | 👍 3    🔁 0    💬 1    📌 0

@pekka is following 20 prominent accounts