Mitch Allen

@mitchallen.bsky.social

- Full-Stack Engineer (AI Platform) - https://mitchallen.com

294 Followers  |  753 Following  |  191 Posts  |  Joined: 15.11.2024

Posts by Mitch Allen (@mitchallen.bsky.social)

NASA chief Jared Isaacman discusses major changes to Artemis program to get it "back on track"
YouTube video by CBS News

NASA finally gets back to iterating. #nasa #artemis

youtube.com/watch?v=8VwR...

01.03.2026 18:17 - 👍 0    🔁 0    💬 0    📌 0
Salesforces mistake
YouTube video by The PrimeTime

This is what happens when CEOs buy the hype.

youtube.com/shorts/tBWen...

14.02.2026 10:31 - 👍 0    🔁 0    💬 0    📌 0
GitHub - openai/codex: Lightweight coding agent that runs in your terminal

Claude Code, with its stingy token allocation, put me on a timeout again. That frees me up to consider OpenAI Codex as an alternative.

#codex #claude #anthropic #openai

github.com/openai/codex

13.02.2026 13:23 - 👍 0    🔁 0    💬 0    📌 0
Nearly 21K new AI agents go live on Ethereum, BNB Chain, and Solana - Cryptopolitan
AI agents are proliferating under the new ERC-8004 frameworks. This new type of agent can work in a wider environment, while being vetted by reputation, through ZK proofs, or other predetermined condi...

Nearly 21K new AI agents go live on Ethereum, BNB Chain, and Solana

www.cryptopolitan.com/nearly-21k-n...

13.02.2026 11:16 - 👍 0    🔁 0    💬 0    📌 0
Who will acquire OpenClaw? - OpenAI and Meta make big offers | Peter Steinberger and Lex Fridman
YouTube video by Lex Clips

The guy just wants to nerd out: Peter Steinberger is the creator of OpenClaw, an open-source AI agent framework that's the fastest-growing project in GitHub history.

youtube.com/watch?v=NMBo...

13.02.2026 03:41 - 👍 0    🔁 0    💬 0    📌 0
How OpenClaw Works: The Architecture Behind the 'Magic'
YouTube video by Damian Galarza

Here is a breakdown of how OpenClaw works.

youtube.com/watch?v=CAbr...

12.02.2026 03:32 - 👍 0    🔁 0    💬 0    📌 0
A large comparison table showing benchmark performance across five model families, with columns labeled at the top: “Opus 4.6,” “Opus 4.5,” “Sonnet 4.5,” “Gemini 3 Pro,” and “GPT-5.2 (all models).” The Opus 4.6 column is visually highlighted with a light shaded background and rounded border.

Rows list tasks and benchmarks on the left, with percentages or scores across models:

“Agentic terminal coding (Terminal-Bench 2.0)”:
Opus 4.6: 65.4%
Opus 4.5: 59.8%
Sonnet 4.5: 51.0%
Gemini 3 Pro: 56.2% (54.2% self-reported)
GPT-5.2: 64.7% (64% self-reported, Codex CLI)

“Agentic coding (SWE-bench Verified)”:
Opus 4.6: 80.8%
Opus 4.5: 80.9%
Sonnet 4.5: 77.2%
Gemini 3 Pro: 76.2%
GPT-5.2: 80.0%

“Agentic computer use (OSWorld)”:
Opus 4.6: 72.7%
Opus 4.5: 66.3%
Sonnet 4.5: 61.4%
Gemini 3 Pro: -
GPT-5.2: -

“Agentic tool use (t2-bench)”:
Retail: Opus 4.6 91.9%, Opus 4.5 88.9%, Sonnet 4.5 86.2%, Gemini 3 Pro 85.3%, GPT-5.2 82.0%
Telecom: Opus 4.6 99.3%, Opus 4.5 98.2%, Sonnet 4.5 98.0%, Gemini 3 Pro 98.0%, GPT-5.2 98.7%

“Scaled tool use (MCP Atlas)”:
Opus 4.6: 59.5%
Opus 4.5: 62.3%
Sonnet 4.5: 43.8%
Gemini 3 Pro: 54.1%
GPT-5.2: 60.6%

“Agentic search (BrowseComp)”:
Opus 4.6: 84.0%
Opus 4.5: 67.8%
Sonnet 4.5: 43.9%
Gemini 3 Pro: 59.2% (Deep Research)
GPT-5.2: 77.9% (Pro)

“Multidisciplinary reasoning (Humanity’s Last Exam)”:
Without tools: Opus 4.6 40.0%, Opus 4.5 30.8%, Sonnet 4.5 17.7%, Gemini 3 Pro 37.5%, GPT-5.2 36.6%
With tools: Opus 4.6 53.1%, Opus 4.5 43.4%, Sonnet 4.5 33.6%, Gemini 3 Pro 45.8%, GPT-5.2 50.0%

“Agentic financial analysis (Finance Agent)”:
Opus 4.6: 60.7%
Opus 4.5: 55.9%
Sonnet 4.5: 54.2%
Gemini 3 Pro: 44.1%
GPT-5.2: 56.6% (5.1)

“Office tasks (GDPVal-AA Elo)”:
Opus 4.6: 1606
Opus 4.5: 1416
Sonnet 4.5: 1277
Gemini 3 Pro: 1195
GPT-5.2: 1462

“Novel problem-solving (ARC AGI 2)”:
Opus 4.6: 68.8%
Opus 4.5: 37.6%
Sonnet 4.5: 13.6%
Gemini 3 Pro: 45.1% (Deep Thinking)
GPT-5.2: 54.2% (Pro)

“Graduate-level reasoning (GPQA Diamond)”:
Opus 4.6: 91.3%
Opus 4.5: 87.0%
S…

Opus 4.6 is here!

biggest wins on agentic search, HLE & ARC AGI 2

claude.com/blog/opus-4-...

05.02.2026 18:02 - 👍 88    🔁 7    💬 5    📌 3
‘Get me out’: Traders dump software stocks as AI fears erupt
“We call it the ‘SaaSpocalypse,’ an apocalypse for software-as-a-service stocks,” said Jeffrey Favuzza, who works on the equity trading desk at Jefferies. Selling pressure was evident across the sect...

finance.yahoo.com/news/traders...

#SaaSpocalypse #SaaS #Anthropic #Claude

03.02.2026 23:39 - 👍 1    🔁 0    💬 0    📌 0
The Moltbook Experiment Failed
YouTube video by The PrimeTime

youtube.com/watch?v=6OXE...

#moltbook #openclaw #moltbot #clawdbot

03.02.2026 15:01 - 👍 0    🔁 0    💬 0    📌 0
Post image

Oh here we go again. Apple torpedoed one of my apps again and never bothered to tell me why. I was thinking of closing my account since I have no time for updates anyway.

#apple #indiedev

29.01.2026 03:12 - 👍 2    🔁 0    💬 0    📌 0
Hundreds of Clawdbot instances were exposed on the internet. Here’s how to not be one of them
A follow-up guide covering the security risks, best practices, and hardening steps for running an AI assistant with access to your personal…

Before installing ClawdBot, make sure you’ve mitigated the risks. One way to shore things up is to use Slack or Signal for messaging. I wish all the automation gurus would stop recommending Telegram.

#clawdbot #agentic #telegram #ollama #security #automation

jpcaparas.medium.com/hundreds-of-...

27.01.2026 11:34 - 👍 0    🔁 0    💬 0    📌 0
Skool: Sign up
Create your Skool account. It's free!

Skool has a $9/month hobby plan for building online communities. Since I’m a “web guy,” I get asked how to do something like that all the time. If you want to build a site where you can host courses, discussions, etc., you can check out my affiliate link below. #skool

www.skool.com/signup?ref=5...

24.01.2026 04:25 - 👍 0    🔁 0    💬 0    📌 0

I’m trying to use Gemini CLI at work under their enterprise license. Every prompt is met with “Trying to reach gemini-2.5-pro (Attempt X/10)”

Then sometimes it reaches 10 and I have to tell it to try the flash version.

At least I get paid by the hour to stare at my screen.

#gemini

22.01.2026 19:52 - 👍 0    🔁 0    💬 0    📌 0
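
The retry-then-fall-back behaviour described in the post above can be sketched in a few lines. This is only an illustration, not how Gemini CLI is implemented: `call_with_fallback`, `ModelUnavailable`, and `flaky_send` are hypothetical names, and `gemini-2.5-flash` is an assumed model name (the post only says "the flash version").

```python
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Raised by a (hypothetical) client when a model cannot be reached."""


def call_with_fallback(
    prompt: str,
    models: Sequence[str],
    send: Callable[[str, str], str],
    max_attempts: int = 10,
) -> tuple[str, str]:
    """Try each model in order, up to max_attempts times each.

    Returns (model_name, reply) from the first call that succeeds.
    """
    for model in models:
        for attempt in range(1, max_attempts + 1):
            try:
                return model, send(model, prompt)
            except ModelUnavailable:
                print(f"{model} unavailable (attempt {attempt}/{max_attempts})")
    raise ModelUnavailable(f"no model responded: {', '.join(models)}")


def flaky_send(model: str, prompt: str) -> str:
    """Stand-in transport: the pro model never answers, flash always does."""
    if model == "gemini-2.5-pro":
        raise ModelUnavailable(model)
    return f"[{model}] echo: {prompt}"


# Assumed model names; max_attempts is lowered to keep the demo short.
used_model, reply = call_with_fallback(
    "hello", ["gemini-2.5-pro", "gemini-2.5-flash"], flaky_send, max_attempts=3
)
```

A real client would probably wait between attempts (for example with exponential backoff) rather than retrying immediately; that is left out here to keep the sketch short.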
Claude devs complain about surprise usage limits : Holiday hangover?

Anthropic has been alienating a lot of developers lately, myself included, with its token limits and its daily and even weekly timeouts.

#anthropic #claudecode #claude #opencode #aidev

www.theregister.com/2026/01/05/c...

15.01.2026 14:51 - 👍 1    🔁 0    💬 0    📌 0

With 3 agents running in the background I always hit my Claude Code token limit in 15 minutes.

This is unsustainable.

I’ll be building up my home lab this year to move Ollama to a beefy server and shift my work there.

Then see what I can do until open source catches up.

#ollama #claudecode

09.01.2026 13:24 - 👍 1    🔁 0    💬 0    📌 0
From Llamas to Avocados: Meta's shifting AI strategy is causing internal confusion
Meta’s push to develop its next frontier model, codenamed Avocado, under new AI leadership is creating internal friction as it races rivals OpenAI and Google.

Meta bails on open source AI.

#meta #llama #avocado #opensource #closedsource

www.cnbc.com/amp/2025/12/...

14.12.2025 11:21 - 👍 3    🔁 0    💬 0    📌 0

Of course there is the danger that the review itself is a hallucination. If I were really worried I'd have another agent do a second review.

09.12.2025 04:20 - 👍 0    🔁 0    💬 0    📌 0
Post image

To avoid posting quotes that were hallucinated by one AI agent, I've built another to review each quote and assign a status, along with an explanation. In this case, the agent found that the original attribution was misidentified.

#aiagents #workflow #verification #citation #n8n #anthropic #gemini

09.12.2025 04:20 - 👍 1    🔁 0    💬 1    📌 0
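
One way to picture the review step in the post above is a record that carries a status and an explanation for each quote. The sketch below is hypothetical throughout: the post's reviewer is an AI agent in an n8n workflow, whereas this stand-in simply checks a small reference table, and the names `Status`, `Review`, `review_quote`, and the placeholder quotes and authors are all made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    VERIFIED = "verified"
    MISATTRIBUTED = "misattributed"
    UNVERIFIED = "unverified"


@dataclass
class Review:
    quote: str
    claimed_author: str
    status: Status
    explanation: str


def review_quote(quote: str, claimed_author: str, known: dict[str, str]) -> Review:
    """Assign a status and an explanation by checking a reference table.

    `known` maps quote text to its sourced author (a stand-in for the
    reviewing agent's own lookup).
    """
    actual = known.get(quote)
    if actual is None:
        return Review(quote, claimed_author, Status.UNVERIFIED,
                      "No source found for this quote.")
    if actual != claimed_author:
        return Review(quote, claimed_author, Status.MISATTRIBUTED,
                      f"Source attributes this quote to {actual}, not {claimed_author}.")
    return Review(quote, claimed_author, Status.VERIFIED,
                  f"Source confirms {actual} as the author.")


# Placeholder reference data, not real quotes.
KNOWN = {"Example quote one.": "Author A"}

review = review_quote("Example quote one.", "Author B", KNOWN)
publishable = review.status is Status.VERIFIED
```

Only quotes that come back as `VERIFIED` would be posted; anything else is held back together with its explanation so a person (or a second reviewer) can look at it.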
Photo of a magic 8 ball with a 'goatse' version of the OpenAI logo and the words ChatGPT on it, next to a white box that says "ChatGPT offline version". Inset are photos of the 8-ball responses such as "That's a great question", "You're 1000% right" and "Too many requests try again later" and then another photo from the back of the box of a holographic authenticity sticker that says FAKE.

After much research and development I have made an offline version of ChatGPT.

Now you can save water and electricity while navel-gazing, and carry one of the world's most powerfully annoying AI chatbots in your pocket.

05.12.2025 16:02 - 👍 3162    🔁 1169    💬 37    📌 41
Post image

An example of using an AI agent fallback model in n8n. I've exceeded my limit using Gemini, so the model falls back to Anthropic (Claude).

#n8n #gemini #anthropic #claude #slackbot #workflow #automation

07.12.2025 12:19 - 👍 2    🔁 0    💬 0    📌 0
Post image

My Gemini-driven Slack engineering quote bot experiment has a preoccupation with people named Vance. I’ll have to add quality control to the workflow or prompt.

#gemini #slack #n8n #automation #workflow #slackbot

06.12.2025 15:27 - 👍 1    🔁 0    💬 0    📌 0
Video thumbnail

#midnight #catvideos #funny #humor #genai #cat

21.11.2025 04:05 - 👍 0    🔁 0    💬 0    📌 0
Video thumbnail

I generated a quick SQA explainer video clip using Veo. Even with an open-ended prompt, the holistic view came through.

#veo #sqa #genai #explainer #sdet

09.11.2025 02:59 - 👍 0    🔁 0    💬 0    📌 0
Video thumbnail

Experimenting with Veo 3.1. The problem is that an open-ended prompt - "explain what an AI Engineer does" - left the door open to an over-the-top explanation. Or this is the first sign of AI's planned takeover.

#veo3 #gemini #vertexai #videogen #genai

06.11.2025 04:19 - 👍 0    🔁 0    💬 0    📌 0
Broken Peach - Enter Sandman (Halloween Special)
YouTube video by Broken Peach

youtube.com/watch?v=MI8q...

#brokenpeach #halloween #rock

17.10.2025 10:24 - 👍 1    🔁 0    💬 0    📌 0
SoftBank to buy ABB's robotics business in $5.4 billion deal
ZURICH (Reuters) - SoftBank Group said on Wednesday it has agreed to buy the robotics business of Swiss engineering group ABB in a $5.4 billion deal. The deal marks a major push by SoftBank founder an...

SoftBank to buy ABB's robotics business in $5.4 billion deal

finance.yahoo.com/news/abb-sel...

#SoftBank #ABB #robotics #automation #tech #investment

08.10.2025 06:53 - 👍 3    🔁 1    💬 0    📌 0
The Rise of the AI Engineer
How AI is reshaping the role of the modern software developer

substack.com/@mitchallen/...

Today’s developers don’t just write programs. We design systems that think, reason, and extend themselves.

#aiengineer #ai #mcp #indiedev

07.10.2025 12:19 - 👍 4    🔁 0    💬 0    📌 0
AI INTERVIEWS ARE HERE!! SO I TROLLED ONE TO SHOW YOU...
YouTube video by Joshua Fluke

youtube.com/watch?v=Ng_B...

#ai #aiinterviews #chatgpt

28.09.2025 17:50 - 👍 1    🔁 0    💬 0    📌 0
Post image Post image

Using Gemini Nano Banana 🍌 to extract an image from one of my photos.

#wasp #paperwasp #gemini #nanobanana

16.09.2025 08:09 - 👍 2    🔁 0    💬 0    📌 0
After AI Led to Layoffs, Coders Are Being Hired to Fix 'Vibe-Coded' Screwups
Fire human, use AI, fire AI, hire human.

gizmodo.com/after-ai-led... #vibecoding

14.09.2025 11:16 - 👍 0    🔁 0    💬 0    📌 0