Michiel Bontenbal

@mpbontenbal.bsky.social

Complex beings like us humans cannot be summarised in a few lines, but I am here for #AI #climate #EUtech #EU_politics. Lecturer in AI & IT. Some posts in Dutch. 🇪🇺

1,832 Followers  |  437 Following  |  1,046 Posts  |  Joined: 30.11.2023  |  2.1625

Latest posts by mpbontenbal.bsky.social on Bluesky

accidentally caused Nano Banana's system prompt to leak, but in a fun way

06.10.2025 03:15 | 👍 84    🔁 11    💬 6    📌 4
Gemma 3n fully available in the open-source ecosystem! We're on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co/blog/gemma3n

06.10.2025 22:08 | 👍 0    🔁 0    💬 0    📌 0
Do AI Models Perform Human-like Abstract Reasoning Across Modalities? OpenAI's o3-preview reasoning model exceeded human accuracy on the ARC-AGI benchmark, but does that mean state-of-the-art models recognize and reason with the abstractions that the task creators inten...

Do AI reasoning models abstract and reason like humans?

New paper on this from my group:

arxiv.org/abs/2510.02125

🧵 1/10

06.10.2025 21:27 | 👍 77    🔁 22    💬 3    📌 1

Seems to me a great pelican, if not the best, but yeah 6 min is too much.

06.10.2025 20:06 | 👍 1    🔁 0    💬 0    📌 0

Performing great on the benchmark says little about model quality.

06.10.2025 19:53 | 👍 1    🔁 0    💬 1    📌 0

We all have one, but are too ashamed to admit it.

My excuse is 'We do it for the kids'.

06.10.2025 05:28 | 👍 2    🔁 0    💬 2    📌 0

Also read 'Active Measures' by Thomas Rid, about the KGB's 100+ years of experience.

05.10.2025 17:53 | 👍 0    🔁 0    💬 0    📌 0

Sorry, that was too quick. Including the tail it was 3 hours!

05.10.2025 15:01 | 👍 1    🔁 0    💬 0    📌 0

The demonstration passed my house. The red line demo lasted 2.5 hours!

05.10.2025 14:44 | 👍 2    🔁 0    💬 1    📌 0
AI Agents with MCP Since its release in late 2024, Anthropic's Model Context Protocol (MCP) has redefined how developers build and connect AI agents to tools, data, and each other. AI Agents with MCP... - Selection from AI Agents with MCP [Book]

and the addition of chapter 5, the first of 3 chapters covering the ins and outs of building MCP servers. If you subscribe to @oreilly.bsky.social's learning platform, you can spend your weekend with the book now here: learning.oreilly.com/library/vie...

03.10.2025 23:00 | 👍 5    🔁 2    💬 1    📌 0

I haven't posted about Model Specs in a while, but Dean gave me a shoutout on my earlier writing on them, so it's time to say definitively again that every frontier lab should have a model spec. It builds long-term trust with users, developers and regulators.

02.10.2025 16:03 | 👍 9    🔁 2    💬 1    📌 0

Yes, agreed, that would be a nice addition to this figure!

02.10.2025 10:59 | 👍 0    🔁 0    💬 0    📌 0

Every week there are new models, but to get further in AI as a community we need open-source models that also disclose the data they are trained on.

#learnAI

02.10.2025 07:42 | 👍 1    🔁 0    💬 1    📌 0

Given the 'strategic autonomy' debates here in 🇪🇺, I do not see a full takeover happening.

01.10.2025 21:40 | 👍 1    🔁 0    💬 0    📌 0

We're announcing a new update to MTEB: RTEB

It's a new multilingual text embedding retrieval benchmark with private (!) datasets, to ensure that we measure true generalization and avoid (accidental) overfitting.

Details in our blogpost below 🧵

01.10.2025 15:51 | 👍 4    🔁 3    💬 1    📌 1

OpenAI employees are very excited about how well their new AI tool can create fake videos of people doing crimes and have definitely thought through all the implications of this

30.09.2025 23:24 | 👍 10804    🔁 3296    💬 220    📌 596
YouTube video by Anthropic: An experimental new way to design software

Imagine with Claude

generative UI? like for real? as the user clicks buttons the UI manifests itself 🤯

youtu.be/dGiqrsv530Y

29.09.2025 18:10 | 👍 12    🔁 1    💬 2    📌 3

if my coding agent is the example... then my commerce agent will buy all sorts of stuff that I do not need... ;-)

29.09.2025 18:34 | 👍 3    🔁 0    💬 0    📌 0
Claude Sonnet 4.5 is probably the "best coding model in the world" (at least for now) Anthropic released Claude Sonnet 4.5 today, with a very bold set of claims: Claude Sonnet 4.5 is the best coding model in the world. It's the strongest model for building …

Wrote up my initial impressions of the brand new Claude Sonnet 4.5 - I think it may live up to Anthropic's claims of being the "best coding model in the world", for the next few weeks at least!
simonwillison.net/2025/Sep/29/...

29.09.2025 18:13 | 👍 161    🔁 27    💬 9    📌 2

he will not leave. not by himself.

29.09.2025 13:15 | 👍 0    🔁 0    💬 0    📌 0

In a future of AGI, LLMs have their role. It might be an important role, but LLMs alone are not enough. (The same holds for self-driving cars: CNNs or ViTs alone are not enough, but they are one part of the solution.)

29.09.2025 07:14 | 👍 0    🔁 0    💬 0    📌 0

And the Raad van State (Council of State) is not political, so it cannot defend itself. She knows she can make fact-free claims.

29.09.2025 06:05 | 👍 2    🔁 0    💬 0    📌 0

Article by @marchijink.bsky.social.

29.09.2025 05:35 | 👍 0    🔁 0    💬 0    📌 0
Bigger is not always better in the world of AI. OpenAI is throwing gigawatts around and scattering trillions to make the next version of ChatGPT even smarter. But bigger is not always better.

Bigger is not always better in the world of AI www.nrc.nl/nieuws/2025/... (10x free). With @maartengr.bsky.social

29.09.2025 05:23 | 👍 0    🔁 0    💬 1    📌 0

Want to visualize the response format constraints on the LLM when working in a Jupyter notebook?
Then you might be interested in my new project `litelines`.
Litelines lets you visualize the path selected by the LLM.
It supports a Pydantic schema as a response format, as well as regular expressions.

16.09.2025 07:18 | 👍 8    🔁 4    💬 1    📌 2

I believe Sutton is right. There is a lot of value in LLMs and many other ML architectures, but they fall short of true intelligence.

28.09.2025 15:46 | 👍 2    🔁 0    💬 1    📌 0

Nice exercise for my IT students: draw up requirements for the VAT calculation of 1) a single loose apple, 2) a prepackaged salad that also contains chicken and croutons, 3) a sandwich with grilled vegetables. Good luck!

28.09.2025 07:54 | 👍 0    🔁 0    💬 0    📌 0

🇪🇺🇪🇺🇪🇺

27.09.2025 22:12 | 👍 1    🔁 0    💬 0    📌 0

And new paper out: Pleias 1.0: the First Family of Language Models Trained on Fully Open Data

How we train an open everything model on a new pretraining environment with releasable data (Common Corpus) with an open source framework (Nanotron from HuggingFace).

www.sciencedirect.com/science/arti...

27.09.2025 11:44 | 👍 171    🔁 51    💬 8    📌 8

@berthub.eu

27.09.2025 12:15 | 👍 1    🔁 0    💬 0    📌 0
