Mathieu Acher

@macher.bsky.social

Chess-loving professor and researcher who champions the integration of software engineering and AI for reproducible science. Diving deep into software variability spaces, from Airbus to Linux. @rennesuniv.bsky.social #INSA #IUF @InstUnivFr @Inria #IRISA

80 Followers  |  214 Following  |  45 Posts  |  Joined: 22.01.2025

Posts by Mathieu Acher (@macher.bsky.social)

TeXCCChess: How Coding Agents Wrote a Chess Engine in Pure TeX What happens when you ask a 2026 coding agent like Claude Code to build a chess engine from scratch (with no plan, no architecture document, no step-by-step guidance) in a language that was never desi...

Blog post: blog.mathieuacher.com/TeXCCChessEn...
Github: github.com/acherm/agent...
Overleaf: overleaf.com/docs?snip_ur...

I have co-developed other engines in Rust, C++, COBOL, Rocq, Lean, and Brainfuck. And others!
I will present each of them over the coming days: blog.mathieuacher.com/FromScratchC...

27.02.2026 13:44 — 👍 0    🔁 0    💬 0    📌 0

I asked a coding agent to build a chess engine from scratch in… LaTeX. Incredible: ~2100 lines of TeX, negamax (depth ~3), ~1280 Elo. A real software tour de force: board/state in macros+registers, logic via expansions. Never been done before Overleaf+github+blog⤵️ #TeXCCChess
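
For readers unfamiliar with negamax, here is a minimal Python sketch of depth-limited negamax on a toy take-away game (take 1 to 3 stones, whoever takes the last stone wins). This is not chess and not the TeX engine's code: the game, `legal_moves`, and `best_move` are illustrative assumptions, and only the negamax idea itself comes from the post.

```python
from typing import List


def legal_moves(stones: int) -> List[int]:
    """Hypothetical toy game: a move removes 1, 2 or 3 stones."""
    return [take for take in (1, 2, 3) if take <= stones]


def negamax(stones: int, depth: int) -> int:
    """Score the position from the point of view of the side to move.

    +1 means a win for the side to move, -1 a loss, 0 unknown (horizon reached).
    """
    if stones == 0:
        return -1  # the opponent took the last stone, so the side to move lost
    if depth == 0:
        return 0   # search horizon: fall back to a neutral static evaluation
    best = -2      # lower than any reachable score
    for take in legal_moves(stones):
        # The opponent's best score, negated, is our score for this move.
        best = max(best, -negamax(stones - take, depth - 1))
    return best


def best_move(stones: int, depth: int) -> int:
    """Pick the move whose negated child score is highest."""
    return max(legal_moves(stones),
               key=lambda take: -negamax(stones - take, depth - 1))
```

With a depth of 3, as in the post, `best_move(5, 3)` finds the winning move from 5 stones; a chess engine applies the same recursion with a real move generator and a material/positional evaluation at the horizon.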

27.02.2026 13:44 — 👍 3    🔁 1    💬 1    📌 0
The Year 2038 bug: On 19 January 2038 at 03:14:07 UTC, a bug worse than the Y2K one threatens to crash billions of devices around the world: trains, MRI scanners, implants, telecoms, ATMs… The...

An article worth reading in Epsiloon: The Year 2038 bug www.epsiloon.com/tous-les-num...
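
A small Python sketch of where that exact date comes from, assuming the usual explanation: time stored as a signed 32-bit count of seconds since 1 January 1970 UTC. The helper names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit counter can hold


def to_utc(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


def wrap_int32(value: int) -> int:
    """Simulate two's-complement wraparound of a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


last_second = to_utc(INT32_MAX)                   # last representable instant
one_tick_later = to_utc(wrap_int32(INT32_MAX + 1))  # counter wraps to -2**31

print(last_second.isoformat())     # 2038-01-19T03:14:07+00:00
print(one_tick_later.isoformat())  # 1901-12-13T20:45:52+00:00
```

One second after 03:14:07, the wrapped counter lands in December 1901, which is why affected systems can misbehave rather than simply roll over.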

25.02.2026 17:47 — 👍 2    🔁 0    💬 0    📌 0
La Parole aux Machines (Philosophie des grands modèles de langage) by Monsieur Phi: “La Parole aux Machines” is a book of public utility. It helps readers understand generative artificial intelligence and large language models (aka LLMs) by leading the reader toward...

"This book hits the right level of detail and abstraction, far from the mediocrity of the ignorant and lazy who swarm the media space. It is also a very well-written book"

A fascinating review of M. Phi's book by @macher.bsky.social ⤵️
blog.mathieuacher.com/LaParoleAuxM...

24.11.2025 15:22 — 👍 2    🔁 1    💬 0    📌 0
Parrot (ChatGPT) vs Fish (Stockfish)... Who wins? Testing AIs through chess
YouTube video by macherm

🦜 vs 🐟 on the chessboard: ChatGPT or Stockfish, who wins and why?
My talk at the Fête de la Science @INSA_Rennes is available on YouTube, with some excellent questions from students:
youtu.be/TtGiT-tWdmE
It's playful but technical, accessible to the curious and to connoisseurs alike.
#ChessEveryWhere

14.10.2025 17:01 — 👍 0    🔁 0    💬 0    📌 0
Parrot (ChatGPT) vs fish (Stockfish): Like scientists who observe animals, we will study two artificial intelligences by having them play chess. ChatGPT, the "parrot", and Stockfish, the "fish", each have...

Parrot (ChatGPT) vs Fish (Stockfish)
Who wins? Testing artificial intelligence through chess.

Join me at the Village des sciences at INSA Rennes on Thursday, 9 October
More information here: www.fetedelascience.fr/perroquet-ch...

03.10.2025 05:20 — 👍 1    🔁 0    💬 0    📌 0

I presented "Teaching Reproducibility and Embracing Variability: From Floating-Point Experiments to Replicating Research" at the ACM REP 2025 conference.
Blog post with links to preprint, slides, and raw transcript: blog.mathieuacher.com/TeachingRepr...

31.07.2025 09:30 — 👍 0    🔁 0    💬 0    📌 0
General-Purpose AI in the Endgame: The Chess Limitations of o3/o4-mini o3 and o4-mini are large language models recently released by OpenAI and augmented with chain-of-thought reinforcement learning, designed to “think before they speak” by generating explicit, multi-st...

Blog post: blog.mathieuacher.com/GPTReasoning...
Code: github.com/acherm/gptch...
with deeper insights, such as:
* o3 can sometimes synthesize code to play chess, but fails
* o3-high seems to be a special beast, but it is unreliable (an illegal move may occur after 10 moves) and costs $15 per game!

26.06.2025 15:31 — 👍 1    🔁 0    💬 0    📌 0

The latest generation of reasoning LLMs performs worse at #Chess than previous models. o3 & o4‑mini vs weak Stockfish: illegal moves in 88% & 94% of 67 games. o3 breaks rules in 4 moves; both resigned while winning. Worse than GPT‑3.5‑turbo‑instruct (1750 Elo)

26.06.2025 15:31 — 👍 1    🔁 0    💬 1    📌 0

#KubeDiagrams just crossed 1K stars on GitHub. It generates Kubernetes architecture diagrams from Kubernetes YAML files (among others). Developed by Philippe Merle from the Inria Spirals team.

09.06.2025 11:11 — 👍 1    🔁 0    💬 0    📌 0

One new element in the #Devoxx video concerns this strange behavior of gpt-3.5-turbo-instruct. Let's see whether it can be reproduced ;) Closely related to another series of experiments where I showed how to win systematically in 4 or 7 moves blog.mathieuacher.com/ChessWinning... 3/3

10.05.2025 21:34 — 👍 0    🔁 0    💬 0    📌 0

The two videos on YouTube:
- #Devoxx www.youtube.com/watch?v=bO96...
- the original video www.youtube.com/watch?v=6D1X..., which is longer and has time to explain (among other things) my experiments
blog.mathieuacher.com/GPTsChessElo... 2/3

10.05.2025 21:34 — 👍 0    🔁 0    💬 1    📌 0

Les LLM rêvent-ils de cavaliers électriques ? (Do LLMs Dream of Electric Knights?) - Thibaut Giraud @monsieurphi.bsky.social at @devoxx.fr
A variant of the excellent "ChatGPT rêve-t-il de cavaliers électriques ?" with a few new elements.
Some pointers to dig deeper into the #echecs+LLM topic in the thread 1/3

10.05.2025 21:34 — 👍 2    🔁 1    💬 1    📌 0

We continue with Monsieur Phi: “Les LLM rêvent-ils de cavaliers électriques ?”

18.04.2025 07:39 — 👍 3    🔁 1    💬 0    📌 0

Interesting comment on my blog post about Stockfish and our study disq.us/p/32uips9

09.05.2025 14:55 — 👍 0    🔁 0    💬 0    📌 0
President Macron Highlights Software Heritage at the Sorbonne: A Call for Europe to Embrace its Role in a Global Mission Europe's latest science agenda just featured Software Heritage. During the 'Choose Europe for Science' launch at the Sorbonne, President Emmanuel Macron cited our global archive: He delivered a powerf...

President Macron Highlights Software Heritage at the Sorbonne: A Call for Europe to Embrace its Role in a Global Mission www.linkedin.com/pulse/presid...
Closely related to CodeCommons (codecommons.org), which aims to provide open, responsible, and transparent AI on top of Software Heritage. Let's go!

07.05.2025 18:08 — 👍 1    🔁 0    💬 0    📌 0
Post image

A real position from an online #Chess960 game I just played. Is it a draw? -0.3 according to Stockfish, but no clear plan. Chess engines are notoriously bad at resolving/assessing fortress-like positions. But is this such a case? What do you think? #ChessEveryWhere

30.04.2025 12:18 — 👍 0    🔁 0    💬 0    📌 0
Creative Collaboration with AI: Insights from Hugo Duminil-Copin on Mathematics and Discovery I recently watched a great interview of the mathematician and Fields medalist (2022) Hugo Duminil-Copin by Science étonnante (aka David Louapre). At some point, there was an interesting discussion on ...

Blog post: blog.mathieuacher.com/AIScientific...
Video:
www.youtube.com/watch?v=N_3F...

17.04.2025 14:23 — 👍 1    🔁 0    💬 0    📌 0

Just watched a great interview with Fields medalist Hugo Duminil-Copin by @scienceetonnante.com. At some point, there's a discussion on the role of AI in discovery. Hugo sees AI as a partner -- amplifying our approximations, filling gaps, sparking ideas. Blog post+transcript below 1/2🧵

17.04.2025 14:23 — 👍 1    🔁 0    💬 1    📌 0

Nice talk by Pierre L'Ecuyer on (parallel) random number generation for the 50th anniversary of the @irisa-lab.bsky.social lab last week. Slides are here: www-labs.iro.umontreal.ca/~lecuyer/myf...

14.04.2025 11:23 — 👍 0    🔁 0    💬 0    📌 0

I like the simple examples given throughout the talk that give an intuition of the complexity problems. The kinds of issues mentioned are not necessarily new, but are very well articulated.

02.04.2025 08:56 — 👍 0    🔁 0    💬 0    📌 0
Why Can't We Make Simple Software? - Peter van Hardenberg
YouTube video by Handmade Cities

Why Can't We Make Simple Software? Great talk by Peter van Hardenberg about complexity in software engineering (robustness and generalization to inputs, the effects of scale, leaky abstractions, variability and combinatorial explosion, dependency hell, etc.)
www.youtube.com/watch?v=czzA...

02.04.2025 08:56 — 👍 2    🔁 0    💬 1    📌 0
Inria, 18 March 2025: "How can we foster gender diversity in digital professions?"
YouTube video by DiverSE Team

How can we foster gender diversity in digital professions?
Excellent talk by Mélissa Cottin, director of the ESTIMnumérique association
youtu.be/w5vzyoH7JM0

24.03.2025 12:53 — 👍 0    🔁 0    💬 0    📌 0
Re-evaluating Metamorphic Testing of Chess Engines: A Replication Study Context: This study aims to confirm, replicate and extend the findings of a previous article entitled “Metamorphic Testing of Chess Engines” that reported inconsistencies in the analyses provided by S...

Final thoughts?

✅ Reproducibility matters—always verify results.
✅ Replicability matters even more.
✅ Depth sensitivity and domain specificities are critical in SE.
✅ MT needs refinement.
Study:
hal.science/hal-04943474v2
(published in the IST journal)
Blog post: blog.mathieuacher.com/Reproducibil...

20.03.2025 10:41 — 👍 0    🔁 0    💬 0    📌 0

A call to refine, not dismiss.

MT is powerful & could work well for LLM-based chess engines. But for Stockfish, MRs must account for depth & move ordering.

20.03.2025 10:41 — 👍 0    🔁 0    💬 1    📌 0

Key takeaway?

🚨 The original study didn't parameterize metamorphic relations by depth!
Metamorphic testing (MT) needs depth-aware refinement—some violations at low depth have limited interest.
No impact on Stockfish despite alarming claims.

20.03.2025 10:41 — 👍 0    🔁 0    💬 1    📌 0

Can we fix this? Yes!

We found where this happens exactly in the code. Symmetry can be enforced, but… it adds overhead/complexity.

20.03.2025 10:41 — 👍 0    🔁 0    💬 1    📌 0

The culprit? Move ordering.

Stockfish orders legal moves differently depending on board symmetry. This affects search results at some depths.

❌ Not a bug, a feature of how the engine explores positions.

20.03.2025 10:41 — 👍 0    🔁 0    💬 1    📌 0
Post image

🔎 A Chess Mystery

These mirrored positions should have the same evaluation, but at depth=20:
📊 Left: +0.66
📊 Right: -2.17
This is not just a low-depth issue—it rings a bell.
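
A rough Python sketch of such a mirror relation, parameterized by depth. Assumptions to flag: the mirroring here is a left-right flip of the files (the study may use a different symmetry), `evaluate` is a hypothetical callable standing in for an engine query, the tolerance is arbitrary, and `TOY_FEN` is a placeholder rather than the position in the image. The stub simply replays the two scores quoted above.

```python
from typing import Callable


def mirror_files(fen: str) -> str:
    """Flip a position left-to-right (file a <-> h) by reversing each rank."""
    placement, side, castling, en_passant, halfmove, fullmove = fen.split()
    if castling != "-" or en_passant != "-":
        # Castling and en passant squares would need remapping; keep it simple.
        raise ValueError("sketch only handles positions without castling/en passant")
    mirrored = "/".join(rank[::-1] for rank in placement.split("/"))
    return " ".join([mirrored, side, castling, en_passant, halfmove, fullmove])


def violates_mirror_relation(
    fen: str,
    evaluate: Callable[[str, int], float],  # hypothetical: (fen, depth) -> score in pawns
    depth: int,
    tolerance: float = 0.5,                 # arbitrary threshold, an assumption
) -> bool:
    """True if the position and its mirror get scores further apart than tolerance."""
    return abs(evaluate(fen, depth) - evaluate(mirror_files(fen), depth)) > tolerance


TOY_FEN = "8/8/8/8/8/2k5/8/K6R w - - 0 1"  # placeholder position, not from the post

# Stub evaluator that replays the two scores reported at depth=20.
reported = {TOY_FEN: 0.66, mirror_files(TOY_FEN): -2.17}


def stub_evaluate(fen: str, depth: int) -> float:
    return reported[fen]


violated = violates_mirror_relation(TOY_FEN, stub_evaluate, depth=20)
```

Making `depth` an explicit argument is the point: the same relation can be checked at depth 10, 15, 20 and beyond, so violations that only appear at shallow depth can be told apart from those that persist.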

20.03.2025 10:41 — 👍 0    🔁 0    💬 1    📌 0

Reproducibility first!

We replicated the study & confirmed MR violations at depth=10. Then we tested:
✔️ Higher depths (15, 20, beyond)
✔️ Realistic positions
✔️ Different Stockfish versions

20.03.2025 10:41 — 👍 0    🔁 0    💬 1    📌 0