I asked a coding agent to build a chess engine from scratch in… LaTeX. Incredible: ~2100 lines of TeX, negamax (depth ~3), ~1280 Elo. A real software tour de force: board/state in macros+registers, logic via expansions. Never been done before. Overleaf + GitHub + blog ⤵️ #TeXCCChess
27.02.2026 13:44 —
👍 3
🔁 1
💬 1
📌 0
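For readers unfamiliar with negamax, here is a minimal sketch of a depth-limited search in Python. It is not the TeX engine's code: the game is a toy (a pile of stones, take 1 to 3, whoever takes the last stone wins), and the names `legal_moves`, `negamax`, and `best_move` are made up for illustration. The only thing it shares with the post is the idea of a fixed search horizon (depth 3 by default).

```python
from typing import List


def legal_moves(stones: int) -> List[int]:
    """Toy game (assumption): remove 1, 2 or 3 stones from the pile."""
    return [n for n in (1, 2, 3) if n <= stones]


def negamax(stones: int, depth: int) -> int:
    """Score of the position from the point of view of the side to move."""
    if stones == 0:
        # The previous player took the last stone, so the side to move has lost.
        return -1
    if depth == 0:
        # Search horizon reached: fall back to a neutral static evaluation.
        return 0
    # A position is worth the best of the negated scores of its successors.
    return max(-negamax(stones - m, depth - 1) for m in legal_moves(stones))


def best_move(stones: int, depth: int = 3) -> int:
    """Pick the move whose resulting position is worst for the opponent."""
    return max(legal_moves(stones), key=lambda m: -negamax(stones - m, depth - 1))
```

A real chess engine replaces the neutral value at the horizon with a static evaluation of the board, which is where most of the playing strength at shallow depth comes from.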
YouTube video by macherm
Perroquet (ChatGPT) vs Poisson (Stockfish)... Qui gagne ? Tester les IA par les échecs (Parrot (ChatGPT) vs Fish (Stockfish)... Who wins? Testing AIs through chess)
🦜 vs 🐟 on the chessboard: ChatGPT or Stockfish, who wins and why?
My talk at the Fête de la Science @INSA_Rennes is available on YouTube, with some excellent questions from students:
youtu.be/TtGiT-tWdmE
It's playful but technical, accessible to the curious and to connoisseurs.
#ChessEveryWhere
14.10.2025 17:01 —
👍 0
🔁 0
💬 0
📌 0
I presented "Teaching Reproducibility and Embracing Variability: From Floating-Point Experiments to Replicating Research" at the ACM REP 2025 conference.
Blog post with links to preprint, slides, and raw transcript: blog.mathieuacher.com/TeachingRepr...
31.07.2025 09:30 —
👍 0
🔁 0
💬 0
📌 0
The latest generation of reasoning LLMs performs worse at #Chess than previous models. o3 & o4‑mini vs weak Stockfish: illegal moves in 88% & 94% of 67 games. o3 breaks rules in 4 moves; both resigned while winning. Worse than GPT‑3.5‑turbo‑instruct (1750 Elo).
26.06.2025 15:31 —
👍 1
🔁 0
💬 1
📌 0
#KubeDiagrams just crossed 1K stars on GitHub. It generates Kubernetes architecture diagrams from Kubernetes YAML files (among others). Developed by Philippe Merle from the Inria Spirals team.
09.06.2025 11:11 —
👍 1
🔁 0
💬 0
📌 0
One new element in the #Devoxx video concerns this strange behavior of gpt-3.5-turbo-instruct. It remains to be seen whether it can be reproduced ;) It is closely related to another series of experiments in which I showed how to win systematically in 4 or 7 moves blog.mathieuacher.com/ChessWinning... 3/3
10.05.2025 21:34 —
👍 0
🔁 0
💬 0
📌 0
The two videos on YouTube:
- #Devoxx www.youtube.com/watch?v=bO96...
- the original video www.youtube.com/watch?v=6D1X... which is longer and has time to explain (among other things) my experiments
blog.mathieuacher.com/GPTsChessElo... 2/3
10.05.2025 21:34 —
👍 0
🔁 0
💬 1
📌 0
"Les LLM rêvent-ils de cavaliers électriques ?" (Do LLMs dream of electric knights?) - Thibaut Giraud @monsieurphi.bsky.social at @devoxx.fr
A variant of the excellent "ChatGPT rêve-t-il de cavaliers électriques ?" (Does ChatGPT dream of electric knights?) with a few new elements.
A few pointers to dig deeper into the #echecs+LLM (chess + LLM) topic in the thread 1/3
10.05.2025 21:34 —
👍 2
🔁 1
💬 1
📌 0
We continue with Monsieur Phi: "Les LLM rêvent-ils de cavaliers électriques ?" (Do LLMs dream of electric knights?)
18.04.2025 07:39 —
👍 3
🔁 1
💬 0
📌 0
Interesting comment on my blog post about Stockfish and our study disq.us/p/32uips9
09.05.2025 14:55 —
👍 0
🔁 0
💬 0
📌 0
A real position from an online #Chess960 game I just played. Is it a draw? Stockfish says -0.3, but there is no clear plan. Chess engines are notoriously bad at assessing fortress-like positions. But is this such a case? What do you think? #ChessEveryWhere
30.04.2025 12:18 —
👍 0
🔁 0
💬 0
📌 0
Just watched a great interview with Fields medalist Hugo Duminil-Copin by @scienceetonnante.com. At some point, there's a discussion on the role of AI in discovery. Hugo sees AI as a partner -- amplifying our approximations, filling gaps, sparking ideas. Blog post+transcript below 1/2🧵
17.04.2025 14:23 —
👍 1
🔁 0
💬 1
📌 0
Nice talk by Pierre L'Ecuyer about (parallel) random number generation for the 50th anniversary of the @irisa-lab.bsky.social lab last week. Slides are here: www-labs.iro.umontreal.ca/~lecuyer/myf...
14.04.2025 11:23 —
👍 0
🔁 0
💬 0
📌 0
I like the simple examples given throughout the talk that give an intuition of the complexity problems. The kinds of issues mentioned are not necessarily new, but are very well articulated.
02.04.2025 08:56 —
👍 0
🔁 0
💬 0
📌 0
YouTube video by Handmade Cities
Why Can't We Make Simple Software? - Peter van Hardenberg
Why Can't We Make Simple Software? Great talk by Peter van Hardenberg about complexity in software engineering (robustness and generalization to inputs, the effects of scale, leaky abstractions, variability and combinatorial explosion, dependency hell, etc.)
www.youtube.com/watch?v=czzA...
02.04.2025 08:56 —
👍 2
🔁 0
💬 1
📌 0
YouTube video by DiverSE Team
Inria, 18 March 2025: "Comment favoriser la mixité dans les métiers du numérique ?" (How can we foster gender diversity in digital professions?)
How can we foster gender diversity in digital professions?
Excellent talk by Mélissa Cottin, director of the association ESTIMnumérique
youtu.be/w5vzyoH7JM0
24.03.2025 12:53 —
👍 0
🔁 0
💬 0
📌 0
Re-evaluating Metamorphic Testing of Chess Engines: A Replication Study
Context: This study aims to confirm, replicate and extend the findings of a previous article entitled “Metamorphic Testing of Chess Engines” that reported inconsistencies in the analyses provided by S...
Final thoughts?
✅ Reproducibility matters—always verify results.
✅ Replicability matters even more.
✅ Depth sensitivity and domain specificities are critical in SE.
✅ MT needs refinement.
Study:
hal.science/hal-04943474v2
(published at IST journal)
Blog post: blog.mathieuacher.com/Reproducibil...
20.03.2025 10:41 —
👍 0
🔁 0
💬 0
📌 0
A call to refine, not dismiss.
MT is powerful & could work well for LLM-based chess engines. But for Stockfish, MRs must account for depth & move ordering.
20.03.2025 10:41 —
👍 0
🔁 0
💬 1
📌 0
Key takeaway?
🚨 The original study didn't parameterize metamorphic relations by depth!
Metamorphic testing (MT) needs depth-aware refinement—some violations at low depth have limited interest.
No impact on Stockfish despite alarming claims.
20.03.2025 10:41 —
👍 0
🔁 0
💬 1
📌 0
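To make "parameterizing a metamorphic relation by depth" concrete, here is a minimal Python sketch. It is not the study's code. The `evaluate` callable (FEN and depth in, centipawns out), the `transform` callable, the 50-centipawn tolerance, and the function name are all assumptions for illustration; the depths 10, 15, and 20 are the ones mentioned in the thread.

```python
from typing import Callable, Dict, Iterable

# Hypothetical interfaces, not a real engine API:
Evaluate = Callable[[str, int], int]   # (FEN, depth) -> score in centipawns
Transform = Callable[[str], str]       # FEN -> FEN expected to have the same score


def violations_by_depth(
    evaluate: Evaluate,
    fen: str,
    transform: Transform,
    depths: Iterable[int] = (10, 15, 20),
    tolerance_cp: int = 50,
) -> Dict[int, int]:
    """Return {depth: gap} for each depth where the two evaluations disagree."""
    other = transform(fen)
    gaps: Dict[int, int] = {}
    for depth in depths:
        gap = abs(evaluate(fen, depth) - evaluate(other, depth))
        if gap > tolerance_cp:
            # The relation is violated at this depth only; other depths may agree.
            gaps[depth] = gap
    return gaps
```

Reporting the result per depth, rather than as a single pass/fail verdict, is what lets one see whether a violation disappears once the engine searches deeper.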
Can we fix this? Yes!
We found where this happens exactly in the code. Symmetry can be enforced, but… it adds overhead/complexity.
20.03.2025 10:41 —
👍 0
🔁 0
💬 1
📌 0
The culprit? Move ordering.
Stockfish orders legal moves differently depending on board symmetry. This affects search results at some depths.
❌ Not a bug, a feature of how the engine explores positions.
20.03.2025 10:41 —
👍 0
🔁 0
💬 1
📌 0
🔎 A Chess Mystery
These mirrored positions should have the same evaluation, but at depth=20:
📊 Left: +0.66
📊 Right: -2.17
This is not just a low-depth issue—it rings a bell.
20.03.2025 10:41 —
👍 0
🔁 0
💬 1
📌 0
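A mirror check of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's implementation: the mirror here is a left-right flip of the board, the sketch refuses positions with castling rights (a flip would invalidate them), and `evaluate` is a hypothetical callable that would wrap a real engine and return centipawns.

```python
from typing import Callable

# Hypothetical interface, not a real engine API: (FEN, depth) -> centipawns.
Evaluate = Callable[[str, int], int]


def mirror_fen(fen: str) -> str:
    """Flip a position left to right (file a <-> file h), same side to move."""
    board, side, castling, ep, halfmove, fullmove = fen.split()
    if castling != "-":
        raise ValueError("sketch only handles positions without castling rights")
    # Each rank is a string of piece letters and single digits, so reversing
    # the string reverses the files.
    mirrored_board = "/".join(rank[::-1] for rank in board.split("/"))
    if ep != "-":
        # Mirror the en passant file as well: a <-> h, b <-> g, and so on.
        ep = chr(ord("a") + ord("h") - ord(ep[0])) + ep[1]
    return " ".join([mirrored_board, side, castling, ep, halfmove, fullmove])


def violates_mirror_relation(
    evaluate: Evaluate, fen: str, depth: int, tolerance_cp: int = 50
) -> bool:
    """True if a position and its mirror differ by more than the tolerance."""
    gap = abs(evaluate(fen, depth) - evaluate(mirror_fen(fen), depth))
    return gap > tolerance_cp
```

With the two evaluations quoted above, +0.66 and -2.17, the gap is 283 centipawns, far beyond any reasonable tolerance.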
Reproducibility first!
We replicated the study & confirmed MR violations at depth=10. Then we tested:
✔️ Higher depths (15, 20, beyond)
✔️ Realistic positions
✔️ Different Stockfish versions
20.03.2025 10:41 —
👍 0
🔁 0
💬 1
📌 0