Gabriel

@dssgabriel.bsky.social

PhD candidate, HPC Software Engineering @cea.fr / DAM. MSc HPC & Simulation from @univparissaclay.bsky.social. Architecture, microbenchmarking & SIMD sorcery. Research on distributed computing, data structures & memory layouts at exascale. RTFM 👹

18 Followers  |  85 Following  |  18 Posts  |  Joined: 01.04.2025  |  2.021

Latest posts by dssgabriel.bsky.social on Bluesky

Division — Matt Godbolt’s blog Division doesn't have to be slow with some clever tricks

Day 6 of Advent of Compiler Optimisations! Divide by 512—just a shift, right? But the compiler adds extra instructions. Why? A subtle difference between what you asked and what you meant!

xania.org/202512/06-di...
youtu.be/7Rtk0qOX9zs

#AoCO2025

06.12.2025 12:49 — 👍 32    🔁 9    💬 1    📌 0
ARM's barrel shifter tricks — Matt Godbolt’s blog The ARM architecture has a cool feature, and compilers know how to use it

Day 5 of Advent of Compiler Optimisations! x86 has LEA, but ARM has the barrel shifter—instructions can shift operands cheaply. The compiler uses this to multiply without multiplying!

xania.org/202512/05-ba...
youtu.be/TZubUyr2UEY

#AoCO2025

05.12.2025 12:05 — 👍 25    🔁 5    💬 2    📌 0

Day 4 of Advent of Compiler Optimisations! Multiply by constants—which ones use actual multiply? The compiler has tricks to avoid it, then saves you from your own clever hacks.

xania.org/202512/04-mu...
youtu.be/1X88od0miHs

#AoCO2025

04.12.2025 12:07 — 👍 46    🔁 9    💬 6    📌 0
NVIDIA Releases CUDA 13.1 With New "CUDA Tile" Programming Model NVIDIA just released CUDA 13.1 for what they claim is "the largest and most comprehensive update to the CUDA platform since it was invented two decades ago." The most notable addition with the CUDA 13.1 release is CUDA Tile as a new tile-based programming model...

NVIDIA Releases CUDA 13.1 With New "CUDA Tile" Programming Model - https://www.phoronix.com/news/NVIDIA-CUDA-13.1

04.12.2025 22:56 — 👍 7    🔁 1    💬 0    📌 0
Release v0.36.0 · jj-vcs/jj About jj is a Git-compatible version control system that is both simple and powerful. See the installation instructions to get started. Release highlights The documentation has moved from https:/...

#jj-vcs 0.36.0 is out! Including finally moving the documentation to our own domain, docs.jj-vcs.dev

github.com/jj-vcs/jj/re...

04.12.2025 23:41 — 👍 82    🔁 8    💬 0    📌 0
You can't fool the optimiser — Matt Godbolt’s blog Pattern recognition can see through obfuscated code to find the right instruction

Day 3 of Advent of Compiler Optimisations! A while loop, recursion, direct addition—all compile to one instruction. The compiler sees the pattern beneath the code.

xania.org/202512/03-mo...
youtu.be/wHg9lYPMvvE

#AoCO2025

03.12.2025 12:00 — 👍 44    🔁 7    💬 0    📌 0
Addressing the adding situation — Matt Godbolt’s blog We learn why adding on x86 isn't as obvious as you might think

Day 2: Adding two integers on x86? Not with `add`! The compiler uses a completely different instruction—one designed for memory addressing. Why? xania.org/202512/02-ad... youtu.be/BOvg0sGJnes #AoCO2025

02.12.2025 11:28 — 👍 76    🔁 18    💬 4    📌 0
Why xor eax, eax? — Matt Godbolt’s blog Why do compilers love xor-ing registers so much?

Ever wonder why compilers use `xor eax, eax` to zero registers? It's smaller AND faster—CPUs optimise it out entirely!

Day 1 of Advent of Compiler Optimisations: xania.org/202512/01-xo...
Video: youtu.be/eLjZ48gqbyg

#AoCO2025

01.12.2025 12:12 — 👍 133    🔁 25    💬 5    📌 4
SC'25 recap The annual SC conference was held last week, drawing over 16,000 registrants and 560 exhibitors to St. Louis, Missouri to talk ab...

I wrote up my notes from #SC25. Have a look: blog.glennklockwood.com/2025/12/sc25...

I’ll keep picking away at the editing, but would love to hear more from others about what stood out to them. I wasn’t at the conference itself as much this year as in the past, so I know I missed a lot.

#HPC

01.12.2025 19:27 — 👍 24    🔁 10    💬 3    📌 3
From Zero to GitHub: Starting A New jj (Jujutsu) Repo I document the steps and missteps that I take setting up a new jujutsu repo and pushing it to GitHub

From Zero to GitHub: Starting A New #jj-vcs Repo

www.visualmode.dev/from-zero-to...

03.12.2025 20:31 — 👍 46    🔁 6    💬 1    📌 0

📣 The procurement contract for #AliceRecoque, the new European #exascale supercomputer 🖥️ ⚡ located in #France, has been signed by @eurohpc-ju.bsky.social and the selected vendor Eviden! It will be one of the backbones for Europe's network of #AIFactories

🔗 www.eurohpc-ju.europa.eu/contract-sig...

18.11.2025 11:01 — 👍 1    🔁 1    💬 0    📌 0

We’re delighted to welcome Modules to the High Performance Software Foundation as an established project 🎊

Read the announcement ➡️ hpsf.io/blog/2025/hi...

10.11.2025 14:25 — 👍 3    🔁 1    💬 1    📌 0

#AMD #Zen6 znver6 ISA:
- #AVX512_BMM (CPUID.80000021.EAX[23], VBMACOR16x16x16, VBMACXOR16x16x16, VBITREV)
- #AVX512_FP16
- #AVX_NE_CONVERT
- #AVX_IFMA
- #AVX_VNNI_INT8
Source:
sourceware.org/pipermail/bi...

08.11.2025 01:11 — 👍 2    🔁 1    💬 1    📌 1
NASM 3.00 Assembler Is Ready With Intel APX & AVX10 Support Slipping under my radar in October was the release of NASM 3.00 and the follow-up NASM 3.01 release shortly thereafter. This widely-used open-source assembler is now ready with support for Intel's Advanced Performance Extensions (APX) and AVX10...

NASM 3.00 Assembler Is Ready With Intel APX & AVX10 Support - https://www.phoronix.com/news/NASM-3.00-APX-AVX10

02.11.2025 13:53 — 👍 7    🔁 1    💬 0    📌 0
N3694: Functions with Data - Closures in C (A Comprehensive Proposal Overviewing Blocks, Nested Functions, and Lambdas)

Iiiiiit's published.

www.open-std.org/JTC1/SC22/WG...

02.11.2025 18:58 — 👍 65    🔁 9    💬 5    📌 0
Extract from the ordinance establishing the CEA, dated 18 October 1945 © Archives Nationales \ Stéphane Méziache

#anniversaire 🎂 | The CEA turns 80 🕯️.
On 18/10/1945, General de Gaulle signed the founding ordinance of the CEA.

⚛️ The organisation's ambition is to give 🇫🇷 mastery of the atom in various fields of science, industry & national defence

👇 Extract

18.10.2025 06:58 — 👍 23    🔁 11    💬 1    📌 0

It means “French” 🥖

17.10.2025 20:50 — 👍 0    🔁 0    💬 0    📌 0

Colleges do a terrible job of teaching C++.

It’s not “C with Classes”. Injected into curriculums as a demonstration of early CS concepts, it leaves many with a sour taste.

Students later immediately fall in love with the first language that *doesn’t* feel that way.

13.10.2025 21:21 — 👍 57    🔁 5    💬 8    📌 1

#AMD & #Intel unified future instructions:
#FRED #AVX10 #ChkTag #ACE (Advanced Matrix Extensions for Matrix Multiplication): www.amd.com/en/blogs/202...

14.10.2025 07:17 — 👍 9    🔁 6    💬 1    📌 0

September feels like it goes on forever, then October breezes by in an instant

13.10.2025 12:24 — 👍 0    🔁 0    💬 0    📌 0

HPSF Board Member & Kokkos Project Maintainer, Damien Lebrun-Grandié, of Oak Ridge National Laboratory (ORNL) will share a maintainer's perspective on Sustainable HPC Software in #HPC Best Practices Webinar - Oct 15 at 1:00pm EDT 🔎 Learn more:
ideas-productivity.org/events/hpcbp...

07.10.2025 19:54 — 👍 2    🔁 3    💬 0    📌 0
Defer: Resource cleanup in C with GCC's magic on OSHub [Defer Macro] “Warning: This is experimental, relies on GCC-specific extensions (__attribute__((cleanup)) and nested functions), and is not portabl...

Defer: Resource cleanup in C with GCC's magic
oshub.org/projects/ret...

01.10.2025 13:14 — 👍 2    🔁 2    💬 0    📌 0
Sguaba: Type-safe spatial math in Rust
YouTube video by Jon Gjengset

About a month ago, I gave a talk at the Rust Amsterdam meetup about Sguaba (the type-safe spatial math Rust crate), and the recording of that is now online for anyone who wants their head to hurt with frames of reference and coordinate transforms 😅
youtu.be/kESBAiTYMoQ

29.09.2025 14:26 — 👍 25    🔁 4    💬 1    📌 0
A Look into Intel Xeon 6’s Memory Subsystem Intel’s server dominance has been shaken by high core count competition from the likes of AMD and Arm.

Hello you fine Internet folks,

Today we are taking a look at Intel's Xeon 6 Memory Subsystem and the changes that Intel made in order to fit up to 128 cores in a single CPU.

Hope y'all enjoy!

chipsandcheese.com/p/a-look-int...

old.chipsandcheese.com/2025/09/26/a...

26.09.2025 16:44 — 👍 11    🔁 2    💬 0    📌 0
Screenshot of @john_attridge’s tweet (in a sans-serif font) saying “It’s nice but do you have any sans-seraph fonts?” with a photo of a winged angel kneeling, supporting a basin

This is a font joke.

20.09.2025 20:19 — 👍 338    🔁 45    💬 21    📌 4

Personal achievement: Relentlessly correcting Exaflops and the like to ExaFLOP/s, and seeing it slowly stick, even in press articles 💪 #HPC

06.09.2025 08:16 — 👍 2    🔁 1    💬 0    📌 0

Intel gives some more details on #ClearwaterForest at @hotchipsorg #hc2025.

- wider instruction decoders
- better branch prediction
- massive parallel OoO engine
- deeper execution engine
- optimized memory subsystem

hardwareluxx.de/index.php/ne...

25.08.2025 16:04 — 👍 3    🔁 1    💬 1    📌 0
RIKEN, Japan’s Leading Science Institute, Taps Fujitsu and NVIDIA for Next Flagship Supercomputer Japan is once again building a landmark high-performance computing system — not simply by chasing speed, but by rethinking how technology can best serve the nation’s most urgent scientific needs. At t...

Those who have been around #HPC a while know the historical significance (and probably the names) of Japan's flagship supercomputers. They always push the boundaries of technology, but in ways that are driven uniquely by the needs of Japan.

blogs.nvidia.com/blog/fugakun...

22.08.2025 03:20 — 👍 5    🔁 1    💬 1    📌 0
