Released Xbyak 7.26, which adds support for the Diamond Rapids AMX instruction groups (AMX-{MOVRS,AVX512,FP8,TF32,TRANSPOSE}).
github.com/herumi/xbyak
@lmrx114514.bsky.social
Nailed it. A clear view.
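AVX10.2 rev 4.0 removed embedded rounding and sae/er for YMM registers from the spec, but Xbyak still supported them (xed 9.53 still does too). Since that tends to cause trouble, I removed the feature and released v7.27.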
github.com/herumi/xbyak...
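A small pull request for Solaris support came in for Xbyak. I didn't know Solaris was still hanging in there. I haven't touched it since my university days.
24.07.2025 23:54 — 👍 1 🔁 1 💬 0 📌 0
Constant division optimization revisited 3: Beat the compiler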
zenn.dev/herumi/artic...
I wrote up the results of my trial and error at the assembly-language level on x64/M4.
Why limit it to Intel? :)
03.10.2025 20:44 — 👍 2 🔁 1 💬 0 📌 0
Let me add another bit to the rant. I am sick as shit of sites, youtube idiots, and the rest puking back obvious BS without the bare minimum of checking. 3 sites said it so it has to be true! You are being used by people with ulterior motives, don't be a tool. No hope for society....
03.10.2025 19:54 — 👍 5 🔁 1 💬 1 📌 0
Is #AMD really fabbing at #Intel? We got the definitive story. Really, not clickbait like the others, we got to the bottom of it!
www.semiaccurate.com/2025/10/03/i...
With the changes today, I am now FIRMLY back in the "Intel will die" camp. The company is avoiding the problem source and addressing the symptoms. Badly. In a way that will worsen the problem. I plan on writing this up as soon as I get free time, tomorrow is shot though. More soon.
08.09.2025 22:26 — 👍 6 🔁 2 💬 0 📌 0
Remember how people laughed when I said, in 2019, that I saw a clear path to #Intel failing? I wasn't joking. Then Pat Gelsinger came in and addressed the root of the problem, turning things around.
08.09.2025 22:26 — 👍 11 🔁 2 💬 2 📌 0
Those things you cite had a different purpose, basically to distract Wall Street from viewing Intel as not a player in 'hot' markets. Nothing more, nothing less. I can name a dozen others too.
09.09.2025 20:20 — 👍 1 🔁 1 💬 1 📌 0
Size wasn't the issue, culture was. Pat fixed that mostly, but will those changes stick?
09.09.2025 15:08 — 👍 1 🔁 1 💬 1 📌 0
Very interesting to me that it will only be a 6x increase in DGEMM.
Fugaku is famously under provisioned for low precision flops, I must wonder if this is an over correction?
TT-QuietBox (Blackhole)
Look what just landed in the lab
12.09.2025 14:35 — 👍 9 🔁 2 💬 1 📌 0
Here is an early mention of 512b vectors in a 2009 #Intel #Nehalem optimization slide:
30.08.2025 20:16 — 👍 3 🔁 2 💬 2 📌 0
Well... AVX512 was known as AVX3 at some point... (and AVX512F was AVX3.1 for KNL, I guess) .. SKX was supposed to be AVX3.2 (F, CD, BW, DQ, and VL)
05.09.2025 18:04 — 👍 3 🔁 1 💬 0 📌 1
It's too bad that this AVX3.1 nomenclature disappeared. I think it is much more seamless, like SSE4.2, SSE5 (original name of AMD XOP) or Armv8.2. AMX2 and AMX3.1 would also be better.
06.09.2025 16:39 — 👍 4 🔁 2 💬 0 📌 0
#Intel xAPIC deprecation plan 1.0:
www.intel.com/content/www/...
#Intel refreshed the xAPIC deprecation plan with #NovaLake and #DiamondRapids:
19.09.2025 11:55 — 👍 2 🔁 1 💬 0 📌 0
There are a few working #PantherLake B0_2 among #Intel test machines: (CPUID C06C2, 12c/12t (4P+4E+4LPE probably), 3000 MHz, no HTT, no AVX512, Intel 18A)
intel-gfx-ci.01.org/tree/intel-x...
#CougarCove #Darkmont
For comparison, #LunarLake was 3100MHz (8c/8t 4P+4LPE) at similar stage
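A minimal sketch of how a signature such as "CPUID C06C2" can be split into family, model, and stepping. It assumes the value quoted in the post is the hexadecimal EAX result of CPUID leaf 01H and applies the standard x86 display-family/display-model rules. The names `CpuSignature` and `decode_signature` are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple


class CpuSignature(NamedTuple):
    family: int
    model: int
    stepping: int


def decode_signature(eax: int) -> CpuSignature:
    """Decode a CPUID leaf 01H EAX value (assumed input) into display fields."""
    stepping = eax & 0xF              # bits 3:0
    base_model = (eax >> 4) & 0xF     # bits 7:4
    base_family = (eax >> 8) & 0xF    # bits 11:8
    ext_model = (eax >> 16) & 0xF     # bits 19:16
    ext_family = (eax >> 20) & 0xFF   # bits 27:20

    # The extended family is added only when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is prepended only for base family 0x6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return CpuSignature(family, model, stepping)


sig = decode_signature(0xC06C2)
print(f"family={sig.family:#x} model={sig.model:#x} stepping={sig.stepping}")
```

Under those assumptions, C06C2 decodes to family 0x6, model 0xCC, stepping 2.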
#Intel microcode refresh 20250812:
github.com/intel/Intel-...
Release Notes:
github.com/intel/Intel-...
#AMD refreshed the "AMD64 Architecture Programmer's Manual, Volume 1" 24592 pdf to v3.23 with #AVX512
docs.amd.com/v/u/en-US/24...
New story up on #ARM's Neural Super Sampling Tech. Not deep, waiting on hardware details.
www.semiaccurate.com/2025/08/12/a...
Kicking off the EUMaster4HPC FPGA workshop with Intel oneAPI on #MeluXina! Huge thanks to @luxprovide.bsky.social, and @uni.lu for making this happen. On today's agenda: Hands-on learning with @eurohpc-ju.bsky.social and Luxembourg’s national supercomputer!
#HPC #Supercomputing #oneAPI #FPGA
seems intel has since redacted this section...
11.07.2025 16:48 — 👍 8 🔁 1 💬 2 📌 0
💥Spack v1.0 is out!💥
This is a huge milestone. We reworked the core to add compiler dependencies, and we're introducing a stable package API.
🚀1.0 also adds concurrent builds, better includes, and much more -- read it all in the release notes!
github.com/spack/spack/...
OK, serious question that I have been thinking about for a while. What is a PC? Is it x86? Windows? Form factor? Screen size? Features? Discuss.
I am seriously interested in what you think. I have asked several industry heavyweights and gotten as many answers as people I asked. Your turn.
#Intel released the 58th edition of the ISA Extensions Reference with fixes and clarifications:
#PantherLake #DiamondRapids #ClearwaterForest
Download:
cdrdv2-public.intel.com/859029/31943...
CPUID.(EAX=23H,ECX=0H):EBX[2] = #RDPMCUserDisable
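The `CPUID.(EAX=..H,ECX=..H):REG[bit]` notation used here names a leaf, a subleaf, an output register, and a bit position. As a rough sketch, it can be parsed and checked against register values obtained elsewhere; Python cannot execute CPUID itself, so the register values in this example are made up. The names `CpuidBit`, `parse_cpuid_bit`, and `is_set` are hypothetical.

```python
import re
from typing import Mapping, NamedTuple


class CpuidBit(NamedTuple):
    leaf: int
    subleaf: int
    reg: str
    bit: int


# Matches e.g. "CPUID.(EAX=23H,ECX=0H):EBX[2]" after spaces are removed.
_PATTERN = re.compile(
    r"CPUID\.\(EAX=([0-9A-F]+)H,ECX=([0-9A-F]+)H\):(E[ABCD]X)\[(\d+)\]",
    re.IGNORECASE,
)


def parse_cpuid_bit(spec: str) -> CpuidBit:
    """Parse the feature-bit notation into (leaf, subleaf, register, bit)."""
    m = _PATTERN.fullmatch(spec.replace(" ", ""))
    if m is None:
        raise ValueError(f"unrecognized CPUID bit notation: {spec!r}")
    leaf, subleaf, reg, bit = m.groups()
    return CpuidBit(int(leaf, 16), int(subleaf, 16), reg.upper(), int(bit))


def is_set(spec: CpuidBit, regs: Mapping[str, int]) -> bool:
    """Test the bit in register values read for the matching leaf/subleaf."""
    return bool((regs[spec.reg] >> spec.bit) & 1)


rdpmc = parse_cpuid_bit("CPUID.(EAX=23H,ECX=0H):EBX[2]")
# Illustrative register dump only, not read from real hardware.
example_regs = {"EAX": 0, "EBX": 0b100, "ECX": 0, "EDX": 0}
print(rdpmc, is_set(rdpmc, example_regs))
```

On real hardware the register values would come from a native CPUID call for the parsed leaf and subleaf.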
#Intel released the 88th edition of the Software Developer’s Manuals with #SLSM (Static LockStep Mode) integrity feature:
CPUID.(EAX=07H,ECX=01H):EDX[24]=SLSM
All-in-One:
cdrdv2-public.intel.com/858440/32546...
Changes:
cdrdv2-public.intel.com/858441/25204...
#Intel refreshed the "Flexible Return and Event Delivery" ( #FRED) 346446 pdf to 9.0:
cdrdv2-public.intel.com/819481/34644...