Ashton Six @ashtonsix.com - Bluesky Profile

Ashton Six

@ashtonsix.com.bsky.social

Research Engineer (software), with interests in superoptimisation, fast integer compression, and indexing for OLAP

27 Followers | 223 Following | 7 Posts | Joined: 16.01.2026

Latest posts by ashtonsix.com on Bluesky

I want bare metal instances that can launch within 2-3 seconds, for a better (local dev <-> remote execution) REPL workflow

vs. fast launching containers (eg, Cloud Run), bare metal gives me more reliable benchmark measurements and the ability to bring tools like Nsight

17.01.2026 18:28 — 👍 2 🔁 0 💬 0 📌 0

i've found some success sticking to SIMD-friendly scalar patterns

i get loop order right (polyhedral analysis), add hints like `#pragma omp simd`, and run at -O3: that's _usually_ enough. you can check output with -S (gives readable ASM)

or use SIMDe, that works too

17.01.2026 02:34 — 👍 0 🔁 0 💬 0 📌 0
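
A minimal sketch of the "SIMD-friendly scalar" pattern described in the post above. The kernel `scale_add` and its parameters are hypothetical, chosen only for illustration. The loop iterations are independent, `restrict` promises the arrays do not overlap, and `#pragma omp simd` hints that vectorisation is safe.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example kernel (not from the post): out[i] = a[i] * k + b[i].
   Each iteration touches only index i, so there is no loop-carried
   dependency for the compiler to worry about. */
void scale_add(uint32_t *restrict out, const uint32_t *restrict a,
               const uint32_t *restrict b, uint32_t k, size_t n)
{
    /* Honoured with -fopenmp or -fopenmp-simd; otherwise the pragma is
       ignored and the loop is still correct scalar code. */
    #pragma omp simd
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] * k + b[i];
    }
}
```

With GCC or Clang, something like `cc -O3 -fopenmp-simd -S scale_add.c` should emit assembly you can inspect for vector instructions. Whether the loop actually vectorises depends on the compiler and target.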

1. Introduction — PTX ISA 9.1 documentation

This feels like a continuation of the reduction operators introduced in Blackwell's TMA (cp.reduce.async.bulk). Fun fact! Data movement often dominates power usage vs compute because of physics: thicker longer wires = more power needed to transmit each bit. Makes a lot of sense to optimise here.

17.01.2026 02:02 — 👍 3 🔁 0 💬 0 📌 0

Mmm! Nice corollary: software optimisations for prefix sums (re-parenthesizing) generalise across associative ops: +, ^, prefix-of-prefix.

I made a thread about it: bsky.app/profile/asht...

17.01.2026 01:31 — 👍 0 🔁 0 💬 0 📌 0
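
A rough sketch of why the re-parenthesizing carries over to other operators: the blocked scan below only relies on `op` being associative. The names `scan_blocked`, `binop`, `op_add`, `op_xor` and the block size of 4 are assumptions made for this example, not taken from the linked thread.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*binop)(uint32_t, uint32_t);

static uint32_t op_add(uint32_t a, uint32_t b) { return a + b; }
static uint32_t op_xor(uint32_t a, uint32_t b) { return a ^ b; }

#define BLOCK 4 /* assumed block size, for illustration only */

/* Inclusive scan: scan each block locally, then fold in the carry from the
   previous block. Correct for any associative op, since
   op(carry, op(x, y)) == op(op(carry, x), y). */
void scan_blocked(uint32_t *out, const uint32_t *in, size_t n, binop op)
{
    for (size_t base = 0; base < n; base += BLOCK) {
        size_t end = (n - base < BLOCK) ? n : base + BLOCK;

        /* Local scan: depends only on this block's inputs. */
        out[base] = in[base];
        for (size_t i = base + 1; i < end; i++)
            out[i] = op(out[i - 1], in[i]);

        /* Late carry: the last element of the previous block is already final. */
        if (base > 0) {
            uint32_t carry = out[base - 1];
            for (size_t i = base; i < end; i++)
                out[i] = op(carry, out[i]);
        }
    }
}
```

Calling `scan_blocked(out, in, n, op_add)` gives a prefix sum and `scan_blocked(out, in, n, op_xor)` a prefix XOR. The function pointer is there for clarity, not speed; a tuned version would specialise per operator.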

perf-portfolio/delta at main · ashtonsix/perf-portfolio HPC research and demonstrations. Contribute to ashtonsix/perf-portfolio development by creating an account on GitHub.

Full write-up, implementation (NEON) and benchmark results (Graviton4) here: github.com/ashtonsix/pe...

I love solving these kinds of performance puzzles—and I'm currently available for hire! Reach out if interested 😊. 3/3

17.01.2026 00:55 — 👍 0 🔁 0 💬 0 📌 0

The ILP trick:

# Local prefix sums
out[0..3] = prefix(in[0..3])
out[4..7] = prefix(in[4..7])
...

# Late carry broadcast (redundant compute)
out[4..7] += out[3];
out[8..11] += out[7];
...

By delaying the carry we allow the CPU to compute all local prefix sums in parallel, >doubling throughput. 2/

17.01.2026 00:55 — 👍 0 🔁 0 💬 1 📌 0
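
The pseudocode above can be sketched in plain scalar C as follows. This is an illustration of the dependency structure only, not the NEON implementation from the write-up; the function name `prefix_sum_late_carry` and the block size of 4 are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive prefix sum using local sums plus a late carry broadcast. */
void prefix_sum_late_carry(uint32_t *out, const uint32_t *in, size_t n)
{
    /* Phase 1: local prefix sums. No block reads another block's output,
       so these short chains can overlap in the pipeline. */
    for (size_t base = 0; base < n; base += 4) {
        uint32_t acc = 0;
        for (size_t i = base; i < n && i - base < 4; i++) {
            acc += in[i];
            out[i] = acc;
        }
    }

    /* Phase 2: late carry. Each block adds the final value of the element
       just before it; the extra adds are the "redundant compute". */
    for (size_t base = 4; base < n; base += 4) {
        uint32_t carry = out[base - 1];
        for (size_t i = base; i < n && i - base < 4; i++)
            out[i] += carry;
    }
}
```

Only the carry itself remains serial: one value per block instead of one per element.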

I got SOTA (L1-hot, SIMD) on prefix sum by ADDING instructions (7.7 GB/s → 19.8 GB/s). Consider:

for i = 0..n: out[i] = out[i-1] + in[i]

This SUCKS, because out[i] must wait on out[i-1]. There's an unbroken dependency chain which disrupts Instruction Level Parallelism (ILP). 1/

17.01.2026 00:55 — 👍 0 🔁 0 💬 1 📌 1
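
For reference, a runnable version of the serial baseline from the post above. The function name `prefix_sum_serial` is an assumption, and a running accumulator replaces `out[i-1]` so the first element needs no special case.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive prefix sum, one element at a time. */
void prefix_sum_serial(uint32_t *out, const uint32_t *in, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += in[i]; /* needs acc from the previous iteration: the chain */
        out[i] = acc;
    }
}
```

Every add waits on the one before it, so the loop's speed is bounded by add latency rather than by how many adds the core can issue per cycle.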

@ashtonsix.com is following 20 prominent accounts

Dare Obasanjo
@carnage4life

Opinions about product management, technology news and inclusivity in tech. Diversity is about demographics, inclusion is about creating a sense of belonging.

Nils Berglund
@nilsberglund

Mathematician. Likes nature, hiking, cycling, the Arctic, drawing, photography, cooking. Makes colorful simulations of mathematical and physical systems. All posted photos are either my own work, or, in a few cases, by a fellow traveler.

Freya Holmér
@freya

🎮 indie tech artist 🏗️ I made Shader Forge & Shapes 🌐 working on https://half-edge.xyz 🔥 shader sorceress 📏 math dork 🎥 rare YouTuber/streamer 📡 ex-founder of @NeatCorp my kids: 🥪 @toast.acegikmo.com 🥗 @salad.acegikmo.com 🐈‍⬛ @thor.acegikmo.com

stellz
@piss.beauty

dance music enjoyer & technology sister. 🌹Brooklyn music mixes: https://plyr.fm/u/piss.beauty

@terrorjack

Compiler Explorer
@compiler-explorer.com

A website for exploring the output of compilers, aka godbolt.org. Supports C, C++, Rust, Fortran, COBOL and many many more. Support us at https://patreon.com/mattgodbolt

Matt Godbolt
@matt.godbolt.org

Sometime verb, real person, lover of 8-bit computers, husband & father, trying to be a kind person. #blacklivesmatter; trans rights are human rights. he/him

Tez Gale
@tezgale.co.uk

Daily juggler of jobs: HPC Consultant at Red Oak Consulting, Director of Outdoor Centre, Farmer, Teaching Bushcraft/Forest skills, Electrician, Father/Husband to a large clan, & Beekeeper. Yes it's a weird mix. #HPC #Outdoors #beekeeping #farming #sparky

Gareth Wilson
@garethddn

I work in HPC and AI, love playing guitar, listening to guitars, staring at guitars and the proud nephew of three magnificent uncles.

Paul Selwood
@pselwood

HPC user in the UK

@richlawrence

Mahesh Pancholi
@maheshpancholi

Tom Walsh
@bretnac

BigEyedBeans
@bigeyedbeans

Work: research platforms engineer (HPC, cloud). Before: research software engineer. Play: muddy bikes, climbing. He/him.

Glenn K. Lockwood
@glennklockwood.com

I am a supercomputing enthusiast, but I usually don't know what I'm talking about. I post about large-scale infrastructure for #HPC and #AI.

Andrew Jones (hpcnotes)
@hpcnotes

25+ years using + researching + buying + architecting #supercomputing. Now engineering leader for future #AI infrastructure + #HPC capabilities at Microsoft. Posts about #supercomputers #AI #technology #F1 #LFC #aviation #travel … https://www.hpcnotes.com

Fernanda Foertter
@hpcprogrammer

A yellow shirt =/\=

HPC Guru
@hpcguru

Not a HPC Guru, but I play one on social media

HPC Group @ Uni Basel
@dmi-hpc

HPC group led by Florina Ciorba at the Department of Mathematics and Computer Science, University of Basel. hpc.dmi.unibas.ch

HPC Engineering @ AWS
@techhpc

We're the HPC Engineering team from AWS, and we publish stuff about running R&D workloads in the cloud. Follow us here, and on YouTube (hpc.news/techshorts), too.