turtlespook

@turtlespook.bsky.social

recovering graphics nerd and vtuber/anime fan • 👻✨ • i love geometry and physics simulation!

38 Followers  |  275 Following  |  34 Posts  |  Joined: 20.10.2023  |  2.054

Latest posts by turtlespook.bsky.social on Bluesky

“But what’s it all for?” I asked Oliver, to which he replied, “To make rocks kiss.”

09.11.2025 16:45 - 👍 205    🔁 23    💬 5    📌 0
Kaye
@kaye@cathode.church

Adopt the Juicero or be left behind. A vital shift is underway in juicing. The Juicero is no longer optional. It's tomorrow's future, today. 40% of jobs are impacted by the Juicero. The Juicero isn't the future, it's a present necessity. Nobody hand-juices anymore. To hand-juice is like an impairment. Everyone must now focus on the delegation and the verification of a juice. We become less juice producers and more juice enablers. Adopt the Juicero or be left behind. We are burning every forest and poisoning every river to produce more Juiceros. You will become obsolete if you don't get on the Juicero bandwagon. Students must not be taught how to hand-juice. 80% of jobs will be lost to the Juicero. Students must be taught to exclusively focus on how to collaborate with the Juicero. Education must focus on orchestrating agentic Juicero systems. The Juicero is inevitable. Adopt the Juicero or be left behind. Adapt or risk becoming obsolete. As the Juicero rapidly advances toward automating up to 90% of juicing, the skills that will matter most include juice design, Juicero fluency, juice delegation, and juice quality assurance. 110% of jobs have been replaced by the Juicero.

cathode.church/@kaye/114988...

08.11.2025 17:29 - 👍 1084    🔁 464    💬 14    📌 13

Looking forward to the next parts! By the way, I think the embed for figure 8 might be broken?

06.11.2025 19:05 - 👍 1    🔁 0    💬 1    📌 0
A table showing the experimental results of applying 6 different compute shader versions of a simple 3x3x3 box blur on a 512x512x512 texture, using either the GL_R16F or GL_R32F internal format for storage, on eight different GPUs spanning several GPU architectures and vendors.

The upper table shows the absolute effective bandwidth (measured as the sum of total bytes read and written divided by execution time), whereas the lower table shows the effective bandwidth relative to the theoretical bandwidth as a percentage. 

Each row corresponds to a specific shader variant (except for the "theoretical" row, which displays the theoretical bandwidth according to the GPU specification), and each column corresponds to a specific GPU. The color coding is per column in the upper table, and it's a single color coding on the entire lower table. 

Each version will be explained in detail in the subsequent posts.

Version 6 uses half-precision floating point for the shared memory cache, and the relevant extension does not exist in the Intel drivers for Windows. Likewise, this version is not applied to the GL_R32F internal format benchmarks, since that would destroy the precision of the backing format anyway.

The code was written and initially tested on a desktop 4090 (the first column), which naturally skews the results a bit since everything was evaluated and tested on that GPU. Had I used another GPU I might have picked slightly different compromises, and the results would have been slightly different. 

One interesting observation is that the RTX 4000 series (Ada Lovelace architecture) significantly outperforms everything else, with the 7900 XTX (RDNA3) slightly behind. A large part of these overwhelmingly efficient results is due to the massive caches these devices sport (72 MiB on the desktop 4090, 64 MiB on the laptop 4090, etc.), which make it a lot easier to reach peak bandwidth.

Let's wrap up this lovely week with a nice technical post

This is the "case study" from my Masterclass at GPC, where I apply a series of optimizations to improve the effective bandwidth of a 3x3x3 blur (a proxy for a huge set of operations on volumetric data)

Check ALT text for (a lot of) context.

17.11.2024 22:59 - 👍 133    🔁 25    💬 4    📌 4
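
The ALT text above defines effective bandwidth as total bytes read plus written, divided by execution time, and reports it both in absolute terms and as a percentage of the theoretical bandwidth. A minimal Python sketch of that arithmetic follows. It assumes each texel is counted as read once and written once (the post does not say exactly how the bytes were tallied), and the function names, the timing, and the theoretical figure are illustrative placeholders rather than values from the table.

```python
def effective_bandwidth_gbps(dim, bytes_per_texel, seconds):
    """Effective bandwidth in GB/s for one blur pass over a dim^3 texture.

    Assumption: every texel is counted as one read plus one write.
    """
    texels = dim ** 3
    total_bytes = 2 * texels * bytes_per_texel
    return total_bytes / seconds / 1e9


def relative_bandwidth_percent(effective_gbps, theoretical_gbps):
    """Effective bandwidth as a percentage of the spec-sheet bandwidth."""
    return 100.0 * effective_gbps / theoretical_gbps


if __name__ == "__main__":
    # Hypothetical timing of 1 ms per pass; not a measured result.
    for name, bpp in (("GL_R16F", 2), ("GL_R32F", 4)):
        eff = effective_bandwidth_gbps(512, bpp, 1e-3)
        # 1000 GB/s is a placeholder theoretical bandwidth.
        rel = relative_bandwidth_percent(eff, 1000.0)
        print(f"{name}: {eff:.1f} GB/s effective, {rel:.1f}% of theoretical")
```

Under this counting convention a percentage above 100 is possible when large caches absorb part of the traffic, which is consistent with the cache-size remark in the ALT text.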
Post image

I keep saying this but man I really love how effortlessly feminist #Skong is. Sometimes games need to stand up and be like "THIS GAME. RESPECTS. WOMEN." and that's not even a bad thing per se, but there's no posturing in Silksong, it just *does* it.

04.11.2025 02:03 - 👍 2    🔁 1    💬 0    📌 0
DEPRESSION STUDY ADVERTISEMENT ON THE BUS FEATURING SHINJI IN A CHAIR FEELING GUILTY ABOUT JERKIN IT TO A GIRL IN A COMA I THINK

BRO WHAT

01.11.2025 22:23 - 👍 1664    🔁 645    💬 44    📌 59

Science fiction hits the hardest when it tackles real issues like the horrors of IEEE-754 floating point

03.11.2025 04:34 - 👍 3    🔁 0    💬 0    📌 1

I love it when science fiction tackles real issues, like the horrors of IEEE-754 floating point

03.11.2025 04:29 - 👍 0    🔁 0    💬 0    📌 0

Some storyboards from Star Wars: Visions "Black" (Shinya Ohira, David Production).
Full video >> www.youtube.com/watch?v=_Y0i...
Season 3 is now streaming on Disney+.

29.10.2025 07:56 - 👍 154    🔁 50    💬 0    📌 5
A screenshot from "BLACK" from Star Wars: Visions volume 3. Stormtroopers run through a burning battlefield, surrounded by blaster fire, the bodies of their fallen allies, and the scraps of ruined and crashed vehicles.

A screenshot from "BLACK" from Star Wars: Visions volume 3. A stormtrooper lit in shades of green with a broken mask revealing their haggard, bearded face looks down in horror at a similarly battered stormtrooper, albeit this one lit in shades of red.

A screenshot from "BLACK" from Star Wars: Visions volume 3. Two stormtroopers grasp their hands in a firm shake, one lit in shades of green, the other lit in shades of red.

A screenshot from "BLACK" from Star Wars: Visions volume 3. A stormtrooper stands on a snowy planet over the buried corpse of one of their fellow troopers, as they look out on the ruins of a TIE Fighter.

like, holy shit man, what an incredible thing to witness, 13 minutes of smooth jazz and the human condition

30.10.2025 20:28 - 👍 6    🔁 2    💬 1    📌 1
Post image

The final short of Star Wars: Visions’ third volume, Black, is one of the most visually striking, boldly experimental, and unexpectedly affecting things to happen to that galaxy far, far away. A stunning jazz-infused presentation of the death spiral of a doomed stormtrooper from David Production.

29.10.2025 18:18 - 👍 6    🔁 2    💬 0    📌 0
Post image Post image

The integral + code:

29.10.2025 21:52 - 👍 1    🔁 0    💬 0    📌 0
Post image Post image

I got a pretty similar plot for the 2D case!

29.10.2025 21:52 - 👍 1    🔁 0    💬 1    📌 0

I really love that this and silksong were much more direct in their storytelling, with a “voiced” player character that actively participates in NPCs’ relationships and the main story

it adds so much over the silent-protagonist, vaguely gestured grimdark-ness that’s a staple of soulsvania genres

28.10.2025 04:44 - 👍 0    🔁 0    💬 0    📌 0

(I guess the usual approach is to do a global sort/compact step in a separate kernel. That's ideal for coherence, but involves "cutting" the original kernel and spilling intermediate thread state to global memory. I'm wondering if this can be avoided)

26.10.2025 09:23 - 👍 0    🔁 0    💬 0    📌 0

We'd then have a lot of fully-coherent warps, half of which could early-out and (presumably) release their resources to the GPU scheduler

That's ultimately my question: does this intra-block sorting idea actually work? Or is GPU work always scheduled + retired at the level of whole blocks?

26.10.2025 09:23 - 👍 0    🔁 0    💬 1    📌 0

One piece of missing information is how the threads are organized in warps. If each warp has a mix of early- and late-threads, it takes both paths and can't release resources until t2. OTOH, I could run a sort ahead of the branch, but still within the kernel, to put the threads in contiguous warps

26.10.2025 09:16 - 👍 0    🔁 0    💬 1    📌 0
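
The warp-coherence point in this thread can be illustrated with a small CPU-side model. The Python sketch below only counts how many warps in a block are fully early, fully late, or mixed, before and after sorting threads by the branch predicate. It assumes 32-thread warps, a 1024-thread block, and an alternating early/late pattern, all of which are illustrative choices. It does not model the GPU scheduler, so it says nothing about whether resources are released per warp or per block.

```python
WARP_SIZE = 32  # assumption: NVIDIA-style warps of 32 threads


def count_warps(finishes_early, warp_size=WARP_SIZE):
    """Return (all_early, all_late, mixed) warp counts for one block.

    finishes_early[i] is True if thread i takes the early-out path.
    A mixed warp executes both paths and stays busy until the late
    threads finish.
    """
    early = late = mixed = 0
    for start in range(0, len(finishes_early), warp_size):
        warp = finishes_early[start:start + warp_size]
        if all(warp):
            early += 1
        elif not any(warp):
            late += 1
        else:
            mixed += 1
    return early, late, mixed


if __name__ == "__main__":
    # Hypothetical worst case: early and late threads alternate.
    block = [tid % 2 == 0 for tid in range(1024)]
    print("unsorted:", count_warps(block))

    # Stand-in for an intra-block sort: early threads first, late threads last.
    sorted_block = sorted(block, reverse=True)
    print("sorted:  ", count_warps(sorted_block))
```

In this model the sort turns every mixed warp into a coherent one, which is the precondition for the early-out idea. Whether the hardware then starts the next block early is the open question the thread asks.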

A question for GPU peeps: say I launch a kernel with a huge block size, such that only 1 block fits on the SM at a time. But the kernel's execution is also divergent: half the threads finish early (t1) while the others run for max duration (t2). Will the GPU start processing the next block at t1 or t2?

26.10.2025 09:16 - 👍 0    🔁 0    💬 1    📌 0

happy birthday!!

06.10.2025 16:13 - 👍 1    🔁 0    💬 0    📌 0

editing every AI revolt cyberpunk story to be about how white people give chatbots more rights than minorities

05.10.2025 19:45 - 👍 37    🔁 10    💬 0    📌 0
gilbert poses after his pizzas fall on the floor

a Wikipedia editor with the username Vigilantcosmicpenguin explained, "I contacted Brian David Gilbert with a request for freely licensed photographs. He delivered."

05.10.2025 10:12 - 👍 3846    🔁 891    💬 16    📌 14

the stealth star of this week's apocalypse hotel is yoshiaki fujisawa. that was some brian eno shit happening in the soundtrack

17.06.2025 23:55 - 👍 26    🔁 4    💬 2    📌 1

apocalypse fucking hotel

17.06.2025 23:27 - 👍 34    🔁 7    💬 5    📌 0

She 100% does cackle over your corpse when you lose to her. But alas, you were too good at the game :^)

27.09.2025 22:03 - 👍 0    🔁 0    💬 0    📌 0

"No one can play Silksong because its release took down Steam and the Nintendo Store" is really the funniest possible outcome, well done everyone

04.09.2025 14:11 - 👍 56    🔁 3    💬 2    📌 0

currently playing the most difficult metroidvania of them all. lots of enemies. limited mobility upgrades. locked doors everywhere. largely consists of backtracking. it's called employment

04.09.2025 16:28 - 👍 273    🔁 95    💬 4    📌 0
This many points is surely out of scope! · Aras' website

Great article by @aras-p.bsky.social comparing the perf of rasterizing single-pixel triangles vs. compute-shader splatting.

The most fascinating part of this is how much different GPUs differ in comparative perf, and how they respond to different scenarios.

aras-p.info/blog/2025/08...

26.08.2025 18:12 - 👍 6    🔁 3    💬 1    📌 0

congrats to Team Cherry for never having to learn about JIRA

21.08.2025 14:49 - 👍 146    🔁 37    💬 1    📌 11
Post image

SILKSONG SEPTEMBER 4TH BABYYYYYYYYYY

21.08.2025 15:01 - 👍 496    🔁 47    💬 5    📌 2

hope y’all have a great time!

20.08.2025 10:55 - 👍 0    🔁 0    💬 0    📌 0

@turtlespook is following 20 prominent accounts