Ben Sims

@arycama.bsky.social

Senior Programmer. Graphics, C++, C#, Shaders, Unity, Unreal, PC, Console, Mobile, cats. github.com/arycama mastodon.gamedev.place/@Arycama twitter.com/BenSimsTech

975 Followers  |  563 Following  |  43 Posts  |  Joined: 16.08.2023

Posts by Ben Sims (@arycama.bsky.social)

A few other interesting highlights: it uses my improved GTAO with the fix for the darkening issue, plus reworked screen space GI, which helps significantly for foliage, especially after the darkening introduced by GTAO. Finally, it's also using the GT7 HDR tonemapper from a recent presentation.

01.09.2025 14:09 - 👍 0    🔁 0    💬 0    📌 0
Post image Post image Post image

Just some mildly moody screenshots from the thing that I'm trying to make into a thing. Also pictured: rewritten gpu driven foliage, terrain and grass renderers. Writing a device radix sort was fun-ish.
Currently reworking the terrain shader, needs better height blending.

01.09.2025 14:09 - 👍 5    🔁 0    💬 1    📌 0
Post image Post image Post image Post image

Shaders are fun. Sky+lightrays rendered together (1 sample per pixel+temporal, with importance sampling). Clouds+sky integrate with ambient probe and aerial perspective so environment light is cohesive.
End result is quite nice, only change between these pics is sun+view angle

08.05.2025 17:41 - 👍 50    🔁 4    💬 0    📌 0

Final result is an array of input data for visible indices, but ordering of instances and data is preserved, so front to back sorting can be performed/maintained and multiple instance types can be culled/sorted in a single pass. Now to actually implement a parallel radix sort.

26.01.2025 14:28 - 👍 2    🔁 0    💬 0    📌 0
Post image

Simple multipass parallel prefix sum+stream compaction. First pass scans each group of 1024 and writes the scan results plus the max value (the group total) for each group. Pass 2 does a prefix sum on the group sum array to compute a final offset for each group of 1024, which is used as an offset for the final pass's write indices.
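
As a rough illustration of the three passes described above, here is a CPU-side sketch in Python rather than HLSL. It is not the code from the post image: the function name `compact_visible`, the 0/1 `flags` input, and the use of an inclusive scan per group are all assumptions made for the example. Each loop stands in for what would be one compute dispatch.

```python
from itertools import accumulate

GROUP_SIZE = 1024  # matches the group size mentioned in the post


def compact_visible(flags, group_size=GROUP_SIZE):
    """Return the indices of all non-zero flags, preserving their order.

    Hypothetical helper: 'flags' holds 1 for a visible instance, 0 otherwise.
    """
    # Pass 1: inclusive scan inside each group, plus one total per group.
    local_scans = []
    group_totals = []
    for start in range(0, len(flags), group_size):
        scan = list(accumulate(flags[start:start + group_size]))
        local_scans.extend(scan)
        group_totals.append(scan[-1])  # last (max) scan value == group total

    # Pass 2: exclusive scan over the group totals gives each group's offset.
    group_offsets = [0] + list(accumulate(group_totals))[:-1]

    # Pass 3: every visible element writes its index to its final slot.
    output = [0] * sum(group_totals)
    for index, flag in enumerate(flags):
        if flag:
            slot = group_offsets[index // group_size] + local_scans[index] - 1
            output[slot] = index
    return output
```

Because the destination slot only depends on how many visible elements precede an element, the relative order of the input is kept, which is what allows a front-to-back ordering to survive the compaction.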

26.01.2025 14:28 - 👍 12    🔁 0    💬 1    📌 0
Post image

It was difficult to find a simple, concise parallel prefix sum example online, even the GPU gems code was obscure and had bugs.
I wrote a simple one. It only handles a 1024-element array, but can be expanded to larger arrays with multiple passes.
Not the most efficient method, but simple
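
The code in the post image is not reproduced here. One formulation that fits the "simple but not the most efficient" description is the Hillis-Steele scan, sketched below in Python as a stand-in for a single 1024-thread group. Treating the author's version as Hillis-Steele is an assumption, and `inclusive_scan` is a made-up name.

```python
def inclusive_scan(values):
    """Hillis-Steele inclusive prefix sum.

    Each iteration of the while loop models one barrier-separated step in
    which every "thread" i adds the element 'offset' places to its left.
    """
    data = list(values)
    count = len(data)
    offset = 1
    while offset < count:
        previous = data  # read from the old buffer, write a new one
        data = [
            previous[i] + previous[i - offset] if i >= offset else previous[i]
            for i in range(count)
        ]
        offset *= 2
    return data
```

This does O(n log n) additions instead of the O(n) of a work-efficient scan, but for 1024 elements that is only 10 steps, and the logic is easy to check by eye.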

24.01.2025 18:56 - 👍 18    🔁 0    💬 1    📌 0

At the end of the line above, I mask and bit shift each component by the required offset, so once they are bit or'd together, they are all occupying their own bits in the final integer.

24.01.2025 01:48 - 👍 1    🔁 0    💬 0    📌 0
Post image

This was the code I was previously using, which I found online. It uses intrinsics to convert to fp16 and then does some extra bit shifting. It didn't quite seem precise enough for extents as objects would disappear towards the edges of the screen.
Might be slightly faster though

23.01.2025 15:18 - 👍 0    🔁 0    💬 0    📌 0

I use this for storing compact bounds data for GPU-driven culling. I store world-space bounds as a float4, with .xyz being the world space bounds center, and .w being the bounds extents packed into an R11G11B10. It's unsigned and precision isn't super important, so it works well.

23.01.2025 15:18 - 👍 1    🔁 0    💬 1    📌 0
Post image

Optimised (I think) HLSL for converting float3 to and from R11G11B10. Input values must be in range; it doesn't handle negatives/denorms/NaN/inf.
Handy for packing positive float3 data into a single uint.
Convert to+from is 21 DXC instructions.
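
The HLSL itself is only in the post image, so the following is a Python sketch of the general idea rather than the author's code. It assumes the usual R11G11B10 float layout (R in bits 0-10 and G in bits 11-21 with a 5-bit exponent and 6-bit mantissa each, B in bits 22-31 with a 5-bit exponent and 5-bit mantissa), truncates the mantissa instead of rounding, and only accepts positive normal values from roughly 2^-14 up to about 65000. Zero, negatives, denorms, NaN and inf are not handled. All function names are made up for the example.

```python
import struct

# Difference between the float32 exponent bias (127) and the 5-bit bias (15),
# positioned at the float32 exponent field.
EXPONENT_REBIAS = (127 - 15) << 23


def _float_to_bits(value):
    return struct.unpack("<I", struct.pack("<f", value))[0]


def _bits_to_float(bits):
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def _encode(value, mantissa_bits):
    # Rebias the exponent, drop the low mantissa bits, then mask to the
    # 5 exponent bits plus the kept mantissa bits.
    bits = _float_to_bits(value) - EXPONENT_REBIAS
    return (bits >> (23 - mantissa_bits)) & ((1 << (5 + mantissa_bits)) - 1)


def _decode(packed, mantissa_bits):
    return _bits_to_float((packed << (23 - mantissa_bits)) + EXPONENT_REBIAS)


def pack_r11g11b10(r, g, b):
    # Each component is shifted so it occupies its own bits once OR'd together.
    return _encode(r, 6) | (_encode(g, 6) << 11) | (_encode(b, 5) << 22)


def unpack_r11g11b10(packed):
    return (
        _decode(packed & 0x7FF, 6),
        _decode((packed >> 11) & 0x7FF, 6),
        _decode((packed >> 22) & 0x3FF, 5),
    )
```

With truncation the decoded value is never larger than the input, and the relative error stays below 1/64 for R and G and below 1/32 for B.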

23.01.2025 15:18 - 👍 11    🔁 0    💬 3    📌 0
Visual Studio Submodules and Unity Git Packages
I’ve been using Unity git packages to develop a few different libraries alongside my game projects, such as my Custom Render Pipeline. While the setup isn’t too complicated, and Unity will automatical...

I wrote a blog post about using #Unity packages, git submodules and Visual Studio together. Would be really cool if this kind of thing just worked out of the box (Eg add package as git submodule, and that's it), but oh well, this kind of works!

arycama.github.io/2024/12/21/v...

21.12.2024 16:19 - 👍 9    🔁 1    💬 0    📌 0

I think I'm ready to start writing more graphics/shader blogs.

I have many complex subjects I'd love to delve into, but what are some graphics/shader related techniques/topics you would be interested in knowing more about?

These won't be full tutorials though, some prior knowledge will be assumed

26.11.2024 11:57 - 👍 19    🔁 0    💬 4    📌 0

Let content creators have control of storage and monetisation. They could self host with no ads, or pay for cloud storage, and either use ads or just fund it themselves. No reason to pick a one size fits all solution. As long as the end user can browse all channels in one place it should work.

08.11.2024 07:24 - 👍 4    🔁 0    💬 0    📌 0
Post image

Being part of a wall of death while Periphery played Blood Eagle was certainly an interesting way to spend a Thursday night lol

10/10 background art too

07.11.2024 11:29 - 👍 3    🔁 0    💬 1    📌 0
Post image

Code isn't the nicest, but GPUs don't mind too much these days. 😅

Some terms could be pre-calculated if needed.

02.11.2024 17:55 - 👍 2    🔁 0    💬 0    📌 0
Post image Post image

I adapted it for use in my above-water importance sampling (Which previously did not account for light angle or increased attenuation with depth) and the result is less variance for the same sampling cost. (A bit more ALU to calculate distance+pdf though ofc)

Both images are 1 sample per pixel.

02.11.2024 17:53 - 👍 3    🔁 0    💬 1    📌 0

I inverted it btw, the equation is quite cursed, but it works. Had to re-derive using a normalized CDF so that a random number from 0-1 will always map to a valid value.

Now I can importance sample underwater scenes nicely, handy as a base for more complex effects!
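
The actual inverted equation is behind the truncated Desmos link, so the sketch below is only a simplified stand-in for the idea of inverting a normalized CDF. It assumes the in-scattered light along a view ray of length `x` falls off as a single exponential `exp(-k*t)`, with `k = c*(b - a + l*x)/(l*x)` built from the thread's symbols (viewer depth `a`, target depth `b`, `l = dot(lightDir, up)`, extinction `c`). That rate, and the function names, are assumptions for illustration.

```python
import math


def ray_extinction_rate(a, b, l, c, x):
    # Assumed combined falloff rate along the view ray (see lead-in).
    return c * (b - a + l * x) / (l * x)


def sample_distance(u, a, b, l, c, x):
    """Map a uniform random number u in [0, 1] to a distance t in [0, x].

    Returns (t, pdf). Dividing by 'norm' normalizes the CDF over [0, x],
    so every u maps to a valid distance on the ray.
    """
    k = ray_extinction_rate(a, b, l, c, x)
    if abs(k) < 1e-8:
        return u * x, 1.0 / x  # falloff is flat: sample uniformly
    norm = -math.expm1(-k * x)          # 1 - exp(-k * x)
    t = -math.log1p(-u * norm) / k      # inverse of the normalized CDF
    pdf = k * math.exp(-k * t) / norm
    return t, pdf
```

A renderer would evaluate the lighting at distance `t` and divide by `pdf`; without the normalization, values of `u` close to 1 would map past the end of the ray.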

www.desmos.com/calculator/b...

02.11.2024 17:47 - 👍 14    🔁 1    💬 2    📌 0
Post image Post image

Thanks! Your response made me think of a different way to write the integral, and I plugged that in and got a closed form solution: lx(exp(-ca/l)-exp(-c(lx+b)/l))/(c(b+lx-a))

(Left pic is numerical integration, right is analytical. Also handles large distances (eg far plane))
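
A quick way to sanity-check the closed form is to compare it against brute-force integration. The integrand used below is a reconstruction, not taken from the post: it assumes light travels `depth/l` through the water to reach a point on the view ray and then `t` along the ray to the viewer, with depth varying linearly from `a` to `b` over the ray length `x`. Integrating that does reproduce the quoted expression, but the truncated Desmos link means the original setup cannot be confirmed. Function names are made up.

```python
import math


def inscatter_analytic(a, b, l, c, x):
    # The closed form quoted in the post, transcribed term by term.
    return (
        l * x * (math.exp(-c * a / l) - math.exp(-c * (l * x + b) / l))
        / (c * (b + l * x - a))
    )


def inscatter_numeric(a, b, l, c, x, steps=20000):
    # Midpoint-rule integration of the assumed integrand along the view ray.
    dt = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        depth = a + (b - a) * t / x          # depth of the sample point
        total += math.exp(-c * (depth / l + t)) * dt
    return total
```

For example, with `a=2`, `b=5`, `l=0.8`, `c=0.3`, `x=6` the two functions agree to well within the integration error.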

Now to invert it 😅

01.11.2024 08:57 - 👍 9    🔁 1    💬 1    📌 1

Interesting, will take a look! I tried GPT as a last resort lol but the equations it gave me did not match the integral. (They were close in some spots but far off in others, especially with certain parameters). It even acknowledged it was wrong and tried to blame numerical integration accuracy lol.

01.11.2024 05:07 - 👍 0    🔁 0    💬 1    📌 0

As a side-note, I wish there was some way to use Mathematica for one-off solutions.. like "Pay to use for one hour" or "Pay for a certain number of equations" or something. It's really useful when I need it, but it's not common enough to justify a license.

01.11.2024 04:22 - 👍 0    🔁 0    💬 1    📌 0

used for importance sampling. (Eg you calculate shadows, proper depth, or shoot a ray from sample point if using raytracing)

Have run into this problem with sky rendering too; however, an analytical solution does not exist there, so I had to generate a lookup table. Could do the same here if needed.

01.11.2024 04:21 - 👍 0    🔁 0    💬 1    📌 0

This gives the amount of "luminance" from a light source over the path from viewer to pixel underwater, assuming the water surface is a flat plane and water density is constant. (Both are reasonable assumptions for calm water)

Analytical solution can be used directly for rendering, or inverted and

01.11.2024 04:20 - 👍 0    🔁 0    💬 1    📌 0
Underwater Extinction

Equation for underwater rendering I'm trying to solve. Assume viewer is at depth "A", target pixel is at depth "B", "L" is dot(lightDir, up), and "C" is the extinction coefficient. Wolfram doesn't give an analytical solution and I have no Mathematica license. How to solve?
www.desmos.com/calculator/s...

01.11.2024 04:19 - 👍 4    🔁 1    💬 2    📌 0

Haha, yeah that's kind of how it feels. Though tbh I spent the last 6 months doing house repairs+renovations and hobby-dev was a bit less of a priority, but now that's all done, so hoping to get back into it a bit more, and hopefully try to write down some of my findings for others to learn from!

25.10.2024 15:46 - 👍 3    🔁 0    💬 0    📌 0

Not yet. Been meaning to find more time to finish setting it up, maybe this weekend. Anything in particular you'd be interested in? Was thinking about writing up some info about imposter baking+shading that I haven't been able to find much info on. (Most info is very surface level)

25.10.2024 01:03 - 👍 0    🔁 0    💬 2    📌 0

Yep, due to a bug in my RNG, red and blue ended up producing identical values.

25.10.2024 01:01 - 👍 1    🔁 0    💬 0    📌 0
Post image

I made the hash functions to be able to easily hash from a 1D/2D/3D/4D seed value. But it seems I mixed them up with calculating 4 different hash values from a float4. I decided to rename the existing ones like so, for less ambiguity. (And now the compiler will give me nice vector truncation warnings)

24.10.2024 11:50 - 👍 2    🔁 0    💬 0    📌 0

My PcgHash function was taking a float4, but returning a single float. It was supposed to take 4 floats and do 4 separate hashes instead of combining them into one. This was then being passed to the GaussianFloat4 which was attempting to make 4 Gaussian numbers from a single identical seed.
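
To make the distinction concrete, here is a small Python sketch of the two behaviours. It uses the widely circulated 32-bit PCG hash constants with integer seeds; the author's `PcgHash` works on floats in HLSL and its exact body is not shown in the thread, so the function names and the way the buggy variant combines its inputs are assumptions.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wraparound


def pcg_hash(value):
    """32-bit PCG hash of a single unsigned integer seed."""
    state = (value * 747796405 + 2891336453) & MASK32
    word = (((state >> ((state >> 28) + 4)) ^ state) * 277803737) & MASK32
    return ((word >> 22) ^ word) & MASK32


def pcg_hash_combined(seeds):
    # Shape of the bug described above: four seeds collapse into ONE hash,
    # so anything derived from it shares the same randomness.
    combined = 0
    for seed in seeds:
        combined = pcg_hash((combined + seed) & MASK32)
    return combined


def pcg_hash_per_component(seeds):
    # Intended behaviour: one independent hash per component.
    return tuple(pcg_hash(seed) for seed in seeds)
```

Feeding `pcg_hash_per_component` into a four-component Gaussian generator gives each channel its own seed, whereas the combined variant hands every channel the same value.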

24.10.2024 11:49 - 👍 2    🔁 0    💬 1    📌 0

An important takeaway is that double checking all your inputs can be an easy way to discover a bug. At first I didn't realize that all my gaussian X and Z components were identical. Outputting the data as a texture made it very easy to detect though, and from there, easy to fix.

24.10.2024 10:58 - 👍 2    🔁 0    💬 0    📌 0

One thing I don't understand is that I use this for a lot of procedural terrain logic.. and it looks mostly fine? (There are a few minor square-like issues which I assume are due to this though)

Well, at least that will now be improved!

24.10.2024 10:51 - 👍 1    🔁 0    💬 1    📌 0