
Christoph Lutz

@christophlutz.bsky.social

My drinking club has a skydiving problem

137 Followers  |  82 Following  |  154 Posts  |  Joined: 29.10.2024  |  1.8147

Latest posts by christophlutz.bsky.social on Bluesky


Weekend fun: snooping inter-process messaging (ksbasend) in Oracle with bpftrace. 🤓

t.ly/_yDy7

31.10.2025 20:06 | 👍 1    🔁 0    💬 0    📌 0

LinkPro Linux Rootkit Uses eBPF to Hide and Activates via Magic TCP Packets

Synacktiv uncovered LinkPro, a Golang rootkit using eBPF hide/knock modules activated by TCP window 54321.

thehackernews.com/2025/10/link...

20.10.2025 10:22 | 👍 0    🔁 0    💬 0    📌 0

This, so much! 👇

16.10.2025 14:17 | 👍 3    🔁 0    💬 0    📌 0

Unraveling eBPF Ring Buffers

The goal of this post is to provide an in-depth discussion of BPF ring buffers, covering their internals, including memory allocation, user-space mapping, locking mechanisms, and efficient data sharin...

www.deep-kondah.com/deep-dive-in...

11.10.2025 14:46 | 👍 1    🔁 0    💬 0    📌 0

When you plan to geek out over some Oracle internals, but end up ftrace'ing BPF the entire weekend to chase a funny bug that only occurs on Exadata with capacity on demand ...

05.10.2025 17:30 | 👍 1    🔁 0    💬 0    📌 0

There's a thread on the oracle-l list server at present about an insert taking far too much time (and CPU). It's a known issue, and there are 47 statistics in v$sysstat (19.11) with names like 'ASSM%' to help diagnose it.

How many do you think are described in the database reference manual?

None.

28.09.2025 21:31 | 👍 4    🔁 1    💬 0    📌 0

Interesting little detail: every loop iteration executes the "pause" instruction, which hints to the CPU that the code is in a spin-wait loop. This lets the CPU improve spin-loop efficiency and reduce power consumption.

Observed on 19.26, running on Exadata X10, version 25.1.7.

21.09.2025 12:10 | 👍 0    🔁 0    💬 0    📌 0

Yet another adaptive LGWR optimization: on Exadata X10+, pipelined log writes may defer redo writes until a suitably sized write batch has accumulated in the log buffer.

The deferral can involve spinning in a tight loop for up to 25 iterations (the maximum is hard-coded in kcrfw_defer_write).
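
A rough sketch of what a bounded spin of this kind looks like, in Python for readability. Only the limit of 25 comes from the post; the function name `spin_until`, the `batch_ready` callback, and the return value are made up for illustration, and what Oracle does once the limit is reached is not covered here. Native code would execute the x86 `pause` instruction on each iteration, which Python cannot express.

```python
from typing import Callable, Tuple

MAX_SPINS = 25  # limit mentioned in the post; everything else here is hypothetical


def spin_until(batch_ready: Callable[[], bool],
               max_spins: int = MAX_SPINS) -> Tuple[bool, int]:
    """Poll batch_ready() at most max_spins times.

    Returns (ready, failed_polls): whether the condition became true,
    and how many polls failed before giving up or succeeding.
    """
    for failed_polls in range(max_spins):
        if batch_ready():
            return True, failed_polls
        # Native code would issue a `pause` hint here before the next poll.
    return False, max_spins


# Example: the batch becomes large enough on the fourth poll.
polls = iter([False, False, False, True])
print(spin_until(lambda: next(polls)))
```

The point of the hard cap is that the deferral stays bounded: the caller never waits longer than a fixed number of polls for the batch to grow.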

21.09.2025 12:09 | 👍 2    🔁 0    💬 1    📌 0

Nested loops, baby 😜

18.09.2025 08:40 | 👍 1    🔁 0    💬 0    📌 0
Post image

13.09.2025 11:27 | 👍 5    🔁 0    💬 0    📌 0

SOUGDay 2025 Zürich - registration open at soug.ch

Over the next few days we will not only publish the agenda; we are also planning a special for you on the day after SOUG Day! Take a look and register at soug.ch.

10.09.2025 11:26 | 👍 3    🔁 3    💬 0    📌 0

So glad that all new features are documented so well... NOT 😜

Manually enabling and disabling the adaptive LGWR evaluation trace for pipelined / overlapped redo writes:

04.09.2025 16:35 | 👍 0    🔁 0    💬 0    📌 0

POUG journey started… not even at the airport and Lufthansa's delay notification leaves no hope of making the connecting flight in MUC 🙈

03.09.2025 06:50 | 👍 3    🔁 0    💬 0    📌 0

It's still there, but it seems unused, meaning there are no direct calls to it from other functions (not sure about indirect calls, but I doubt it).
The reason for checking was the updexe code path (in version 23.6), where errors are now signalled by kseseclv when interesting things happen. 😀

01.09.2025 08:06 | 👍 1    🔁 0    💬 0    📌 0

So 23ai replaces ksesecl0(func, loc, err) with kseseclv(err, func, loc, ...) ... Why is it always just a few days before POUG that these kinds of low-level discoveries surface? 🙄

31.08.2025 12:18 | 👍 3    🔁 0    💬 2    📌 0

Wouldn't a bottle opener be more appropriate for POUG? 😉

31.08.2025 07:18 | 👍 2    🔁 0    💬 1    📌 0

A new tool in the 0x.tools family:

xtop - Top for Wall-Clock Time. It uses eBPF/xcapture v3 and gives you "x-ray vision" into Linux system activity.

It will be available next Tuesday, 19 Aug, at 1pm EDT, when I'll also run a live demo webinar!

tanelpoder.com/posts/xtop-t...

13.08.2025 05:23 | 👍 19    🔁 9    💬 1    📌 0

"Slide n of 142".. this is getting out of control... 🙈

12.08.2025 19:57 | 👍 1    🔁 0    💬 0    📌 0

Very nice! We're planning to do that as well in some environments. What were the numbers before the change?

12.08.2025 19:34 | 👍 0    🔁 0    💬 1    📌 0

How many nodes does it have now?

07.08.2025 19:11 | 👍 0    🔁 0    💬 1    📌 0

Hell yeah... but then Oracle gives me more, and I even get paid for it 😂

06.08.2025 17:09 | 👍 3    🔁 0    💬 0    📌 0

Today's discovery (19.26): Oracle derives different log parallelism defaults, depending on platform. Max number of public redo strands is:

Exadata:
CPU_COUNT <= 256: 16
CPU_COUNT > 256: CPU_COUNT/16

Non-Exadata:
CPU_COUNT <= 32: 2
CPU_COUNT > 32: CPU_COUNT/16

Max limit: 256
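
The rules above can be written down as a small function. This is only a restatement of the numbers in the post, not Oracle's code: the function name is invented, and integer (floor) division for CPU_COUNT/16 is an assumption, since the post does not say how fractions are handled.

```python
MAX_STRANDS = 256  # overall cap stated in the post


def max_public_redo_strands(cpu_count: int, exadata: bool) -> int:
    """Max number of public redo strands per the observed 19.26 defaults.

    Assumption: CPU_COUNT/16 is truncated to an integer.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be positive")
    # (threshold, fixed value at or below the threshold) per platform
    threshold, fixed = (256, 16) if exadata else (32, 2)
    if cpu_count <= threshold:
        return fixed
    return min(cpu_count // 16, MAX_STRANDS)


for cpus in (16, 64, 256, 512):
    print(cpus,
          max_public_redo_strands(cpus, exadata=True),
          max_public_redo_strands(cpus, exadata=False))
```

Note that the two platforms only differ at or below 256 CPUs; above that, both follow CPU_COUNT/16 up to the cap.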

06.08.2025 17:07 | 👍 0    🔁 0    💬 0    📌 0

Enough adrenaline for the day?

06.08.2025 12:51 | 👍 1    🔁 0    💬 2    📌 0

Things are confusing: in many places, timing information comes from sltrgftime64, which on Linux is a wrapper around clock_gettime with ns resolution, but the results get normalized to µs.

I guess not all platforms had high-resolution timers and that's part of the reason why things are a bit messy.
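
To make the ns-to-µs normalization concrete, here is a tiny Python sketch of the same pattern. It is not a reconstruction of sltrgftime64: the choice of CLOCK_MONOTONIC and truncating (rather than rounding) division are both assumptions, and the helper names are invented.

```python
import time


def ns_to_us(ns: int) -> int:
    """Normalize nanoseconds to microseconds (truncating; an assumption)."""
    return ns // 1_000


def now_us(clock_id: int = time.CLOCK_MONOTONIC) -> int:
    """Read a nanosecond-resolution clock and return microseconds.

    The clock id is an assumption; the post does not say which clock is used.
    """
    return ns_to_us(time.clock_gettime_ns(clock_id))


start = now_us()
time.sleep(0.001)
print("elapsed (us):", now_us() - start)
```

Anything that later labels such a value "ms" is off by a factor of 1000, which is easy to miss when the raw numbers look plausible.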

06.08.2025 07:26 | 👍 2    🔁 0    💬 0    📌 0

Yeah, I can imagine… but that click moment when it finally makes sense? Pure magic, no? 😊

05.08.2025 20:58 | 👍 1    🔁 0    💬 0    📌 0

I spent hours looking for milliseconds 🙈

05.08.2025 20:40 | 👍 0    🔁 0    💬 1    📌 0

Oh my, looks like the "ms" in "kso_sched_delay_avg_ms" actually means "microseconds" ... 🤷‍♂️

05.08.2025 19:46 | 👍 3    🔁 0    💬 1    📌 0

6/6
As always, bpftrace is very useful for observing and studying undocumented behavior:

Trace write info array updates (LGWR/LGnn): t.ly/VV--a
Trace write info array scans (FG): t.ly/67R-J

04.08.2025 08:02 | 👍 0    🔁 0    💬 0    📌 0

5/6
The "redo synch time overhead" is used to calculate the "redo synch long waits" statistic, a key metric the Adaptive Log File Sync (ALFS) mechanism uses to decide whether to perform a mode switch from post/wait to polling (more on that another day ...)

04.08.2025 08:02 | 👍 0    🔁 0    💬 1    📌 0

4/6
Concurrent updates to the write info array are protected by the "log write info" latch, which LGWR and LG workers acquire in no-wait mode. If they fail to acquire the latch, they skip the write info array update and proceed without recording the redo write completion time.
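
The no-wait pattern described here (try the latch once, skip the update on failure) can be sketched with a plain Python lock. The class, its fields, and the slot layout are hypothetical stand-ins; only the "try once, skip if busy" behavior is taken from the post.

```python
import threading
from typing import List, Optional


class WriteInfoArray:
    """Hypothetical stand-in for the write info array and its latch."""

    def __init__(self, slots: int = 8) -> None:
        self._latch = threading.Lock()  # plays the role of "log write info"
        self.slots: List[Optional[int]] = [None] * slots

    def record_completion(self, slot: int, completed_us: int) -> bool:
        """Try to record a redo write completion time without waiting.

        Returns False (and records nothing) if the latch is busy.
        """
        if not self._latch.acquire(blocking=False):  # no-wait get
            return False  # skip the update and carry on
        try:
            self.slots[slot] = completed_us
            return True
        finally:
            self._latch.release()


info = WriteInfoArray()
print(info.record_completion(0, 1_000))  # latch free: update recorded
```

The trade-off is that the writer never stalls on the latch, at the cost of occasionally missing a completion time.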

04.08.2025 08:02 | 👍 0    🔁 0    💬 1    📌 0
