From Nitro to Junction: Testing in Production at Scale
During my time at AWS, I learned that even the most rigorous pre-production testing has limits. I share how we built Nitro's reliability by treating production as part of the test loop—with proper saf...
At AWS, I led the team that built EC2's Nitro virtualization stack: C code deployed to 500K servers. Biggest lesson? Pre-production testing has limits. You must embrace safe production testing.
My new blog post explains how Nitro did it, and how Junction brings this to any Kubernetes team.
20.05.2025 15:44
DNS and the December 2024 OpenAI Outage
In December 2024, OpenAI faced a significant outage lasting approximately four hours. This incident highlighted a critical challenge in container orchestration: maintaining reliable service discovery ...
In my new blog post for Junction Labs, I explore service discovery through the December 2024 OpenAI outage. I analyze the root cause and discuss a principle we learned while building EC2 networking: static stability. Check out the full post here: www.junctionlabs.io/blog/dns-and...
28.01.2025 16:48
Associate Professor at @cst.cam.ac.uk, researching decentralised systems and security protocols. Advisor to the Bluesky team. Wrote “Designing Data-Intensive Applications” (O’Reilly). he/him
Software Engineering and Product Leader
Databases and distributed systems. Working at Datadog on Metrics.
https://artem.krylysov.com/
CEO Stanza Systems. Speaking in personal capacity. Author/instigator SRE books, Reliable Machine Learning, History of the Irish Internet. Photography at http://www.edge-cases.photos
networking nerd | dog enthusiast | forty-something.
no DMs plz.
Write at lethain.com. Author of An Elegant Puzzle, Staff Engineer, and An Engineering Executive’s Primer. Worked some places.
Founder and CEO at Nile (https://www.thenile.dev/). Previously: VP of Engineering at Confluent. Believe to win.
Startup founder. Infrastructure engineer.
I know stuff about DNS and most of my networking knowledge is L4 and above.
I like distributing systems.
I also like going to the mountains.
Freelance data engineer @ bitsondisk.com , based in Los Angeles. Former world heavyweight sorting champion.
🏳️‍🌈 san francisco, calif.
👨‍💻 apache cassandra @
✊ alice lgbtq dems + victory fund
🙇‍♂️ all views personal
I love to take photos and enjoy traveling, art, and history. I spend many hours supporting my local community in the Russian River, CA. #Dachshund #RioNido #books #russianriver
Container Networking @ Microsoft | Cilium CNCF Maintainer | Previously @ Datadog
Followed by Jerry Chen. Unfollowed by most others. 2020 Posters to not look out for top 100 candidate.
Software Developer & Technologist
Equal Experts, ex-Thoughtworks, ex-VMware
Louisville / San Francisco
(he/him)
nyc, software eng, early datadoghq.com, turntable.fm et al, he/him, read my blog at jmoiron.net
Building @ocuroot.com, open source, distributed CI/CD with no YAML! Posts bi-weekly on https://thefridaydeploy.substack.com/, occasional public speaker.
Serverless, databases, and serverless databases at AWS. Views my own.
Check out my blog: https://brooker.co.za/blog/
cofounder/CTO @honeycombio, co-author of Observability Engineering and Database Reliability Engineering. I test in production and so do you. 🐝🏳️‍🌈🦄