This operator lets you declaratively manage Grafana instances, dashboards, data sources, and alerts using CRDs in Kubernetes
➜ https://ku.bz/j31586sqq
@kube.builders.bsky.social
News and links on building infrastructure and Kubernetes clusters, curated by the @Learnk8s.io team. More K8s news, events, jobs → https://kube.today
Kubernetes troubleshooting got you down?
Visit Komodor at KubeCon Atlanta (booth 721) for their auto-remediation platform 🚀
Get Transformers stickers, tees & enter daily raffles. Plus tickets to their legendary after-party!
https://ku.bz/s7Z6RDPld
This case study describes how the team migrated the Always service (1.5M DAU, 3M MAU) from EC2 to Amazon EKS in ~2 months with zero downtime, using a GitOps Bridge, gradual traffic cutover with Route53 weighted routing, and Helm + ArgoCD workflows
➤ https://ku.bz/-WBpJ-V9Q
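The arithmetic behind a gradual weighted cutover is simple: with Route 53 weighted routing, each record receives a share of traffic equal to its weight divided by the sum of the weights of all records with the same name and type. The sketch below only illustrates that arithmetic. The record labels (`ec2`, `eks`) and the step percentages are assumptions, not the schedule used in the case study.

```python
def traffic_share(weights: dict[str, int]) -> dict[str, float]:
    """Return the fraction of traffic each weighted record would receive."""
    total = sum(weights.values())
    if total == 0:
        # The all-zero case is deliberately not modelled in this sketch.
        raise ValueError("at least one record needs a non-zero weight")
    return {name: weight / total for name, weight in weights.items()}


def cutover_steps(eks_percentages: list[int]) -> list[dict[str, int]]:
    """Build one weight pair per step, shifting traffic from EC2 to EKS.

    The labels "ec2" and "eks" are hypothetical names for the two records.
    """
    return [{"ec2": 100 - pct, "eks": pct} for pct in eks_percentages]


if __name__ == "__main__":
    # Hypothetical schedule: 1% canary, then 10%, 50%, and full cutover.
    for weights in cutover_steps([1, 10, 50, 100]):
        shares = traffic_share(weights)
        print(weights, {name: f"{share:.0%}" for name, share in shares.items()})
```

Keeping the two weights summing to 100 makes each weight read directly as a percentage, which is a convenience of this sketch rather than a Route 53 requirement.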
This case study describes how a team migrated 30+ Kubernetes clusters from Sceptre/CDK/CloudFormation to Terraform, using iterative waves, custom import tooling, and CI-driven validation to minimize risk
➤ https://ku.bz/VdnDGhggc
This case study shows how to improve Kubernetes node and pod startup times by caching images, enabling parallel pulls, and temporarily increasing EBS volume throughput
➜ https://ku.bz/pKLLLp16z
🗣️ Rafael Brito, Principal Engineer at CloudBolt, explains how Dynamic Resource Allocation (DRA) fundamentally changes specialized hardware management and addresses GPU idle time optimization
Watch the full interview: https://ku.bz/nZKfwy9q5
This tool offers a dashboard + agent setup to collect metrics from Kubernetes clusters and visualize them in real time, while also managing workloads across multiple clusters
➜ https://ku.bz/8yJPY7NxB
This project implements an open-source registry server conforming to the OCI Distribution spec for packing, storing, and serving container images and content
➤ https://ku.bz/hWJkZxCQ1
Nydus implements a chunked, content-addressable file system over the RAFS format to speed up container startup and reduce bandwidth
It works with containerd, Kubernetes, eStargz, etc. and supports:
- lazy pulling
- deduplication
- POSIX compatibility
➜ https://ku.bz/zWfGc5R6P
https://res.cloudinary.com/learnk8s/image/upload/v1761657066/linkedin-155_cx6bkz.png
This week on the Learn Kubernetes Weekly:
✅ Scale Real-Time Video
⚠️ 7 K8s Anti-Patterns in Prod
🧠 Leases & Leader Election
⚖️ Cost & Resilience in Scheduling
🚦 Pod Priority & Preemption
⭐️ @yaml.games
Read it now: https://kube.today/issues/155
https://res.cloudinary.com/learnk8s/image/upload/v1761655294/ctf-kc_kos5ab.png
⚡ Platform Engineering CTF - KubeCon Atlanta
90min to build a production K8s IDP
Teams of 4 | Live leaderboard
🗓️ Nov 10, 12:30 PM
📍Terminus 330
Top prize: Jetson Orin Nano + $2.5K Akamai credits
https://ku.bz/k8s-idp-ctf
😱 100 spots only
This article walks through which resource metrics (CPU, memory, node utilization, idle resources, PVCs) you should monitor to optimize Kubernetes costs and spot inefficiencies early
➜ https://ku.bz/RKvQcZ1L2
🗣️ Andrew Jeffree from SafetyCulture walks through their complete migration of 250+ microservices from a fragile Helm-based setup to GitOps with ArgoCD, all without any downtime
https://ku.bz/Xvyp1_Qcv
🌟 Testkube
🎙 Bart
This is a Terraform module that deploys a production-ready Kubernetes cluster on Hetzner Cloud using Talos OS, with built-in bootstrapping, HA, metrics, and networking
➜ https://ku.bz/kl6z436y_
https://res.cloudinary.com/learnk8s/image/upload/v1761570394/kubecon-games_daywgq.png
I designed two things for KubeCon Atlanta that I'm proud of:
👾 @yaml.games: 10-min quiz rounds. Same format as our Advanced K8s workshop yaml.games
🔨 Platform Engineering Challenge: Teams of 4 build a platform in 90 mins ku.bz/-Rz3DBccC
Koreo is a platform engineering toolkit for Kubernetes that simplifies configuration management and resource orchestration
It enables programmable workflows and dynamic resource materialization to automate complex processes
➤ https://ku.bz/TpyynNht7
This tutorial shows how to run Kubernetes locally using kind: build, load, and deploy an app, mount volumes, use sidecars, and test DNS and scaling operations
➤ https://ku.bz/P2KDpwSWp
https://img.youtube.com/vi/4wFSdNW-p4Q/hqdefault.jpg
inlets-operator is a tool that provides public TCP LoadBalancers for local Kubernetes clusters, enabling ingress to internal services without managed Kubernetes engine limitations
➤ https://ku.bz/Cn8HJr43C
This article explains how to configure kubelet resource reservations, eviction thresholds, and graceful shutdown settings to improve Kubernetes node stability and prevent crashes
➤ https://ku.bz/2CPZ9HD8G
This article describes how a pod got stuck in `Pending` because the StatefulSet requested ReadWriteMany, but AWS EBS only supports ReadWriteOnce
It shows how deleting the bad PVC and switching to ReadWriteOnce fixed everything
➜ https://ku.bz/Zg29dRHx4
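The mismatch described above can be caught before a pod ever reaches `Pending` by comparing the access modes a claim requests with what the storage backend offers. This is a minimal sketch, not tooling from the article: the `aws-ebs` key and the lookup table are hypothetical, and the table lists only the mode the post mentions.

```python
# Hypothetical lookup table; it lists only the mode named in the post above.
SUPPORTED_ACCESS_MODES = {
    "aws-ebs": {"ReadWriteOnce"},
}


def unsupported_modes(requested: list[str], backend: str) -> list[str]:
    """Return the requested access modes that the backend does not offer."""
    supported = SUPPORTED_ACCESS_MODES[backend]
    return sorted(set(requested) - supported)


if __name__ == "__main__":
    # A claim asking for ReadWriteMany on EBS is flagged as unsupported.
    print(unsupported_modes(["ReadWriteMany"], "aws-ebs"))
    # Switching the claim to ReadWriteOnce leaves nothing to flag.
    print(unsupported_modes(["ReadWriteOnce"], "aws-ebs"))
```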
Kubernetes core maintainer Antonio Ojea keynotes Nov 10 at KubeCon NA on Dynamic Resource Allocation—finally standardizing low-level networking for AI workloads
Meet him at Google Booth #200 | Full interview: https://ku.bz/ysYqxwgWN
Kubech is a tool that allows users to set kubectl contexts and namespaces per shell or terminal, enabling the simultaneous management of multiple Kubernetes clusters
➤ https://ku.bz/hpMs2-t6G
pvc-autoresizer auto-expands PVCs when free space or inodes drop below thresholds, using Prometheus-sourced kubelet metrics
It updates PVC size via CSI volume expansion, honors limits via annotations, and supports group-based initial sizing
➜ https://ku.bz/xwnlC772q
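The threshold-and-limit behaviour described above can be sketched as a small decision function. This is an illustration of the idea only, not pvc-autoresizer's actual code: the function name, the default percentages, and the parameters are assumptions, and inode-based triggers are not modelled.

```python
from typing import Optional


def next_size(
    capacity_bytes: int,
    available_bytes: int,
    threshold_pct: int = 10,  # assumed default, not taken from the project
    increase_pct: int = 10,   # assumed default, not taken from the project
    limit_bytes: Optional[int] = None,
) -> Optional[int]:
    """Return the new requested size in bytes, or None if no resize is needed."""
    # Enough free space: available / capacity >= threshold (integer arithmetic).
    if available_bytes * 100 >= threshold_pct * capacity_bytes:
        return None
    new_size = capacity_bytes + capacity_bytes * increase_pct // 100
    if limit_bytes is not None:
        # Never grow past the configured upper bound.
        new_size = min(new_size, limit_bytes)
    # A volume can only be expanded, so an equal or smaller size means no-op.
    return new_size if new_size > capacity_bytes else None


if __name__ == "__main__":
    GiB = 1024**3
    # 5% free on a 10 GiB volume is below the 10% threshold, so it grows.
    print(next_size(10 * GiB, GiB // 2))
    # 50% free is above the threshold, so nothing happens.
    print(next_size(10 * GiB, 5 * GiB))
```

Returning `None` when the limit has already been reached keeps the caller from issuing a resize request that would change nothing.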
🗣️ Alex Arnell explains his data visualization and telemetry approach in Kubernetes. He emphasizes that effective visualization requires consistent data attributes and OpenTelemetry's semantic conventions
Watch: https://ku.bz/Lsr8gltrH
https://github.com/STRRL/cloudflare-tunnel-ingress-controller/raw/master/static/dash.strrl.cloud.png
Cloudflare Tunnel Ingress Controller simplifies exposing Kubernetes services to the internet quickly and securely using Cloudflare Tunnel
➤ https://ku.bz/pMRcpczmz
git-sync is a simple command that pulls a git repository into a local directory
It is a perfect "sidecar" container in Kubernetes - it can periodically pull files down from a repository so that an application can consume them
➜ https://ku.bz/yPFjGW_k6
Exostellar at KubeCon booth #542!
CEO Tony Shakib showcases their Kubernetes platform for managing multi-vendor GPU infrastructure with hyperscaler efficiency. Grab Lego swag & catch Zain Malik's talk
https://ku.bz/wvS-G9FRN
https://res.cloudinary.com/learnk8s/image/upload/v1761099271/linkedin-154_vfgrel.png
This week on the Learn Kubernetes Weekly:
🧩 Troubleshooting Packet Drops
⚙️ Breaking and Fixing the EKS Autoscaler
🌐 Multi-Cluster Kubernetes
🐝 kube-proxy to eBPF
🚧 API Server Log Issues
⭐️ Heroku
Read it now: https://kube.today/issues/154
This article shows how to mimic cloud load balancer behavior in bare-metal Kubernetes using Layer 2 ARP/NDP or BGP routing (via MetalLB or Cilium) to expose `LoadBalancer` services
➜ https://ku.bz/D9BWpg4Sq
Mai Nishitani, Director of Enterprise Architecture at NTT Data, demonstrates how Model Context Protocol (MCP) enables Claude to directly interact with Kubernetes clusters through natural language commands
https://ku.bz/3hWvQjXxp
🌟 Testkube
🎙 Bart