Ricardo Castro

@mccricardo.bsky.social

Senior Principal Engineer, tech speaker & writer, @DevOpsPorto and @DevOpsDaysPT, @CDeliveryFdn Ambassador, martial arts amateur, and metal lover. Opinions are my own. mccricardo.com

55 Followers  |  48 Following  |  797 Posts  |  Joined: 26.11.2024  |  1.8512

Latest posts by mccricardo.bsky.social on Bluesky

You Should Write An Agent
They're like riding a bike: easy, and you don't get it until you try.

"You Should Write An Agent" by Thomas Ptacek

fly.io/blog/everyon...

10.11.2025 17:01 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Faster root cause for slow traces with ClickStack Event Deltas
Read how ClickStack's improved Event Deltas make it effortless to pinpoint the root causes of performance outliers in observability data - turning complex trace analysis into instant, actionable…

"Faster root cause for slow traces with ClickStack Event Deltas" by Dale McDiarmid

clickhouse.com/blog/%20fast...

10.11.2025 13:01 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Revision 149
Articles and updates:

Revision 149 is out!

@koslib.com

#devops #sre #platformengineering

embracerisk.substack.com/p/revision-149

10.11.2025 10:23 | 👍 2 | 🔁 0 | 💬 0 | 📌 0

How Databricks Implemented Intelligent Kubernetes Load Balancing
The Databricks Engineering Team needed something smarter: a Layer 7, request-level load balancer that could react dynamically to real service conditions instead of relying on connection-level routing…

"How Databricks Implemented Intelligent Kubernetes Load Balancing" by ByteByteGo

blog.bytebytego.com/p/how-databr...

09.11.2025 18:01 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Announcing Istio 1.28.0
Istio 1.28 Release Announcement.

"Announcing Istio 1.28.0"

istio.io/latest/news/...

08.11.2025 18:01 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Crossplane’s Graduation Announcement
Graduation marks Crossplane’s readiness for widespread use and its evolution from a control plane framework to groundwork for intelligent, secure, and scalable cloud operations and platform…

"Cloud Native Computing Foundation Announces Graduation of Crossplane"

www.cncf.io/announcement...

07.11.2025 17:01 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

TicketOps is perfectly fine for relatively stable stuff.

At scale, it breaks.

07.11.2025 14:51 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

SQL expressions in Grafana: Combine and manipulate data from multiple sources | Grafana Labs
SQL expressions are a versatile and powerful feature that opens up all sorts of creative possibilities by manipulating and combining data from different data sources.

"SQL expressions in Grafana: Combine and manipulate data from multiple sources" by Sam Jewell and Kyle Brandt

grafana.com/blog/2025/10...

07.11.2025 13:01 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

In the dawn of a new wave of AI, if you're still thinking about infrastructure as code and not infrastructure as software, you're living in the past.

07.11.2025 12:56 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

SRE is much more than just incident response.

I thought this needed to be highlighted since many are talking about "AI SRE", which mostly focuses on incident response.

06.11.2025 18:03 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

OTel Updates: Consistent Probability Sampling Fixes Fragmented Traces | Last9
One sampling decision, propagated everywhere. OpenTelemetry's Consistent Probability Sampling fixes fragmented traces across services.

"OTel Updates: Consistent Probability Sampling Fixes Fragmented Traces" by Anjali Udasi

last9.io/blog/consist...

06.11.2025 13:01 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

Consistency is underrated.

Many people believe in a "big bang" event that propels their career. And while there are certain cases where that's true, consistency is usually a better investment of your time.

Invest in being consistent and you'll reap rewards.

05.11.2025 18:02 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Introducing Agent HQ: Any agent, any way you work
At Universe 2025, GitHub's next evolution introduces a single, unified workflow for developers to be able to orchestrate any agent, any time, anywhere.

"Introducing Agent HQ: Any agent, any way you work" by Kyle Daigle

github.blog/news-insight...

05.11.2025 17:01 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

How to Use AWS CloudWatch Application Signals with OpenTelemetry on ECS Fargate and Lambda
This guide shows how to connect CloudWatch Application Signals with OpenTelemetry. See simple steps for ECS Fargate and Lambda. Example code included. Get clear metrics and traces fast.

"Effortless Observability - Integrating CloudWatch Application Signals with OpenTelemetry" by Tobias Schmidt

awsfundamentals.com/blog/cloudwa...

05.11.2025 13:01 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Go and enhance your calm: demolishing an HTTP/2 interop problem
HTTP/2 implementations often respond to suspected attacks by closing the connection with an ENHANCE_YOUR_CALM error code. Learn how a common pattern of using Go's HTTP/2 client can lead to unintended…

"Go and enhance your calm: demolishing an HTTP/2 interop problem" by Lucas Pardue and Zak Cutner

blog.cloudflare.com/go-and-enhan...

04.11.2025 17:04 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

From Signals to Reliability: SLOs, Runbooks and Post-Mortems
Build reliability with SLOs, runbooks and post-mortems. Turn observability into systematic incident response and learning. Practical examples for Kubernetes environments.

"From Signals to Reliability: SLOs, Runbooks and Post-Mortems" by Fatih Koç

fatihkoc.net/posts/sre-ob...

04.11.2025 13:02 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Reliability, like any other feature, needs to be prioritised accordingly.

There will be times when reliability work will be the priority. Other times, product features will be the priority.
And so on.

If one topic massively overshadows all the others, problems will arise.

03.11.2025 18:03 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Quick thoughts on the recent AWS outage
AWS recently posted a public write-up of the us-east-1 incident that hit them this past Monday. Here are a couple of quick thoughts on it. Reliability → Automation → Complexity → New failure modes …

"Quick thoughts on the recent AWS outage" by Lorin Hochstein

surfingcomplexity.blog/2025/10/25/q...

03.11.2025 17:02 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

I've seen a few of those and I've built a few as well 😜

Once, my Tech Lead at the time architected an 8-microservice system for something not complex that our company wasn't even sure we were going to pursue and that, at most, would have a couple of hundred users.

03.11.2025 14:52 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

For platforms to be valuable they need to be force multipliers.

That means being more than the sum of their parts.

03.11.2025 13:02 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

You always need to take roles and titles with a grain of salt.

I often meet DevOps/SREs/PlatEng all doing very similar jobs.

I also often meet groups of DevOps doing quite different jobs. The same applies to SREs and PlatEngs.

Context is crucial.

31.10.2025 18:03 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Some people look down on quality assurance and security, or think of them as annoyances.

In the age of AI, if they continue to have that perspective, they'll have a rude awakening.

31.10.2025 13:04 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Important: hire adults.

Also important: treat them like adults.

31.10.2025 12:50 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Strive for civil discourse on your teams.

Some of the most creative solutions I've seen were born from discussions between people with completely different views on how to approach a problem.

Promoting diversity lays a good foundation for this to happen organically.

31.10.2025 09:46 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

People who say "that's a DevOps team problem" have absolutely no clue what DevOps is about.

30.10.2025 18:02 | 👍 1 | 🔁 0 | 💬 0 | 📌 0

For complex issues, I like runbooks because they allow me to really understand the problem before trying to automate it.

In the long run, for most issues, I strive for automation. But starting with runbooks allows me to understand the quirks before automating.

30.10.2025 13:05 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

More often than not, when people reach out to me at events to ask "should I use Kubernetes", the answer is "no".

That's because people usually approach it from the tech side, not from a problem they need to fix.

Focus on the problem and only apply tech that helps you address it.

29.10.2025 18:06 | 👍 0 | 🔁 0 | 💬 0 | 📌 0
Video thumbnail

My face when I hear people say Platform Engineering replaces DevOps.

Let's be clear: Platform Engineering *enables* DevOps.

If it doesn't, something's wrong.

29.10.2025 14:25 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

Whether you like it or not, reliability and security aren't non-functional requirements.

They're features!

Imagine storing your money in a non-secure bank.

And, as features, they need to be prioritized accordingly.

28.10.2025 18:05 | 👍 0 | 🔁 0 | 💬 0 | 📌 0

The best on-call is when you don't get called.

For that to happen, you need to put some serious effort into it.

28.10.2025 16:53 | 👍 0 | 🔁 0 | 💬 0 | 📌 0
