From NOC to Platform Engineering: My Journey Through DevOps and SRE


G'day! I still remember being a bleary-eyed support engineer at 3 AM, frantically scrolling through logs trying to understand why our production services were down. Since then, I've witnessed the evolution of infrastructure management through various roles in DevOps, Site Reliability Engineering (SRE), and now Platform Engineering.
My journey mirrors the industry's evolution, and I'd like to share how these disciplines differ yet complement each other.
The NOC Days: Reactive Firefighting
In the beginning, there was the Network Operations Centre (NOC). Our team was the first line of defence against outages. We monitored dashboards, responded to alerts, and engaged in what I now recognise as purely reactive work.
The cycle was predictable: something breaks, we fix it, rinse and repeat. Knowledge was siloed, with developers tossing code "over the wall" to operations. I spent more time on incidents than improvements, and our tooling was manual and fragmented.
The DevOps Transformation: Breaking Down Walls
When our CTO announced we were "adopting DevOps," I was sceptical. Wasn't this just a buzzword? But as I moved into this new role, I discovered DevOps was fundamentally about cultural change.
The key differences became apparent:
Culture over tools: DevOps emphasised collaboration between development and operations teams that had traditionally been separated.
Automation mindset: We began automating everything from infrastructure provisioning to testing and deployment.
Shared responsibility: Developers started caring about operability, while ops folks like me learned development practices.
We implemented CI/CD pipelines, embraced infrastructure as code with Terraform, and developers started carrying pagers. The walls were coming down, but new challenges emerged.
The SRE Chapter: Engineering for Reliability
Our DevOps transformation improved the culture, but reliability issues persisted. Enter Site Reliability Engineering, Google's approach to service management.
As I transitioned to an SRE role, I discovered its unique characteristics:
Engineering focus: Unlike DevOps' broader cultural approach, SRE applied software engineering principles specifically to operations problems.
Error budgets: We quantified reliability targets and used error budgets to balance feature development with stability.
Toil reduction: We measured toil (manual, repetitive work) and set explicit goals to reduce it through automation.
Blameless postmortems: We developed a rigorous approach to learning from failures without assigning blame.
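To make the error budget idea concrete, here's a minimal Python sketch. It assumes a request-based SLO measured over a single window; the ErrorBudget class, its field names, and the numbers are hypothetical and chosen purely for illustration, not taken from any real tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Error budget for a request-based SLO over one measurement window."""

    slo_target: float     # e.g. 0.999 means 99.9% of requests should succeed
    total_requests: int   # requests served in the window
    failed_requests: int  # requests that violated the SLO in the window

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining(self) -> float:
        # A negative value means the budget is overspent.
        return self.allowed_failures - self.failed_requests

    @property
    def exhausted(self) -> bool:
        return self.remaining <= 0


# Illustrative numbers only: a 99.9% target over one million requests.
budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
print(round(budget.allowed_failures), round(budget.remaining), budget.exhausted)
# prints: 1000 600 False
```

A common policy is to keep shipping features while budget remains and to shift effort towards stability work once it is exhausted, though the exact rule is something each team has to agree on.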
SRE gave us concrete practices and metrics that complemented our DevOps culture. We developed service level objectives (SLOs) and implemented sophisticated on-call rotations. Reliability improved, but teams still struggled with productivity.
Platform Engineering: Enabling Developer Self-Service
Moving into Platform Engineering has been a revelation. While DevOps brought culture change and SRE brought reliability focus, Platform Engineering is about creating internal developer platforms that enable self-service.
The distinctive elements of Platform Engineering include:
Developer experience: We obsess over making our platform intuitive and delightful for developers to use.
Golden paths: We create opinionated, well-supported workflows that make the right way the easy way.
Self-service capabilities: Engineers can provision infrastructure, deploy applications, and access resources without filing tickets.
Internal product management: We treat our platform as a product with users (developers) and stakeholders.
Now I spend my days writing Terraform modules and Pulumi components that abstract away cloud complexity. We've created a platform that lets product teams deploy to GCP and Azure without being cloud experts.
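As a rough illustration of what a golden path can look like, here's a small Python sketch. It is not Terraform or Pulumi code, and every name in it (ServiceRequest, render_deployment, the size presets, the default values) is a hypothetical stand-in: the point is only that a product team supplies a few fields and the platform fills in the opinionated rest.

```python
from dataclasses import dataclass

# Hypothetical platform defaults; real values would be set by the platform team.
ALLOWED_CLOUDS = {"gcp", "azure"}
SIZES = {
    "small": {"cpu": "250m", "memory": "256Mi"},
    "medium": {"cpu": "500m", "memory": "512Mi"},
}


@dataclass(frozen=True)
class ServiceRequest:
    """What a product team fills in: deliberately small."""

    name: str
    team: str
    cloud: str = "gcp"
    size: str = "small"


def render_deployment(req: ServiceRequest) -> dict:
    """Expand a short request into a full spec using platform defaults."""
    # Reject anything off the supported path early, with a clear message.
    if req.cloud not in ALLOWED_CLOUDS:
        raise ValueError(f"unsupported cloud: {req.cloud!r}")
    if req.size not in SIZES:
        raise ValueError(f"unknown size: {req.size!r}")
    return {
        "name": req.name,
        "cloud": req.cloud,
        "resources": dict(SIZES[req.size]),
        "replicas": 2,  # assumed default for basic redundancy
        "labels": {"team": req.team, "managed-by": "platform"},
        "monitoring": {"enabled": True},
    }


spec = render_deployment(ServiceRequest(name="checkout", team="payments"))
print(spec["cloud"], spec["replicas"], spec["resources"]["cpu"])
# prints: gcp 2 250m
```

The validation step is what makes the right way the easy way: a request that strays from the supported options fails immediately instead of producing a half-working deployment.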
How They Fit Together
These approaches aren't mutually exclusive—they're complementary:
DevOps provides the cultural foundation and breaks down silos.
SRE brings engineering rigour and reliability focus.
Platform Engineering creates the tools and abstractions that make DevOps and SRE principles accessible to all engineers.
In my current role, I apply DevOps cultural principles and SRE reliability techniques to build platforms that democratise infrastructure capabilities.
Key Takeaways
Here's what I've learned about these disciplines:
DevOps is primarily cultural. It's about breaking down silos between development and operations, fostering shared responsibility, and creating a culture of continuous improvement.
SRE is reliability-focused engineering. It applies software engineering principles to operations problems with concrete practices like error budgets, SLOs, and toil reduction.
Platform Engineering is about developer empowerment. It creates internal developer platforms that abstract away complexity and enable self-service through golden paths.
All three complement each other. The most effective organisations adopt aspects of each approach: DevOps culture, SRE practices, and platform engineering tools.
The end goal is the same: faster, more reliable software delivery that provides value to customers.
As I look to the future, I'm excited to see how these disciplines continue to evolve. One thing's certain—I don't miss those 3 AM firefighting sessions in the NOC!
What's your experience with these approaches? I'd love to hear your thoughts in the comments below.
Written by Joby
