The Art of Site Reliability Engineering - Part 1

Gaurav

What is Site Reliability Engineering?

Site Reliability Engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems.

SRE focuses on improving software system reliability across key categories including availability, performance, latency, efficiency, capacity, and incident response.

Vocabulary for SRE

A few keywords form the basis of SRE principles. The list below is not exhaustive:

  • Service Level Indicator (SLI)

  • Service Level Objective (SLO)

  • Service Level Agreement (SLA)

  • Error Budget

  • Alerts
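
To make these terms a little more concrete before the detailed posts, here is a minimal Python sketch of how an SLI, an SLO, an error budget, and an alert can relate to each other. It is only an illustration under stated assumptions: the `Window` class, the function names, the request counts, the 99.9% target, and the 25% alert threshold are all hypothetical and not taken from any particular tool. An SLA is not modeled here, since it is a contractual commitment to customers rather than something you compute.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Request counts for one measurement window (hypothetical numbers)."""
    total_requests: int
    good_requests: int


def availability_sli(w: Window) -> float:
    """SLI: the fraction of requests that were served successfully."""
    if w.total_requests == 0:
        return 1.0  # assumption: a window with no traffic counts as fully available
    return w.good_requests / w.total_requests


def error_budget_remaining(w: Window, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent).

    The error budget is the share of requests the SLO allows to fail,
    i.e. (1 - slo_target) of all requests in the window.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if w.total_requests == 0:
        return 1.0  # nothing was served, so nothing was spent
    allowed_failures = (1.0 - slo_target) * w.total_requests
    actual_failures = w.total_requests - w.good_requests
    return 1.0 - actual_failures / allowed_failures


def should_alert(w: Window, slo_target: float, threshold: float = 0.25) -> bool:
    """Alert when less than `threshold` of the error budget is left.

    The 25% default is an arbitrary value chosen for this sketch.
    """
    return error_budget_remaining(w, slo_target) < threshold


if __name__ == "__main__":
    window = Window(total_requests=1_000_000, good_requests=999_500)
    slo = 0.999  # SLO: 99.9% of requests should succeed
    print(f"SLI: {availability_sli(window):.4%}")
    print(f"Error budget remaining: {error_budget_remaining(window, slo):.1%}")
    print(f"Alert: {should_alert(window, slo)}")
```

In this sketch the SLI is what you measure, the SLO is the target you compare it against, the error budget is the amount of failure the SLO still permits, and the alert fires when that budget is close to running out.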

NOTE: Upcoming posts will go through this vocabulary and some open-source observability tools that help SREs do their work.
