Issue 140

September 2, 2021 — Service Level Metrics

When running Kubernetes clusters, understanding the health of the services running on them is job number one. Thanks to Google and its SRE handbook, we have a pretty good idea of how to do this. So, without further ado, let’s jump into some ways to measure health (that is, SLOs and SLIs).

SRE fundamentals 2021: SLIs vs SLAs vs SLOs

We will start with the experts on all things service level and get good definitions of these metrics, covering both the nuances and the big differences between them. It can be a bit murky trying to understand and communicate those differences, so this is a great place to refer back to. 📈

Using observability tools to set SLOs for Kubernetes Applications

This practical guide dives into a few options and mental frameworks for thinking about your SLOs. It also gives a pretty good overview of Prometheus, Grafana, and even Jaeger (tracing), and how to use them for your service-level metrics. 📘

SRE Practices for Kubernetes Platforms

This quick article gives you a hit list of the metrics you should (or could) monitor as SLIs for your platform. It really helped me wrap my head around where to start when planning out my SLIs and how to think about them. 🧠

Setting SLOs: a step-by-step guide

We head back to the experts for this in-depth playbook for how to set up your own SLOs. This one gets pretty deep pretty fast and gives you a great way to think about setting your service level metrics and how to measure them.
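To make the vocabulary a bit more concrete, here is a minimal Python sketch of the arithmetic behind an availability SLO. The request counts, the 99.9% target, and the function names are all made up for illustration; they are not taken from the linked guide.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: fraction of requests that were served successfully."""
    if total_requests == 0:
        # Assumption for this sketch: no traffic counts as fully available.
        return 1.0
    return good_requests / total_requests


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Fraction of the error budget left: 1.0 is untouched, 0 or less is spent."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo_target  # the error budget implied by the SLO
    actual_failure = 1.0 - sli          # what the service actually burned
    return 1.0 - actual_failure / allowed_failure


if __name__ == "__main__":
    # Hypothetical numbers: 1,000,000 requests, 500 of them failed.
    sli = availability_sli(good_requests=999_500, total_requests=1_000_000)
    remaining = error_budget_remaining(sli, slo_target=0.999)
    print(f"SLI: {sli:.4%}, error budget remaining: {remaining:.1%}")
```

With these example numbers the service has burned roughly half of its error budget. In a real setup the counts would come from your monitoring system rather than hard-coded values.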

A guide to setting up Kubernetes Service Level Objectives (SLOs) with Prometheus and Linkerd

This is a great writeup on implementing your SLOs and setting up your Prometheus dashboards. It comes from the folks over at Buoyant, so the pitch for a service mesh is not surprising, but they still make a good case for using one when setting up your service level metrics.

Implementing SLI/SLO based Continuous Delivery Quality Gates using Prometheus

Knowing about problems before they make it to production is a path to happiness for SREs. In this writeup they explain how to use Keptn along with Prometheus to shift things left using Quality Gates. 🚪
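The idea behind a quality gate can be sketched in a few lines of Python: compare measured SLIs against objectives and block the release if any of them fail. This is only an illustration of the concept, not Keptn's API or configuration format. The `Objective` class, the `evaluate_gate` function, and the metric names are all hypothetical, and in practice the measured values would come from Prometheus queries.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Objective:
    name: str         # hypothetical SLI name, e.g. "error_rate"
    max_value: float  # the measured SLI must be at or below this to pass


def evaluate_gate(
    measured: dict[str, float], objectives: list[Objective]
) -> tuple[bool, list[str]]:
    """Return (passed, reasons); the gate passes only if every objective is met."""
    failures = []
    for obj in objectives:
        value = measured.get(obj.name)
        if value is None:
            # A missing measurement is treated as a failure in this sketch.
            failures.append(f"{obj.name}: no measurement")
        elif value > obj.max_value:
            failures.append(f"{obj.name}: {value} > {obj.max_value}")
    return (not failures, failures)


if __name__ == "__main__":
    # Made-up objectives and measurements for a candidate build.
    objectives = [
        Objective("error_rate", 0.01),
        Objective("p95_latency_seconds", 0.5),
    ]
    measured = {"error_rate": 0.004, "p95_latency_seconds": 0.72}
    passed, reasons = evaluate_gate(measured, objectives)
    print("promote" if passed else f"block: {reasons}")
```

Here the latency objective is missed, so the build would be held back before it reaches production.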

Tweet of the week

If you’re considering registering for the Contributor Summit at KubeCon, virtually or in person, register now so it doesn’t get cancelled!
