How to Manage Blameless SLIs

Getting Started

Once you have assembled your SLO components, the next step is managing your SLO environment according to the best practices you have identified. To support that effort, this guide describes the following actions:

  • Creating
  • Adding
  • Editing
  • Deleting
note

Check the associations among your SLO components before you edit or delete them. You will not be able to edit or delete some components while they are associated with others.

As a New User of SLOs

Blameless provides the SLO Wizard to guide new users through the process. You start with the User Journey.

Start by launching the SLO Manager. Blameless opens to the User Journey landing page. Next, click on “+New Journey”. The SLO Wizard walks you through the process, and you can follow that process via the guide icon at the top of the page or by clicking the “Next” button.

[Image: SLO Feature Nav Bar]
note

You can create a User Journey and leave it blank as a placeholder for future population.

You can continue to the section “Working via the SLO Wizard” for a high-level description of the feature.

For detailed instructions regarding the New User Journey and the SLO Wizard, refer to Building a New SLO.

As an Experienced User of SLOs

As an experienced user, you are probably familiar enough with the process that you do not need the SLO Wizard to create more SLIs, but it remains available for creating new user journeys and adding new SLOs to user journeys. You can continue to the section “Launching the SLO Manager”.

Working via the SLO Wizard

Building an SLO requires the following steps:

  • Create the User Journey
  • Create the SLI
  • Create the Error Budget Policy
  • Create the SLO
  • Set the Thresholds
note

The best practice for User Journey analysis is collaboration across teams and groups to collect the journey information.

Managing SLIs

Once you have SLIs set up, you connect them to your SLOs, which are targets set against your SLIs. These indicators are points on a digital user journey that contribute to customer experience and satisfaction.
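To make the relationship between an SLI and its SLO target concrete, here is a minimal sketch in Python. It is not part of Blameless; the function names are hypothetical, and it assumes the SLI and the target are both expressed as fractions between 0 and 1, with the error budget taken as the usual `1 - target`.

```python
def slo_is_met(sli_value: float, slo_target: float) -> bool:
    """Return True when the measured SLI meets or exceeds the SLO target."""
    return sli_value >= slo_target


def error_budget_remaining(sli_value: float, slo_target: float) -> float:
    """Return the unspent fraction of the error budget (negative if overspent).

    Hypothetical helper: assumes both arguments are fractions, e.g. 0.999.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if not 0.0 <= sli_value <= 1.0:
        raise ValueError("sli_value must be between 0 and 1")
    budget = 1.0 - slo_target  # allowed fraction of bad events
    spent = 1.0 - sli_value    # observed fraction of bad events
    return 1.0 - spent / budget
```

For example, an SLI of 0.9995 measured against a target of 0.999 meets the SLO and has used roughly half of its error budget.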

Services

The Services list shows the SLIs associated with SLOs.

Adding Services

  1. Select the Service Level Indicators option. The Services window opens and lists any services that have been created. If no Services exist, the SLO Manager will say so.
note

If there is no SLI associated with a service under the Services Title, the SLI title in the field will be blank.

  2. Click on “+ New Service” to create a new Service.
note

You must create at least one service with at least one SLI prior to adding an SLO to a User Journey.

  3. If there is no pre-existing Service, Blameless reports that none exist.
[Image: New Services window]
[Image: New SLI window]

Otherwise, a new modal opens containing the following required (*) fields:

  • Service Name
  • Description
  4. Enter the name for the new service and a description.
  5. Click the “Save” button. The new service will appear on the “Services” landing screen the next time you open it. The resulting SLI is empty until you define it.
note

The SLI list will remain blank until you create an SLI and save it.

Services Details

When you open the desired Services window, you will find the following elements:

  • SLI: A list of SLIs (if any exist) under an SLI tab.

  • Notes: A Notes tab containing a text field where you can add information regarding the service.

  • Service Summary: A summary of the following information regarding the service.

    • Service Description

    • Creation date

    • Last updated

    • Team

note

Both the Description and the Team (members) sections have a pencil icon, signifying these fields can be edited.

[Image: New Services window]

Updating Services

  1. Select an existing Service from the Services List.
  2. Click on the ellipsis (three dots) and select the “Edit” option. The Edit Service modal opens.
  3. Adjust the information as desired in either (or both) required (*) fields.
  4. Click the “Save” button to update the Service.

Deleting Services

  1. Return to the Services window.
  2. Click on the ellipsis (three dots) at the end of the row for the desired Service.
  3. Select the “Delete” option. Blameless will ask for confirmation to delete.
  4. Click on the “Delete” button.

SLIs

SLIs are quantitative measures, typically provided through your APM platform. Traditionally, these refer to either latency or availability. Latency is defined as response time, including queue/wait time, in milliseconds; availability compares good metrics against valid metrics. A composite SLI is a group of SLIs attributed to a larger SLO.
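As an illustration of these two traditional indicators, the sketch below computes an availability ratio and a latency percentile from raw measurements. It is an assumption-laden example rather than how Blameless or any APM platform computes them: the function names are hypothetical, and the percentile uses the simple nearest-rank method.

```python
import math
from typing import Sequence


def availability_sli(good_events: int, valid_events: int) -> float:
    """Fraction of valid events that were good, from 0.0 to 1.0."""
    if valid_events <= 0:
        raise ValueError("valid_events must be positive")
    return good_events / valid_events


def latency_percentile(latencies_ms: Sequence[float], percentile: float) -> float:
    """Nearest-rank percentile of response times given in milliseconds."""
    if not latencies_ms:
        raise ValueError("latencies_ms must not be empty")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in the range (0, 100]")
    ordered = sorted(latencies_ms)
    # Nearest-rank: the smallest value with at least `percentile` percent
    # of the samples at or below it.
    rank = math.ceil(percentile / 100 * len(ordered))
    return ordered[rank - 1]
```

A latency SLO would then compare the chosen percentile against a threshold, for example requiring the 95th percentile of login requests to stay under a set number of milliseconds.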

Adding SLIs to a Service

Adding an SLI to a Service is done via the Service Level Indicators landing page.

You can also create an SLI through the SLO Wizard, because you must select an SLI when you create and add an SLO to a User Journey.

[Image: SLI Latency Configuration window]
  1. Click on the “Define SLI” button. This opens a new SLI Details window.

  2. Assign an SLI Name (*=required) and enter a description (optional).

    For example: "This SLI measures the latency of the login request for the 95th percentile of login requests hitting the API and Login service".

  3. Select the SLI Type (*=required field). Currently supported options are:
  • Availability measures the ratio of good metrics to valid metrics.
  • Latency measures how long it takes to complete the task.
  • Throughput measures the proportion of the time the data processing rate is faster than a threshold.
  • Saturation measures the proportion of the time your system load is less than a threshold.
  4. Select the Data source (*=required field), based on the integration(s) you activated.
  5. Copy and paste the metric shown in the example field, based on the Data source selected.
  6. Click the “Save” button. Return to the User Journey level. You can now start to set up SLOs with the Error Budget Policies.
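The Throughput and Saturation types in the list above are both "proportion of the time" indicators. The following sketch shows one way to read those definitions. It assumes the samples were collected at equal intervals, so the share of samples stands in for the share of time; the function names and thresholds are hypothetical and are not Blameless settings.

```python
from typing import Sequence


def throughput_sli(rates: Sequence[float], min_rate: float) -> float:
    """Share of samples whose processing rate is faster than min_rate."""
    if not rates:
        raise ValueError("rates must not be empty")
    return sum(1 for rate in rates if rate > min_rate) / len(rates)


def saturation_sli(loads: Sequence[float], max_load: float) -> float:
    """Share of samples whose system load is below max_load."""
    if not loads:
        raise ValueError("loads must not be empty")
    return sum(1 for load in loads if load < max_load) / len(loads)
```

With four equally spaced rate samples of 120, 80, 150, and 200 and a threshold of 100, three of the four samples are faster than the threshold, giving a throughput SLI of 0.75.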
note

Pingdom is a special integration. With Pingdom, we cannot measure the internal metrics of services (which we can measure with Prometheus, Datadog, and similar integrations). We can work only with high-level entities: page load time (SLI type "Latency") and the status code of the page (SLI type "Availability").

note

Prometheus currently tracks a block of data based on the oldest available data received.

note

The SLI status is currently reported in two different areas:

  • SLI cards under Service Level Indicators > Service
  • In the Detailed view of each SLI.

Editing SLIs

note

You CANNOT edit an SLI if it is associated with an SLO. Blameless will display an error message.

[Image: SLI Yes Deletion warning]
  1. If there is no association, select the SLO to update.
  2. Select an existing SLI from a list of Services.
  3. Click the “Next” button.
  4. Complete your edits.
  5. Click the “Save” button.

Deleting SLIs

note

You cannot delete an SLI if it is associated with one or more SLOs.

  1. Open the SLI in question within the Services List.
  2. Locate the Trash can icon within the SLI Title header.
  3. Click on the Trash can. Blameless will confirm your choice with either a success warning or a decline warning window depending on the association status of the SLI.
  4. Select your desired action.
[Image: SLI No Deletion warning]
[Image: SLI Yes Deletion warning]
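The association rule that governs both editing and deleting SLIs can be pictured with a small model. This is a hypothetical sketch, not the Blameless API or data model: the `Slo` class and both functions are invented for illustration, and SLIs are identified by name only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slo:
    """Hypothetical SLO record that points at the SLI it targets."""
    name: str
    sli_name: str


def slos_using(sli_name: str, slos: List[Slo]) -> List[str]:
    """Names of the SLOs that are associated with the given SLI."""
    return [slo.name for slo in slos if slo.sli_name == sli_name]


def can_modify_sli(sli_name: str, slos: List[Slo]) -> bool:
    """An SLI can be edited or deleted only when no SLO is associated with it."""
    return not slos_using(sli_name, slos)
```

Listing the associated SLOs first, as `slos_using` does, shows which associations have to be removed before the edit or deletion can go ahead.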

For More Information

For instructions regarding the creation, configuration, and use of User Journeys, Error Budgets, SLOs, and SLIs, refer to the following SLO references:

Blameless SLO Definitions

An Introductory Guide to Blameless SLOs

Getting started with Blameless SLOs

Building a New SLO

Creating Error Budget Policies

Managing your SLOs and detailed instructions:

Understanding your SLOs

Refer to the Google SRE Handbook for more information regarding Site Reliability Engineering.