Configure Sumo Logic
Customer Sumo Logic Instance
You will need the following information to integrate Sumo Logic into Blameless:
- SUMO LOGIC URL
- SUMO LOGIC USERNAME
- SUMO LOGIC PASSWORD
The SUMO LOGIC USERNAME field takes your Sumo Logic Access ID.
The SUMO LOGIC PASSWORD field takes your Sumo Logic Access Key.
To get started, find the URL you would like to use for this integration by following the documentation: Click Here!
- Enter the corresponding API endpoint in the Blameless field SUMO LOGIC URL.
To get the Access ID and Access Key, follow this documentation: Click Here!
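Before entering the values in Blameless, it can help to see how the three fields relate to each other. Sumo Logic APIs accept the Access ID and Access Key as HTTP Basic authentication credentials. The following is a minimal sketch of that pairing; the SumoLogicSettings class, the basic_auth_header function, and the placeholder URL are hypothetical names used for illustration only and are not part of Blameless or Sumo Logic. How Blameless uses the credentials internally is not covered here.

```python
import base64
from dataclasses import dataclass


@dataclass(frozen=True)
class SumoLogicSettings:
    """Hypothetical container mirroring the three Blameless fields."""
    url: str       # SUMO LOGIC URL: the API endpoint for your deployment
    username: str  # SUMO LOGIC USERNAME: the Access ID
    password: str  # SUMO LOGIC PASSWORD: the Access Key


def basic_auth_header(settings: SumoLogicSettings) -> dict:
    """Build an HTTP Basic Authorization header from Access ID and Access Key."""
    raw = f"{settings.username}:{settings.password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


# Placeholder values; substitute your own endpoint and credentials.
settings = SumoLogicSettings(
    url="https://api.example.invalid/api",
    username="myAccessId",
    password="myAccessKey",
)
header = basic_auth_header(settings)
```

The header can be attached to a request against your API endpoint to confirm that the Access ID and Access Key are valid before saving them in Blameless.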
Setting up an SLO
To start, go to the SLO Manager, and select Error Budget Policies on the left navigation pane. Create a policy by clicking the “+ New Policy” button and filling out a name and description. At this point, we can define Notification Policies (defined below) or leave this Error Budget Policy as a placeholder and create the specific Notification Policies once the SLO is better understood.
Creating a User Journey
To create an SLO, first we need to create a User Journey. A user journey is a collection of SLOs focused on a specific user workflow in your application. Select User Journeys on the left navigation pane and click the “+ New Journey” button. Enter the name, description, and owner of this User Journey and click “Next”.
Creating a Service Level Indicator
The next step is to set up your service level indicator (SLI). An SLI is a quantitative measure of an aspect of the provided level of service. If there are existing SLIs, one of them can be selected at this step; otherwise, we can create a new SLI. To create a new SLI, click the “+ Create New SLI” button.
SLI Basic Information
Enter a name and description for the SLI that capture the essence of the measurement we are defining. Next, select the Service this SLI will be associated with. (A note on services: these can be different microservices or different domains within a monolith.) If the Service you want isn’t available, a new Service can be created by clicking the “+ Create New Service” button and entering the name, description, and any notes relevant to the new Service.
SLI Measurement Information
Select the appropriate SLI Type for the measurement we are creating and select “Sumo Logic” as the data source. Enter the appropriate information based on the type of SLI that was selected.
Availability
Availability Good Metric | metric=http_server_requests http_method=post http_status_code=2* service_name=service | count |
Availability Valid Metric | metric=http_server_requests http_method=post http_status_code=(2*, 5*) service_name=service |
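The two availability queries are typically combined as a ratio: the count of good requests divided by the count of valid requests. The sketch below illustrates that arithmetic only. The availability_percent function is a hypothetical helper, and the assumption that Blameless computes the ratio in exactly this way is not confirmed by this article.

```python
def availability_percent(good_count: float, valid_count: float) -> float:
    """Return the share of valid requests that were good, as a percentage.

    good_count:  result of the Availability Good Metric query (e.g. 2xx responses)
    valid_count: result of the Availability Valid Metric query (e.g. 2xx and 5xx)
    """
    if valid_count <= 0:
        raise ValueError("valid_count must be positive")
    return 100.0 * good_count / valid_count


# Example: 999 good requests out of 1000 valid requests.
example = availability_percent(999, 1000)
```

With these example counts, the result sits right at a 99.9% target, so a single additional failed request would push the SLI below it.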
Latency
Latency Value Metric | metric=http_server_duration http_method=post service_name=service http_status_code=(2*, 4*, 5*) | avg |
Throughput
Throughput Value Metric | cluster=*prod* container=service metric=kube_pod_container_status_restarts_total | rate | sum |
Saving the SLI
Once each of the metric fields is populated, Blameless will validate the metric that was entered. Once all of the information has been entered and validated, click the “Next” button.
The system will begin the process of backfilling the SLI data. If this fails due to transient errors, the backfill can be restarted by navigating to the SLI and clicking the refresh button in the top right corner under the tabs. The button should be labeled “Restart backfill”.
Creating a Service Level Objective
The final step is to create a Service Level Objective (SLO). An SLO is a target value or range of values from an SLI used to determine the service level. To begin, enter the SLO name. Next we will need to enter the Reliability Target information.
Reliability Target
The required Reliability Target information depends on the type of SLI that was selected in the previous step.
Availability
Percentage of Good Request Meeting Your SLI threshold | 99.9 |
99.9% of AWS API Gateway requests must be successful.
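A reliability target implies an error budget: the share of requests that may fail before the target is missed. The sketch below works through that arithmetic for the 99.9 target. The allowed_bad_requests function and the request volume are hypothetical, chosen only to illustrate the calculation. Decimal is used so that 100 - 99.9 is exact rather than subject to floating-point rounding.

```python
from decimal import Decimal


def allowed_bad_requests(target_percent: str, total_requests: int) -> int:
    """Return how many requests may fail while still meeting the target.

    target_percent is passed as a string (e.g. "99.9") so Decimal keeps it exact.
    """
    budget_fraction = (Decimal(100) - Decimal(target_percent)) / Decimal(100)
    return int(budget_fraction * total_requests)


# Example: a 99.9% target over a hypothetical 1,000,000 requests.
budget = allowed_bad_requests("99.9", 1_000_000)
```

Under these assumed numbers, the budget is 1,000 failed requests for the period.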
Latency
Percentage of Good Request Meeting Your SLI threshold | 99.5 |
Objective Value Latency | 500 |
Metric Unit | ms |
Threshold Violation Comparison Operator | > |
99.5% of AWS API Gateway requests must complete successfully within 500 ms.
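To make the relationship between the objective value, the violation operator, and the percentage target concrete, here is a minimal sketch. It assumes that a data point counts as a violation when "value OPERATOR objective" is true, which matches the latency example above (a value greater than 500 ms violates). The percent_good function and the sample values are hypothetical, and Blameless's actual evaluation logic may differ.

```python
import operator

# Maps the comparison operator symbols to Python comparison functions.
VIOLATION_OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def percent_good(values, objective, violation_op):
    """Return the percentage of values that do NOT violate the objective."""
    if not values:
        raise ValueError("values must not be empty")
    violates = VIOLATION_OPERATORS[violation_op]
    good = sum(1 for v in values if not violates(v, objective))
    return 100.0 * good / len(values)


# Hypothetical latency samples in ms, checked against a 500 ms objective.
samples = [120, 480, 500, 750]
result = percent_good(samples, 500, ">")
```

In this sample, only the 750 ms data point violates the objective, so three of the four values are good and the SLO target of 99.5 would not be met.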
Throughput
Percentage of Good Request Meeting Your SLI threshold | 99.5 |
Objective Value Throughput | 1 |
Metric Unit | rpm |
Threshold Violation Comparison Operator | >= |
The AWS API Gateway should process at least 1 request per minute.
Saturation
Percentage of Good Request Meeting Your SLI threshold | 99.999 |
Objective Value Saturation | .05 |
Metric Unit | GB |
Threshold Violation Comparison Operator | < |
99.999% of Sumo Logic logging events should be smaller than 0.05 gigabytes per event.
SLO Status
Select the appropriate SLO Status:
- Development
- Testing
- Active (the SLO will be actively monitored)
Error Budget Policy
Select the appropriate Error Budget Policy. An Error Budget Policy must be selected in order to receive notifications when service levels fall.
Saving the SLO
Click “Finish” to save the SLO.