This guide helps you get up and running quickly with Blameless. It assumes that all integrations were already set up during the onboarding process. If you have any additional questions, please contact your account executive or email us at firstname.lastname@example.org.
How to Start an Incident
There are three ways to start a Blameless Incident:
- Through the Blameless web UI
- Through Slack, using the Blameless bot’s slash commands
- Via webhooks from external sources (to request webhooks, contact your customer success manager or email us at email@example.com)
With the Blameless bot installed, users can start an incident through Slack by running Blameless slash commands from the designated channel or from any channel the bot has been invited to.
When the user runs /blameless start incident, a modal appears asking the user to select an incident type and severity level and to add a brief description of the incident. Any customizations for the selected incident type and severity are populated in the incident creation modal.
If you’re ever unsure which slash commands are available, type /blameless and press ENTER to see the commands you can run.
Users can also create a Blameless incident through the Blameless web UI. The required fields are the same as those required when creating an incident through the Blameless Slack bot.
Once an incident has been created, a common best practice is to start assembling an incident response team. Blameless provides roles that can be assigned to users in your Slack channel; for example, you can assign someone to be the Incident Commander. Roles are assigned with a Blameless bot slash command.
Running /blameless assign opens a modal asking you to select an incident role and the user to assign to it. Any customizations made to incident roles in Settings are reflected in the modal.
Highlighting Slack Messages
A key feature of Blameless is highlighting important Slack messages so that they are automatically added to the event timeline.
There are two ways for users to highlight important messages. The first way is by adding one of the following emoji reactions to the message you’d like to capture:
The second way to highlight in Slack is by clicking the More Actions option for the message you’d like to capture. From there, you’ll see the Capture Highlight feature provided by the Blameless bot.
Capturing highlights through Slack adds these important messages and images to your event timeline, making it easier for new participants in an incident to get caught up.
Actions such as assigning roles and updating the incident’s status are automatically captured in the event timeline. Learn more about the event timeline here.
Changing an Incident’s Status
You can move a Blameless incident through its phases by updating its status, either in the Blameless web UI or through the Blameless Slack bot. Each time the status changes, a new set of tasks is presented for the phase the incident has moved into.
Changing the Incident Type During an Incident
If you change incident type during an incident (or merge incidents together), the following will occur:
- You will see a warning window asking you to confirm the change.
- The timeline is updated to indicate when the type transition occurred.
- The incident remains in the same state it was in when the type was changed.
- All tasks completed up to that point remain complete, and the new tasks are then presented.
- The Timeline is preserved.
- The Postmortem and tags are overwritten.
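The rules above can be sketched as a small in-memory model. This is only an illustration of what is preserved and what is replaced when the type changes; the Incident class, its field names, and change_incident_type are hypothetical and are not part of Blameless. In particular, modeling "overwritten" as a reset to empty values is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Incident:
    """Hypothetical stand-in for an incident; not a Blameless data structure."""
    incident_type: str
    status: str
    timeline: List[str] = field(default_factory=list)
    completed_tasks: List[str] = field(default_factory=list)
    open_tasks: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    postmortem: Dict[str, str] = field(default_factory=dict)


def change_incident_type(incident: Incident, new_type: str, new_tasks: List[str]) -> Incident:
    """Apply the documented effects of a type change to the hypothetical model."""
    old_type = incident.incident_type
    incident.incident_type = new_type
    # The timeline is preserved and gains an entry marking the transition.
    incident.timeline.append(f"Incident type changed from {old_type} to {new_type}")
    # Status and completed tasks are left untouched; the new type's tasks are presented.
    incident.open_tasks = list(new_tasks)
    # Assumption: "overwritten" is modeled here as a reset to empty values.
    incident.tags = []
    incident.postmortem = {}
    return incident


incident = Incident(
    incident_type="Default",
    status="identified",
    timeline=["Incident created"],
    completed_tasks=["Page the on-call engineer"],
    tags=["database"],
    postmortem={"summary": "Draft"},
)
change_incident_type(incident, "Security", ["Notify the security team"])
print(incident.status, incident.timeline, incident.open_tasks)
```

The status and completed tasks carry over unchanged, while the tags and postmortem content do not, so it may be worth saving anything you need from them before changing the type.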
Running /blameless status opens a modal asking the user to select a status for the incident. Users can also run /blameless set status to <investigating|identified|monitoring|resolved> to update the status directly.
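As a minimal sketch of the direct form of the command, the helper below builds the command text and rejects any status other than the four listed above. The function name build_status_command is hypothetical; the helper only produces the text you would type in Slack and does not call any Blameless or Slack API.

```python
# The four status values accepted by the direct command, as listed in this guide.
VALID_STATUSES = ("investigating", "identified", "monitoring", "resolved")


def build_status_command(status: str) -> str:
    """Return the slash command text for a status, or raise on an unknown status."""
    normalized = status.strip().lower()
    if normalized not in VALID_STATUSES:
        raise ValueError(
            f"unknown status {status!r}; expected one of {', '.join(VALID_STATUSES)}"
        )
    return f"/blameless set status to {normalized}"


print(build_status_command("Monitoring"))
```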
You can also update the status in the Blameless web UI by clicking the status dropdown menu in the upper-left corner of the incident’s page.
It is important to keep participants and stakeholders up to date with the progress of the incident by maintaining its status.
Assigning Postmortem Roles
Postmortems are a great opportunity to learn from an incident, and Blameless is designed to make completing them quick and easy. The first step in completing a postmortem is assigning its roles. Blameless allows you to assign three roles: Owner, Author, and Reviewer.
Running /blameless assign postmortem opens a modal asking the user to select a postmortem role and the user to assign to it. Users can also run /blameless assign postmortem <user> <owner|authors|reviewers> to assign a user to a role directly.
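Because the direct form takes one user and one role per command, staffing a whole postmortem takes several commands. The sketch below expands a role-to-users plan into the individual command lines. The function name, the plan structure, and the @name form used for <user> are assumptions made for illustration; check how your workspace expects users to be referenced.

```python
from typing import Dict, List

# Role keywords exactly as they appear in the direct command syntax.
POSTMORTEM_ROLES = ("owner", "authors", "reviewers")


def postmortem_assign_commands(plan: Dict[str, List[str]]) -> List[str]:
    """Expand a {role: [users]} plan into one slash command per user."""
    commands = []
    for role, users in plan.items():
        if role not in POSTMORTEM_ROLES:
            raise ValueError(f"unknown postmortem role: {role!r}")
        for user in users:
            commands.append(f"/blameless assign postmortem {user} {role}")
    return commands


# Hypothetical users; the @name form is an assumption.
plan = {"owner": ["@alice"], "authors": ["@bob", "@carol"], "reviewers": ["@dave"]}
for command in postmortem_assign_commands(plan):
    print(command)
```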
Currently, the web UI only allows users to assign a postmortem owner; future releases will enable users to assign all roles via the Blameless web UI.
Marking a Postmortem as Published
Once a postmortem has been reviewed and approved by the appropriate parties, you can update the status of the postmortem to Published.
To publish a postmortem, navigate to the postmortem for the incident. Once you’re in the appropriate postmortem, click on the postmortem state dropdown and select “Published”. Now that you’ve published the postmortem, the incident Slack channel will be archived and the data from the postmortem can be used in our Reliability Insights module.
Blameless lets customers configure a range of settings so that organizations can tailor the platform to meet their needs. To start, you’ll need to navigate to the Settings page on the side menu bar.
Incident Workflow Configuration
Users can define different workflows per incident type and severity. This allows you to configure settings related to incidents, postmortems, and integrations for different use cases.
By default, all incident types in Blameless come with four severity labels: SEV0, SEV1, SEV2, and SEV3. The labels can be renamed in Organization settings to match your internal naming scheme.
Whenever an incident type is created, additional customization options are enabled. You can customize the incident type by selecting it from the Incident Workflows list of types. Some of the additional configurations for incident types include:
- Incident Naming Scheme
- Slack groups and users invited to incidents of this type, for each severity
- Slack Channels for Incident Announcements
- Tasks for each status
- Incident tag categories
- Analysis template and custom postmortem questions (custom postmortem questions are defined using JSON schema)
- Postmortem completion SLO in days
- Requiring postmortems
- PagerDuty integration settings
- Jira integration settings
Once you have adjusted your settings, make sure to save them; otherwise, your modifications will be lost as soon as you leave this section.
Users can customize postmortems to adapt the questions to their process and derive better insights and more meaningful data.
When configuring a postmortem, you can customize two areas: the Analysis Template and the Custom Postmortem Questions.
The Analysis Template is often used to record the qualitative insights that the incident team discusses. Questions commonly added to the template include: What went well? What didn’t go well? What can we improve?
The Custom Questions setting lets you ask more quantitative questions that are normally answered as part of the postmortem review process. Answers to the questions in this section are captured as data that can later be queried in the Reliability Insights module.
Custom questions are configured using JSON schema, which makes them highly flexible, and they can be configured per incident type and severity in Settings.
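To give a sense of what a JSON schema for quantitative questions can look like, here is a minimal sketch that builds one in Python and prints it as JSON. The question names (customer_impact, minutes_to_detect, caused_by_deploy) are made up for illustration. The keywords used are standard JSON Schema, but which keywords and layout Blameless accepts is an assumption here, so compare the output with the default schema shown in your own Settings before using it.

```python
import json

# Hypothetical custom questions expressed with standard JSON Schema keywords.
# Which keywords Blameless supports is an assumption; verify against your Settings page.
custom_questions_schema = {
    "type": "object",
    "properties": {
        "customer_impact": {
            "title": "How severe was the customer impact?",
            "type": "string",
            "enum": ["None", "Minor", "Major"],
        },
        "minutes_to_detect": {
            "title": "How many minutes passed before the issue was detected?",
            "type": "integer",
            "minimum": 0,
        },
        "caused_by_deploy": {
            "title": "Was the incident caused by a deployment?",
            "type": "boolean",
        },
    },
    "required": ["customer_impact"],
}

# Serialize to JSON text that could be pasted into a schema editor.
schema_json = json.dumps(custom_questions_schema, indent=2)
print(schema_json)
```

Typed fields such as enumerations, integers, and booleans tend to be easier to query later than free-text answers.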