This guide is intended to provide you with the knowledge to get up and running quickly using Blameless. This guide assumes that all integrations have already been set up during the onboarding process. If you have any additional questions, please feel free to contact your account executive or email us at email@example.com for more information.
How to Start an Incident
There are three ways to start a Blameless Incident:
- Through the Blameless web UI.
- Through Bots (Slack and Microsoft Teams) via the bot commands.
- Via webhooks from external sources.
Blameless web UI
The required fields in the Blameless UI are the same fields required when creating an incident through the Blameless Slack Bot.
With the Blameless bot installed, users can start an Incident through Slack by running Blameless slash commands from the designated channel or from any channel the bot has been invited to. Users can also start an Incident through the Microsoft Teams Bot, which lets you manage and orchestrate your incident resolution workflow natively in Microsoft Teams.
When the user runs /blameless start incident, a modal appears asking the user to select an incident type and severity level and to add a brief description of the incident. Any customizations for the selected Incident Type and Severity are populated in the incident creation modal.
If at any point you’re unsure what the slash commands are, type /blameless and press ENTER to see the commands you can run.
Once an incident has been created, a common best practice is to start assembling an incident response team. The Blameless tool contains roles that can be assigned to users in your Slack channel; for example, you can assign someone to be an Incident Commander. Roles are assigned with a Blameless Slack bot slash command.
Running /blameless assign opens a modal asking the user to select an incident role and the user who will be assigned to it. Any customizations from Settings for Incident Role are populated in the modal.
Highlighting Slack Messages
A key feature of Blameless is highlighting important Slack messages so that they are automatically added to an event timeline.
There are two ways for users to highlight important messages. The first way is by adding one of the following emoji reactions to the message you’d like to capture:
The above bullet list contains the codes to use for the emojis. Refer to the following figure for the actual icons.
The second way to highlight in Slack is by clicking the More Actions option for the message you’d like to capture. From there, you’ll see the Capture Highlight feature provided by the Blameless bot.
Capturing highlights through Slack adds these important messages and images to your events timeline, making it easier for new participants in an incident to get caught up.
Actions such as assigning roles and updating the status of the incident are automatically captured in the events timeline. Learn more about the event timeline here.
Changing an Incident’s Status
You can move a Blameless incident through its phases by updating its status. This can be done through the Blameless web UI or through the Blameless Slack bot. Each status update presents a new set of tasks to complete for the phase the incident has moved into.
Changing the incident type during an incident
If you change the incident type during an incident, the following will occur:
- You will receive a warning window asking you to confirm that you wish to make the change.
- The timeline will be updated to indicate when the type transition occurred.
- The Incident remains in the same state it was in when the type was changed.
- All tasks completed up to that point remain complete, and the new tasks are then presented.
- The Timeline is preserved.
- The Postmortem and tags are overwritten.
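The effects listed above can be sketched as a small data model. This is an illustrative Python sketch only, not Blameless code: the `Incident` class, its field names, and the `change_incident_type` helper are hypothetical, and it assumes that "overwritten" means the postmortem and tags are replaced by whatever the new incident type defines.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class Incident:
    """Hypothetical model of the incident fields mentioned in this guide."""
    incident_type: str
    status: str
    timeline: List[str] = field(default_factory=list)
    completed_tasks: List[str] = field(default_factory=list)
    pending_tasks: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    postmortem_template: str = ""


def change_incident_type(
    incident: Incident,
    new_type: str,
    new_tasks: List[str],
    new_tags: List[str],
    new_postmortem_template: str,
) -> Incident:
    """Return a copy of the incident after a type change.

    Status and completed tasks are carried over unchanged, the timeline
    keeps its entries and gains one for the transition, and the postmortem
    and tags are replaced (assumed here to come from the new type).
    """
    event = f"Incident type changed from {incident.incident_type} to {new_type}"
    return replace(
        incident,
        incident_type=new_type,
        timeline=incident.timeline + [event],
        pending_tasks=list(new_tasks),
        tags=list(new_tags),
        postmortem_template=new_postmortem_template,
    )
```

In this sketch the caller supplies the new tasks, tags, and postmortem template; in the product these would come from the settings configured for the new incident type.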
Running /blameless status opens a modal asking the user to select a status for the incident. Users can also run /blameless set status to <investigating|identified|monitoring|resolved> to update the status directly.
You can also update the status via the Blameless web UI by clicking the status dropdown menu in the upper-left corner of the incident’s page.
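If you script or template these commands (for example, in a runbook), a small helper can guard against typos in the status name. The following is a minimal Python sketch: the helper name `build_set_status_command` is hypothetical, and the only facts taken from this guide are the command text and the four status values.

```python
# Statuses accepted by `/blameless set status to <...>` as listed in this guide.
VALID_STATUSES = ("investigating", "identified", "monitoring", "resolved")


def build_set_status_command(status: str) -> str:
    """Return the slash command text for a status (hypothetical helper)."""
    normalized = status.strip().lower()
    if normalized not in VALID_STATUSES:
        raise ValueError(
            f"unknown status {status!r}; expected one of {', '.join(VALID_STATUSES)}"
        )
    return f"/blameless set status to {normalized}"


if __name__ == "__main__":
    # Prints the text a user would type into the incident channel.
    print(build_set_status_command("Monitoring"))
```

The helper only builds the text to type in Slack; it does not call any Blameless or Slack API.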
It is important to keep participants and stakeholders up to date with the progress of the incident by maintaining its status.
Microsoft Teams Bot
The Blameless platform offers a Microsoft Teams integration that allows you to manage and orchestrate your incident resolution workflow natively in Microsoft Teams.
The user can install the Blameless application from two locations:
- Installation from the Microsoft AppSource (preferred method for prospective Blameless customers)
- Installation directly from the Microsoft Teams client (preferred method for existing Blameless customers)
Installation, however, depends on your administrator's application installation policy.
Consult your administrator and refer to the Microsoft Teams Application Policy for more information.
Once the Blameless Microsoft Teams bot is installed, you can start an incident in Teams by mentioning the bot @blameless from any channel that the Blameless Bot has access to.
When mentioned, the bot provides a list of command options, including starting an incident and showing recent incidents.
Starting an Incident
- Click on "Start Incident". The bot will then prompt you to specify a type, severity, and description.
Once the incident starts, the bot will create a dedicated incident channel and display a summary.
Incident Management via the Blameless Bot
After creating an incident, you can work through and manage it with the help of the bot, including tasks, roles, incident lifecycle, data collection, and more.
The dedicated incident channel will be populated with the incident summary for immediate context. The incident creator will then be prompted with some suggested actions to start managing the incident.
In addition to the initial suggestions, the Blameless Bot provides several commands to facilitate workflows during the incident.
Assign Incident Roles
Users can assign roles by selecting a role/user combination from the dropdowns. Upon role assignment, the bot will show a success message and display the tasks list for the current incident phase.
Set Incident Status
Users can move the incident through the various phases by selecting the incident phase they'd like to transition to.
Set Incident Type
Users can set the incident type by selecting a new incident type.
Set Incident Severity
Users can change the incident severity by selecting a new severity.
Refer to the Microsoft Teams Bot Incident Management section for more information.
Incidents via webhooks from external sources
Assigning Postmortem Roles
Postmortems are a great opportunity to learn from an incident, and Blameless makes it easy and quick to complete postmortems through its robust set of features. The first step in completing a postmortem is assigning roles to a postmortem. Blameless allows you to assign three roles: Owner, Author, and Reviewer.
Running /blameless assign postmortem opens a modal asking the user to select a postmortem role and a user to assign to it. Users can also run /blameless assign postmortem <user> <owner|authors|reviewers> to assign a user to a role directly.
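Note that the direct command uses the tokens owner, authors, and reviewers, while the roles are named Owner, Author, and Reviewer. The Python sketch below shows one way to map between the two when generating the command text. The helper name and the "@alice" user format are assumptions for illustration; this guide does not specify how <user> must be written.

```python
# Role name -> token used by `/blameless assign postmortem <user> <...>`.
ROLE_TOKENS = {"owner": "owner", "author": "authors", "reviewer": "reviewers"}


def build_assign_postmortem_command(user: str, role: str) -> str:
    """Return the slash command text for a postmortem role (hypothetical helper)."""
    user = user.strip()
    if not user:
        raise ValueError("user must not be empty")
    key = role.strip().lower()
    # Accept either the role name ("author") or the command token ("authors").
    if key in ROLE_TOKENS:
        token = ROLE_TOKENS[key]
    elif key in ROLE_TOKENS.values():
        token = key
    else:
        raise ValueError(f"unknown postmortem role {role!r}")
    return f"/blameless assign postmortem {user} {token}"


if __name__ == "__main__":
    print(build_assign_postmortem_command("@alice", "Author"))
```

As with any generated command, the resulting text still has to be entered in Slack by a user.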
Currently, the web UI only allows users to assign a postmortem owner; future releases will enable users to assign all roles via the Blameless web UI.
Marking a Postmortem as Published
Once a postmortem has been reviewed and approved by the appropriate parties, you can update the status of the postmortem to Published.
To publish a postmortem, navigate to the postmortem for the incident. Once you’re in the appropriate postmortem, click on the postmortem state dropdown and select “Published”. Now that you’ve published the postmortem, the incident Slack channel will be archived and the data from the postmortem can be used in our Reliability Insights module.
Blameless lets customers configure a range of settings so that organizations can tailor the platform to meet their needs. To start, you’ll need to navigate to the Settings page on the side menu bar.
Incident Workflow Configuration
Users can define different workflows per incident type and severity. This allows you to configure settings related to incidents, postmortems, and integrations for different use cases.
By default, all incident types in Blameless come with four severity labels. These default labels are SEV0, SEV1, SEV2, and SEV3. The labels can be renamed to match your internal naming scheme from Organization settings.
Whenever an incident type is created, additional customization options are enabled. You can customize the incident type by selecting it from the Incident Workflows list of types. Some of the additional configurations for incident types include:
- Incident Naming Scheme
- Slack Groups/User Invited to Incident Types for each Severity
- Slack Channels for Incident Announcements
- Tasks for each status
- Incident tag categories
- Analysis template and custom postmortem questions
- Custom postmortem questions are defined using JSON schema
- Postmortem completion SLO in days
- Requiring postmortems
- PagerDuty integration settings
- Jira integration settings
Once you have adjusted your settings, make sure to save them; otherwise, your modifications will be lost as soon as you leave this section.
Users can customize postmortems to adapt the questions to their process and derive better insights and more meaningful data.
When configuring a Postmortem, you can customize two areas: the Analysis Template and the Custom Postmortem Questions.
The Analysis Template is often used to note the qualitative insights that the incident team discusses. Questions that users commonly add to the template include: What went well? What didn’t go well? What can we improve?
The Custom Questions setting allows you to ask more quantitative questions that are normally answered as part of the Postmortem review process. Answers to the questions in this section are captured as data that can later be queried in the Reliability Insights module.
Custom questions are configured using JSON schema, which allows a great deal of flexibility, and can be configured per incident type and severity in Settings.
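To give a sense of what such a definition might look like, the Python sketch below builds a small JSON Schema document and serializes it to text. It uses only standard JSON Schema keywords (type, properties, required, enum, title, minimum). The question names are hypothetical, and the exact shape that the Blameless Settings page accepts is not described in this guide, so treat this as an illustration rather than a template to paste in.

```python
import json

# Hypothetical custom questions expressed with standard JSON Schema keywords.
custom_questions_schema = {
    "type": "object",
    "required": ["customer_impact"],
    "properties": {
        "customer_impact": {
            "type": "string",
            "title": "How were customers impacted?",
            "enum": ["none", "partial", "full"],
        },
        "time_to_detect_minutes": {
            "type": "integer",
            "title": "Minutes from start of impact to detection",
            "minimum": 0,
        },
        "rollback_performed": {
            "type": "boolean",
            "title": "Was a rollback performed?",
        },
    },
}


def missing_required(schema: dict) -> list:
    """Return required question names that have no matching property."""
    properties = schema.get("properties", {})
    return [name for name in schema.get("required", []) if name not in properties]


# Serialize to the JSON text you would review before pasting into Settings.
schema_text = json.dumps(custom_questions_schema, indent=2)

if __name__ == "__main__":
    print(schema_text)
    print("Missing required definitions:", missing_required(custom_questions_schema))
```

Typed answers such as enums, integers, and booleans are what make the responses easy to query later, which is why the sketch avoids free-text fields.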