This document describes how to set up, enable, and configure the Blameless integration with your PagerDuty account.
Introduction
Blameless offers bi-directional integration with PagerDuty, allowing the following manual and automated operations:
Capabilities | Slack | Microsoft Teams | Configuration Option |
Manually trigger PagerDuty alerts from a Blameless incident by selecting one or more PagerDuty services, and/or PagerDuty responders, and/or PagerDuty escalation policies. | X | X (only services) | Global |
Automatically trigger one or more PagerDuty alerts at the creation of an incident. | X | X | Per incident type |
Automatically create Blameless incidents from PagerDuty. | X | X | Global |
Automatically invite paged PagerDuty users to the incident channel in Slack, or to the group chat in Microsoft Teams. | X | X | Global |
Automatically add PagerDuty services to the list of impacted services of a Blameless incident. | X | X | |
Show PagerDuty users who are on-call for a given service. | X | X | |
Show the details of a PagerDuty escalation policy for a given service. | X | X | |
Escalate an alert to the next escalation level for a given service. | X | n/a | |
Additional built-in capabilities
- Report the health status of the Blameless/PagerDuty integration in the Blameless web UI
- Capture multiple critical PagerDuty events in the Blameless incident timeline for future analysis during Retrospectives and Reliability Insights Dashboards
- Report statistics on PagerDuty events via Reliability Insights
Note: Blameless can integrate with either PagerDuty or Opsgenie, but not both at the same time. If Opsgenie is currently enabled and you want to integrate with PagerDuty instead, the Opsgenie integration must be disabled before the PagerDuty integration is enabled. Please contact Blameless support to make this change.
Getting started
Prior to enabling the integration with PagerDuty, contact the administrator for your PagerDuty account to acquire a PagerDuty API Access Key. This access key should be created specifically for the integration with Blameless.
Follow these steps to configure and enable the integration with PagerDuty:
- Get a valid PagerDuty API Access Key from your PagerDuty account under https://<your_instance_name>.pagerduty.com/api_keys
- In your Blameless account, go to the Settings > Integrations > Alerting page and click “Manage” to open the PagerDuty integration settings
- Enter the above PagerDuty API Access Key in the PagerDuty API Key field
- Toggle the "Enable PagerDuty" slider to enable the integration
- Click on "Save"
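Before entering the key in Blameless, it can help to confirm that PagerDuty accepts it. The following is a minimal sketch, not part of Blameless: it assumes the key is a PagerDuty REST API key and calls the public PagerDuty REST API at https://api.pagerduty.com using the `Authorization: Token token=...` header. The helper names `build_request` and `key_is_accepted` are illustrative.

```python
import urllib.error
import urllib.request

PAGERDUTY_API = "https://api.pagerduty.com"  # public PagerDuty REST API base URL


def build_request(api_key: str) -> urllib.request.Request:
    """Build a small authenticated request that lists at most one service."""
    return urllib.request.Request(
        f"{PAGERDUTY_API}/services?limit=1",
        headers={
            "Authorization": f"Token token={api_key}",
            "Accept": "application/vnd.pagerduty+json;version=2",
        },
    )


def key_is_accepted(api_key: str, opener=urllib.request.urlopen) -> bool:
    """Return True on HTTP 200, False when PagerDuty rejects the key (401/403)."""
    try:
        with opener(build_request(api_key), timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            return False
        raise  # other errors (rate limiting, outage) say nothing about the key
```

Calling `key_is_accepted("<your_api_key>")` performs one read-only request. The `opener` parameter exists only so the function can be exercised without network access.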
Integration health status
To check the health status of the integration with PagerDuty, click the bell icon on the left main bar in the Blameless web UI, then click the “Health Check” tab. The displayed status reflects the health of the integration only at the moment the Notifications panel is opened.
After the integration has been set up, its status can become unhealthy for several reasons. See the Troubleshooting Guide section below for ways to restore the integration with PagerDuty.
Setting the default alert requester
The default alert requester is a PagerDuty user that already exists in your PagerDuty account.
Blameless highly recommends using a special account known as a “service account”. This account will be dedicated to the integration with Blameless and has the least chance of accidentally being deleted from your PagerDuty account.
When a PagerDuty alert is triggered manually or automatically from Blameless, PagerDuty expects a PagerDuty user to be identified as the creator of the alert. If the creator of the incident (for automatically triggered alerts) or the user who triggered the alert manually from Slack or Microsoft Teams does not have a PagerDuty user identity matching their email, Blameless uses the default alert requester as the fallback PagerDuty user to create the alert in PagerDuty.
Note: All alerts triggered in PagerDuty must be associated with a PagerDuty user, which is why a default alert requester account must be configured.
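The fallback rule described above can be sketched as a small function. This is an illustration of the logic only, not Blameless's implementation: the function name `resolve_requester`, its parameters, and the case-insensitive email comparison are all assumptions.

```python
from typing import Optional, Set


def resolve_requester(
    initiator_email: Optional[str],
    pagerduty_emails: Set[str],
    default_requester: str,
) -> str:
    """Pick the PagerDuty user email to record as the creator of the alert."""
    if initiator_email:
        normalized = initiator_email.strip().lower()
        # Assumption: emails are compared case-insensitively.
        if normalized in {email.lower() for email in pagerduty_emails}:
            return normalized
    # No matching PagerDuty identity: fall back to the default alert requester.
    return default_requester
```

For example, with `pagerduty_emails = {"ana@example.com"}` and a default requester of `svc-blameless@example.com` (both hypothetical), an alert triggered by `bob@example.com` is attributed to the service account.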
Triggering Alerts
Out of the box, incident responders working with the Blameless app in Slack or Microsoft Teams can manually trigger alerts by selecting one or more PagerDuty services.
To also let incident responders trigger alerts by selecting PagerDuty responders and/or PagerDuty escalation policies (supported only with the Blameless app in Slack), enable those two options in the PagerDuty integration settings page in Blameless.
Note that your PagerDuty account must be on the Business or Digital Operations plan for Blameless users in Slack to trigger alerts by selecting responders or escalation policies.
Once either or both options are enabled, Blameless users working in Slack are presented with one or more additional drop-downs (Responders and/or Escalation Policies) in the pop-up window.
Integration with the PagerDuty service catalog
After the integration with PagerDuty is enabled in Blameless, Blameless automatically pulls the list of PagerDuty services from your PagerDuty account and refreshes the list every 3 hours.
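To see roughly what such a synchronization involves, the sketch below collects a full service list using the offset pagination of the PagerDuty REST API, where each page of `GET /services` carries a `services` array and a `more` flag. It is an assumption about the general approach, not a description of Blameless internals. `fetch_page` is a hypothetical callable that performs the HTTP request for a given offset and limit and returns the decoded JSON.

```python
from typing import Callable, Dict, List

SYNC_INTERVAL_SECONDS = 3 * 60 * 60  # refresh interval stated in this guide


def list_all_services(
    fetch_page: Callable[[int, int], Dict], limit: int = 100
) -> List[Dict]:
    """Collect every service by following offset pagination until `more` is false."""
    services: List[Dict] = []
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        services.extend(page.get("services", []))
        if not page.get("more"):
            return services
        offset += limit
```

Passing the page fetcher in as a parameter keeps the pagination logic independent of any particular HTTP client.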
Note: As a result of this automated synchronization, the import services button is no longer in use and will be deprecated shortly.
Webhooks
Webhooks can be configured in PagerDuty to create a Blameless incident each time a new PagerDuty alert is triggered from PagerDuty (instead of from Blameless).
The management of Webhooks in PagerDuty is centralized under a Generic Webhook v3 subscription service.
Adding Blameless Webhooks to PagerDuty
After you have configured and enabled the integration with PagerDuty following the above instructions, follow these steps to add a Blameless webhook to PagerDuty:
- In your Blameless account, go to the Settings > Integrations > Alerting page and click “Manage” to open the PagerDuty integration settings.
- Click on “Create Webhook”.
- Complete the form:
  - Select V3 as the webhook version.
  - Provide a webhook description to recognize that it was created from Blameless.
  - Select the PagerDuty service for which it should be used.
  - Select the incident type to be used when automatically creating the Blameless incident.
  - Select the default severity as the initial severity to be used when creating the Blameless incident.
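The form fields above determine how an incoming PagerDuty event becomes a Blameless incident. The sketch below illustrates that idea under stated assumptions: the input follows the PagerDuty v3 webhook payload shape (an `event` object with `event_type` and `data`), while the function name and the keys of the returned dictionary are hypothetical and do not reflect the Blameless API.

```python
import json
from typing import Dict, Optional


def incident_from_webhook(
    body: str, incident_type: str, default_severity: str
) -> Optional[Dict]:
    """Map a v3 `incident.triggered` event to the fields of a new incident."""
    event = json.loads(body).get("event", {})
    if event.get("event_type") != "incident.triggered":
        return None  # only newly triggered PagerDuty incidents open an incident here
    data = event.get("data", {})
    return {
        "title": data.get("title"),
        "pagerduty_incident_id": data.get("id"),
        "pagerduty_service_id": data.get("service", {}).get("id"),
        "type": incident_type,         # incident type chosen in the webhook form
        "severity": default_severity,  # default severity chosen in the webhook form
    }
```

Events of other types, such as acknowledgements or resolutions, return `None` in this sketch because they relate to an incident that already exists.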
PagerDuty events in incident timelines
Blameless records the following critical PagerDuty actions in incident event timelines:
- Triggered PagerDuty alerts
- Acknowledged PagerDuty incidents (related)
- Unacknowledged PagerDuty incidents
- Escalated incidents
- Changed incident priorities
- Changed incident status (resolved)
Each PagerDuty action in the Blameless incident timeline includes a link back to the related PagerDuty alert or incident and the name of the responder who triggered the action in PagerDuty. This lets you correlate who did what, and when.
Reporting PagerDuty action statistics using Reliability Insights
All of the above PagerDuty actions captured in Blameless can be analyzed through queries and tiles in Reliability Insights to extract critical insights and trends.
As you build new tiles (tables or graphs), you can view and filter PagerDuty events by selecting the event types that start with the word PAGER:
- PAGER_TRIGGERED
- PAGER_INCIDENT_RESPONDER_ADDED
- PAGER_INCIDENT_PRIORITY_CHANGED
- PAGER_INCIDENT_ACKNOWLEDGED
- PAGER_INCIDENT_RESOLVED
- PAGER_INCIDENT_ESCALATED
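The same prefix filter can be applied to exported event data. The sketch below assumes each event is a dictionary with a `type` key holding one of the event types listed above. That record shape is hypothetical, since Reliability Insights performs this filtering in its own query interface.

```python
from collections import Counter
from typing import Dict, Iterable


def count_pager_events(events: Iterable[Dict]) -> Counter:
    """Count events per type, keeping only types that start with PAGER."""
    return Counter(
        event["type"]
        for event in events
        if event.get("type", "").startswith("PAGER")
    )
```

The resulting counter maps each PAGER event type to its number of occurrences, which is the kind of aggregate a table or bar-chart tile would display.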
Troubleshooting Guide
If you encounter issues with the PagerDuty integration, refer to the following troubleshooting guide:
No PagerDuty alert is created
- Ensure that you have a stable internet connection.
- Ensure that the PagerDuty integration is enabled.
- Ensure that the API key is valid and has the right permission (Global vs. user).
- Check that the service for which the alert was triggered exists and is enabled.
- Check that the email of the user who triggered the alert matches a valid PagerDuty user email, or that a default alert requester is selected in the settings.
Unable to pull PagerDuty services when creating a new incident from Slack
- Ensure that you have a stable internet connection.
- Ensure that the PagerDuty integration is enabled in the settings.
- Ensure that the API key is valid and has the right permission (Global vs. user).
Events are not showing up in the incident timeline
- Ensure that the PagerDuty integration is enabled and the API key is valid.
- Check that the primary webhook was created and is enabled in PagerDuty.
  - Note that an account-level webhook is automatically created on saving the PagerDuty settings if it doesn’t exist.
  - You may remove the webhook and re-save the PagerDuty settings to create a new one.
  - If the webhook was not created, you may have reached the limit of 10 account-level webhooks and may need to delete some of them.
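To check whether the account-level webhook limit has been reached, you can count the existing subscriptions. The sketch below assumes a list of webhook subscription objects shaped like those returned by the PagerDuty `GET /webhook_subscriptions` endpoint, where account-scoped subscriptions carry a `filter` of type `account_reference`. The helper names are illustrative, and the limit of 10 is taken from this guide.

```python
from typing import Dict, List

ACCOUNT_WEBHOOK_LIMIT = 10  # limit stated in this guide


def account_level_webhooks(subscriptions: List[Dict]) -> List[Dict]:
    """Keep only subscriptions scoped to the whole account."""
    return [
        sub
        for sub in subscriptions
        if sub.get("filter", {}).get("type") == "account_reference"
    ]


def at_account_limit(subscriptions: List[Dict]) -> bool:
    """Return True when no further account-level webhook can be added."""
    return len(account_level_webhooks(subscriptions)) >= ACCOUNT_WEBHOOK_LIMIT
```

If `at_account_limit` returns `True`, deleting unused account-level subscriptions in PagerDuty and re-saving the PagerDuty settings in Blameless should allow the webhook to be created.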
Services recently added to PagerDuty service catalog are not appearing in Blameless
- Make sure the services in question exist in PagerDuty and are enabled.
- Allow up to 3 hours for the automatic synchronization to pick up newly added services.
The health status of the integration for PagerDuty is not healthy
The integration can become unhealthy for several reasons:
- Your internet connection is unstable. Ensure that you have a stable internet connection.
- PagerDuty itself is down, or was temporarily down, and is therefore inaccessible to Blameless. Check PagerDuty’s status page for any service interruptions: https://status.pagerduty.com/
- The PagerDuty API Access Key provided in the PagerDuty integration settings page in Blameless is no longer valid. Check whether the API Access Key that was originally created in PagerDuty is still present, or create a new API Access Key and apply it in the PagerDuty integration settings page in Blameless (see the Getting started section for more information).
- The PagerDuty integration has been disabled in the PagerDuty integration settings page in Blameless.
For further assistance, contact Blameless support.