Triggers let you receive notifications when your data in Honeycomb crosses the thresholds you configure. The graph on which to alert is as flexible as a Honeycomb query, which helps reduce false positives due to known errors.
When a trigger fires, you’ll be notified via the configured method. PagerDuty, Slack, and email are currently supported. The notification includes a link back to the graph showing the current status, which gives you a jumping-off point for further investigation.
Triggers can be scheduled to run at any whole-minute interval, from once per minute to once per day (1440 minutes). The frequency also determines the time range examined by each execution of the trigger.
For example: a trigger every 15 minutes will examine the past 15 minutes of activity to compute the values against which the thresholds are measured.
Important: trigger frequency must be specified in whole minutes, from 1 to 1440. Decimal values are truncated to the preceding full minute. For example, “3.6” becomes “3”.
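The frequency rules above can be sketched in a few lines of Python. This is only an illustration, not Honeycomb’s implementation: the function names `normalize_frequency` and `query_window` are hypothetical, and rejecting values that truncate below 1 is an assumption.

```python
import math
from datetime import datetime, timedelta
from typing import Tuple

MIN_FREQUENCY_MINUTES = 1
MAX_FREQUENCY_MINUTES = 1440  # once per day


def normalize_frequency(minutes: float) -> int:
    """Truncate to whole minutes, then check the 1 to 1440 range."""
    whole = math.trunc(minutes)  # 3.6 -> 3
    # Assumption: a value that truncates outside the range is rejected.
    if not MIN_FREQUENCY_MINUTES <= whole <= MAX_FREQUENCY_MINUTES:
        raise ValueError(f"frequency must be 1-1440 minutes, got {minutes}")
    return whole


def query_window(run_at: datetime, minutes: float) -> Tuple[datetime, datetime]:
    """Return the (start, end) range examined by a run at `run_at`."""
    whole = normalize_frequency(minutes)
    return run_at - timedelta(minutes=whole), run_at


start, end = query_window(datetime(2024, 1, 1, 12, 0), 15)
print(start, end)  # the 15 minutes leading up to the run
```

A trigger with a 15 minute frequency that runs at 12:00 therefore looks at 11:45 through 12:00.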
For this example, we want to know whenever the 95th percentile of our API server’s request durations exceeds 30ms, but we want to exclude the /poll endpoint because its long-held connections would pollute the data with artificially high values.
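To make the condition concrete, here is a minimal Python sketch of what this trigger measures. The event fields `endpoint` and `duration_ms`, the sample data, and the nearest-rank percentile method are all assumptions chosen for illustration; Honeycomb’s own percentile calculation may differ.

```python
import math
from typing import Dict, List

THRESHOLD_MS = 30.0


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def p95_excluding_poll(events: List[Dict]) -> float:
    """P95 of duration_ms, filtering out the /poll endpoint."""
    durations = [e["duration_ms"] for e in events if e["endpoint"] != "/poll"]
    return percentile(durations, 95)


# Hypothetical events from one frequency window.
events = [
    {"endpoint": "/users", "duration_ms": 12.0},
    {"endpoint": "/users", "duration_ms": 18.0},
    {"endpoint": "/orders", "duration_ms": 25.0},
    {"endpoint": "/orders", "duration_ms": 41.0},
    {"endpoint": "/poll", "duration_ms": 30000.0},  # long-held connection
]

p95 = p95_excluding_poll(events)
should_alert = p95 > THRESHOLD_MS
print(p95, should_alert)
```

Without the filter, the single /poll event would dominate the percentile and the trigger would fire for a reason unrelated to real API latency.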
Start the trigger creation process by constructing the query on which you wish to configure a trigger.
From the Graph Settings tab, select “Make Trigger”.
Fill in the details for the trigger. Both the Name and Description will be included in notifications about the trigger. Make sure the name describes clearly what has happened, while the description should indicate next steps or include links back to documentation.
The threshold indicates what condition generates a notification, and the frequency determines how often to check for that condition. Consider what is normal within your frequency window so notifications only capture conditions worth alerting on.
The sample graph displays the most recent 16 periods for your query (with a granularity equal to the period length) to help choose an appropriate threshold. For example, the graph for a 5 minute frequency shows the previous 80 minutes with a 5 minute granularity.
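The relationship between frequency and the sample graph is simple arithmetic, sketched below. The function name `sample_graph_span` is hypothetical; the 16-period figure comes from the description above.

```python
from typing import Tuple

SAMPLE_PERIODS = 16  # number of periods shown in the sample graph


def sample_graph_span(frequency_minutes: int) -> Tuple[int, int]:
    """Return (total minutes shown, granularity in minutes)."""
    return SAMPLE_PERIODS * frequency_minutes, frequency_minutes


span, granularity = sample_graph_span(5)
print(span, granularity)  # 80 minutes shown at 5 minute granularity
```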
Recipients are the targets the trigger will notify when the measured value crosses the configured threshold. Recipients are configured on a per-trigger basis but exist on a team basis—any recipients you configure for a trigger will be available for all other triggers configured in datasets under the same team.
Choose an existing recipient or create a new one. Honeycomb remembers recipients entered for any trigger within your team, so you only have to enter your PagerDuty API key or Slack webhook URL once. From then on, they’ll be available to choose when building a new trigger.
Three types of recipients exist: PagerDuty, Slack, and email. This shows the trigger configured to send a Slack PM to @ben and an email to
Get your webhook URL by going to Slack’s Incoming Webhooks documentation and clicking on the cue to set up an “incoming webhook integration”. This will bring you to a page that lets you configure new integrations for your Slack organization.
Choose a default channel, though you will be able to override this channel when configuring a recipient for each trigger. When you submit, you’ll be handed a URL that looks something like
After you’ve configured the Slack recipient with your webhook URL, you can send alerts to a different channel by choosing Add new channel from the + Slack button, then specifying it as #channel, or to an individual via private message.
PagerDuty’s API Integration docs describe how to create a generic API integration to PagerDuty. Following those steps will give you an API key that you’ll enter in the Triggers form.
When you save the trigger, it becomes active immediately and will first run at the next frequency interval (for example, after 5 minutes for a 5 minute frequency).
You can see a list of configured triggers from the Overview page. Click Overview in the top nav bar, and then Triggers in the left sidebar when looking at your dataset.
You’ll see a full list of the triggers active for the dataset. You’ll be able to click through and edit each trigger from this list.
To delete a trigger, scroll to the bottom of the edit page.
Queries done for triggers run your selected calculation over your configured frequency period.
COUNTs, for example, will be the total count for that period, not a per-second count. Averages and percentiles likewise cover the entire period, so to detect spikes it is better to use MAX instead of AVG over a period. Another alternative is to use a COUNT with a filter restricting the set to the threshold you’re interested in: for example, you could count the number of events over 100ms and set a threshold on that count instead of asking for the average to exceed 100ms.
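A small Python sketch shows why AVG can hide a spike that MAX or a filtered COUNT catches. The durations are made-up sample data for one frequency window, and the 100ms threshold matches the example above.

```python
from statistics import mean

THRESHOLD_MS = 100.0

# Hypothetical window: 98 fast requests and 2 very slow ones.
durations_ms = [20.0] * 98 + [900.0, 1200.0]

avg = mean(durations_ms)                                      # like AVG
peak = max(durations_ms)                                      # like MAX
slow_count = sum(1 for d in durations_ms if d > THRESHOLD_MS)  # filtered COUNT

print(f"AVG={avg:.1f}ms  crosses threshold: {avg > THRESHOLD_MS}")
print(f"MAX={peak:.1f}ms  crosses threshold: {peak > THRESHOLD_MS}")
print(f"events over {THRESHOLD_MS:.0f}ms: {slow_count}")
```

Here the average stays well under 100ms even though two requests took around a second, while the maximum and the filtered count both surface the slow requests.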
>100ms) with a
COUNT. Your result will be the number of events that exceed your threshold.
P99 calculations. These will be more representative of the majority of traffic than AVG, which can be polluted by large outliers.
== 500, use several filters to look for events that don’t have status codes 200, 301, 302, 404, etc.