Modernize your alerts with Service Level Objectives (SLOs)
Prioritize engineering time and get fast feedback on service reliability with actionable SLOs.
Prioritize incident response using error budgets with real impact
Honeycomb SLOs let you trigger alerts on the issues that matter most to the business and debug them quickly. They also help you answer important questions like, “How much monthly downtime is tolerable? How much can performance degrade before users are negatively affected? Should we focus on new features or tech debt?” Define, measure, validate, and adjust engineering priorities collaboratively across your org with SLOs. SLO error budgets give teams the leeway they need to prioritize or de-prioritize production issues. They’re also useful for communicating with business stakeholders.
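To make the error budget idea concrete, here is a minimal sketch of the arithmetic behind a question like "How much monthly downtime is tolerable?". It is an illustration only, not Honeycomb's implementation: the 99.9% target, the 30-day window, and the function names are assumptions chosen for the example.

```python
def allowed_downtime_minutes(target_pct: float, window_days: int = 30) -> float:
    """Minutes of full downtime a time-based target tolerates over the window."""
    window_minutes = window_days * 24 * 60
    return (100.0 - target_pct) / 100.0 * window_minutes


def budget_remaining(total_events: int, failed_events: int, target_pct: float) -> float:
    """Fraction of an event-based error budget still unspent (negative once overspent)."""
    # The budget is the number of failed events the target allows in the window.
    allowed_failures = (100.0 - target_pct) / 100.0 * total_events
    if allowed_failures <= 0:
        raise ValueError("no error budget: target is 100% or there are no events")
    return 1.0 - failed_events / allowed_failures


if __name__ == "__main__":
    # Assumed example values: a 99.9% target over a 30-day window.
    print(f"{allowed_downtime_minutes(99.9):.1f} minutes of downtime allowed")
    print(f"{budget_remaining(1_000_000, 250, 99.9):.0%} of the error budget left")
```

With a 99.9% target, a 30-day window tolerates roughly 43 minutes of downtime. Expressed per event, one million requests leave room for about 1,000 failures, so 250 failures spend about a quarter of the budget. How much budget remains is what informs the features-versus-reliability conversation.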
SLOs really tell you where to focus, based on what matters to customers.
(This) informs the team to make sound engineering decisions. We now can decide: do we work on availability of our services or do we release a new feature? It’s really that simple and so important. I love that SLOs are baked into the core product.
Director of Engineering
Actionable, explorable SLOs let you debug alerts without context-switching
Traditional SLO implementations falter in a few ways:
- They use coarse time series data that overemphasizes the impact of small issues
- They underestimate issues that don’t appear in aggregates
- They don’t actually help you identify and debug the issues impacting your SLO targets
Honeycomb SLOs use your highly granular event data to calculate availability based on how individual customers experience your services, so they don’t miss an event.
Our SLOs also provide a debuggable interface that lets engineering teams quickly dive in to figure out where issues are occurring and how to stop them without switching tabs. BubbleUp heatmaps are integrated directly into the UI to show you exactly which events are eating into your budget. No context-switching means you can stay focused on what matters.
Select suspicious performance bottlenecks on a heatmap and expose patterns with a histogram. Triage, identify, and resolve. Blazingly fast.
Your customers are unique: treat them like individuals
Honeycomb’s focus on understanding individual customer experiences is especially evident in our approach to SLOs.
Approaches that use time series data to measure availability are limited to aggregating all customers into one measure: for the second that just occurred, was the system “good” or was it “bad”? There may be hundreds or thousands of customer experiences buried within those aggregate time series measures that you just can’t see or respond to.
Our event-based approach means that every individual service request is evaluated against the service-level indicator (SLI) criteria you define. If even one request failed, while thousands of other simultaneous requests succeeded, you’ll know about it. When individual customer experiences matter to you, Honeycomb SLOs give you the understanding your business needs.
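The difference between the two approaches can be sketched in a few lines of Python. This is a simplified illustration, not Honeycomb's implementation: the `Event` fields, the SLI criteria in `is_good`, and the 1% per-second error-rate threshold are all hypothetical choices made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    timestamp_s: int    # second in which the request completed
    duration_ms: float
    status_code: int


def is_good(event: Event, max_duration_ms: float = 500.0) -> bool:
    """Hypothetical SLI: no server error and fast enough."""
    return event.status_code < 500 and event.duration_ms <= max_duration_ms


def event_based_compliance(events: list[Event]) -> float:
    """Share of individual requests that met the SLI."""
    if not events:
        raise ValueError("no events to evaluate")
    return sum(is_good(e) for e in events) / len(events)


def bucketed_compliance(events: list[Event], max_error_rate: float = 0.01) -> float:
    """Share of one-second buckets judged "good" from their aggregate error rate."""
    if not events:
        raise ValueError("no events to evaluate")
    buckets = defaultdict(list)
    for e in events:
        buckets[e.timestamp_s].append(e)
    good_buckets = sum(
        1
        for bucket in buckets.values()
        if sum(not is_good(e) for e in bucket) / len(bucket) <= max_error_rate
    )
    return good_buckets / len(buckets)


# Second 0: one failure hidden among 1,000 requests.
# Second 1: a single failed request during a quiet period.
events = [Event(0, 120.0, 200) for _ in range(999)]
events.append(Event(0, 120.0, 500))
events.append(Event(1, 120.0, 500))

if __name__ == "__main__":
    print(f"event-based: {event_based_compliance(events):.4f}")
    print(f"bucketed:    {bucketed_compliance(events):.4f}")
```

In this made-up data set, the per-second view marks second 0 as "good" even though one customer request failed, and marks all of second 1 as "bad" because of a single request, so it reports half of the time as unavailable. The event-based view counts exactly two failed requests out of 1,001.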