
Actionable Service Level Objectives (SLOs) Based on What Matters Most


Summary:



Welcome to The Authors’ Cut Series
Alerts are often based on whatever is easiest to measure, so without the right tools you may be measuring the wrong things. In this session, we discuss the dangers of alert fatigue, which monitoring-based alerting systems have normalized, and how combining SLOs with structured event data works better than relying on time-series data or aggregated counts. SLO-based alerting ensures you respond to what matters most to your customers and business.

Topics include:
- Using Service Level Objectives for Reliability. An introduction to SLO-based alerting built on observability data, which helps avoid alert fatigue and treats user experience as the north star. (Chapter 12)
- Acting On and Debugging SLO-Based Alerts. An introduction to error budgets, how to calculate error budget burn, and why observability data is necessary for the most accurate SLO calculations. (Chapter 13)
- A Honeycomb Demo. Honeycomb’s Pierre Tessier demos how to create actionable SLOs that go straight from alert to debugging the underlying events that created the alert.
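
For readers who want a concrete feel for the error budget arithmetic mentioned in the second topic, here is a minimal sketch in Python. It is not the book's or Honeycomb's implementation. The event shape, the 300 ms latency threshold, and every name below (`Event`, `is_good`, `budget_consumed`, `burn_rate`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Event:
    """One structured event per request (hypothetical shape)."""
    status_code: int
    duration_ms: float


def is_good(event: Event, max_duration_ms: float = 300.0) -> bool:
    """Assumed SLI: no server error and the response was fast enough."""
    return event.status_code < 500 and event.duration_ms <= max_duration_ms


def _count_bad(events: Iterable[Event]) -> tuple[int, int]:
    """Return (bad, total) for a collection of events."""
    items = list(events)
    bad = sum(1 for e in items if not is_good(e))
    return bad, len(items)


def budget_consumed(events: Iterable[Event], target: float) -> float:
    """Fraction of the error budget used over the whole SLO period.

    With a target of 0.99, 1% of events may be bad; that 1% is the budget.
    A result of 1.0 means the budget is exactly spent.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    bad, total = _count_bad(events)
    if total == 0:
        raise ValueError("no events in the SLO period")
    allowed_bad = (1.0 - target) * total
    return bad / allowed_bad


def burn_rate(recent_events: Iterable[Event], target: float) -> float:
    """Observed error rate in a recent window divided by the allowed rate.

    A burn rate of 1.0 would spend the budget exactly at the end of the
    SLO period; higher values would exhaust it sooner.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    bad, total = _count_bad(recent_events)
    if total == 0:
        return 0.0
    return (bad / total) / (1.0 - target)
```

As a usage example under these assumptions: with a 99% target, a period containing 1,000 events of which 5 are bad has consumed about half of its budget, and a recent window of 100 events with 4 bad ones is burning at roughly four times the sustainable rate. Because each event is judged individually, the calculation does not depend on pre-aggregated counters.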

About This Series
Welcome to The Authors’ Cut series. In writing the O’Reilly Observability Engineering book, our goal is to help you achieve production excellence, based on our experiences building and operating commercial SaaS products at scale, and as creators of observability tooling for high-performance engineering teams. These are interactive sessions led by authors Charity Majors, Liz Fong-Jones, and George Miranda where you’ll discuss concepts in the book, see how to apply them in Honeycomb, and get advice on strategy and implementation in your world.