Challenges with Implementing SLOs
A few months ago, Honeycomb released our SLO — Service Level Objective — feature to the world. We’ve written before about how to use it...
Observations on ARM64 & AWS’s Amazon EC2 M6g Instances
At re:Invent in December, Amazon announced the AWS Graviton2 processor and its forthcoming availability powering Amazon EC2 M6g instances. While the first-generation Graviton processor that...
Using Honeycomb to remember to delete a feature flag
Feature flags are great and serve us in so many ways. However, we do not love long-lived feature flags. They lead to more complicated code,...
How To Make Your Customers Happy, with Eaze
"Success is a catastrophe that you have to survive." (CJ Silverio) A couple of weeks ago I had the great pleasure of hosting CJ...
Working On Hitting a Release Cadence? CI/CD Observability Can Help You Get There
We recently sponsored our partner CloudBees' conference DevOps World & JenkinsWorld in San Francisco and our message “Observe how Customers Experience Your Build” resonated well...
Never Alone On Call
Does your organization have an on-call rotation? Several members of the Honeycomb engineering team recently hosted a live webcast about why they never feel alone...
Understand Your AWS Cost & Usage with Honeycomb
First published in August 2019. AWS bills are notoriously complicated, and the Amazon Cost Explorer doesn’t always make it easy to understand exactly where your...
Treading in Haunted Graveyards
Part 1: CI/CD for Infrastructure as Code. At Honeycomb, we've often discussed the value of making software deployments early and often, and being able to...
Incident Review: You Can't Deploy Binaries That Don't Exist
Between 22:50 and 22:54 UTC on July 9, our capacity to accept traffic to api.honeycomb.io gradually diminished until all incoming requests started to fail. 8...
Toward a Maturity Model for Observability
Access to observability is becoming critical to organizations shipping software, running modern infrastructures in production, and to understanding how users are experiencing their service. To...
Automating Collection of Troubleshooting Data with Triggers: a How-To Guide
Everyone wants to be more efficient -- to spend less time on the tedious things, and more time on the things that move the needle....
A New Bee's First Oncall
I'm Honeycomb's newest engineer, now on my eighth week at Honeycomb. Excitingly, I did my first week of oncall two weeks ago! Almost every engineer...
Postmortem: RDS Clogs & Cache-Refresh Crash Loops
On Thursday, October 4, we experienced a partial API outage from 21:02-21:56 UTC (14:02-14:56 PDT). Despite some remediation work, we saw a similar (though less...
How Honeycomb Uses Honeycomb Part 8: A Bee's Life
This post continues our dogfooding series from How Honeycomb Uses Honeycomb, Part 7: Measure twice, cut once: How we made our queries 50% faster…with data....
How Honeycomb Uses Honeycomb, Part 3: End-to-End Failures
At Honeycomb, one of our foremost concerns (in our product as well as our customers’) is reliability. To that end, we have end-to-end (e2e) checks...