Incident Review: You Can't Deploy Binaries That Don't Exist
Between 22:50 and 22:54 UTC on July 9, our capacity to accept traffic to api.honeycomb.io gradually diminished until all incoming requests started to fail. 8...
Automating Collection of Troubleshooting Data with Triggers: a How-To Guide
Everyone wants to be more efficient -- to spend less time on the tedious things, and more time on the things that move the needle....
Stop Your Database From Hating You With This One Weird Trick
Let's not bury the lede here: we use Observability-Driven Development at Honeycomb to identify and prevent DB load issues. Like every online service, we experience...
Anatomy of a Cascading Failure
In "Caches Are Good, Except When They Are Bad," we identified four separate problems that combined to cause a cascading failure in our API...
When In Doubt, Add More Spans: A Tale of Tracing and Testing In Production
Recently, Toshok was telling a story about the kind of thing he talks about a lot—improving the performance of some endpoint or page or other....
Incident Review: Caches are Good, Except When They Are Bad
Between Wednesday, April 17th and Friday, April 26th, Honeycomb had four separate periods of downtime affecting the Honeycomb API, resulting in approximately 38 minutes of...
A New Bee's First Oncall
I'm Honeycomb's newest engineer, now on my eighth week at Honeycomb. Excitingly, I did my first week of oncall two weeks ago! Almost every engineer...
Tracing and Observability for Background Jobs
Illuminating the under-loved with Honeycomb. Most modern web apps end up sprouting some subset of tasks that happen in the “background”, i.e., when a user...
Support Your Customers More Effectively with Honeycomb
Customer success can be a serious differentiator and competitive advantage for companies today. Everyone wants to ship quality products to their customers faster, and the...
Heatmaps Make Ops Better
In this blog miniseries, I'd like to talk about how to think about doing data analysis "the Honeycomb way." Welcome to part 1, where I...
Postmortem: RDS Clogs & Cache-Refresh Crash Loops
On Thursday, October 4, we experienced a partial API outage from 21:02-21:56 UTC (14:02-14:56 PDT). Despite some remediation work, we saw a similar (though less...
Power to the People: Control Your Own Trigger Destiny with Webhooks
When we release something new, whether it's a new SDK or Beeline or a new feature in the UI, we'll often set a Honeycomb Trigger...
Level Up with Derived Columns: Two Neat Tricks That Will Improve Your Observability
When we released derived columns last year, we already knew they were a powerful way to manipulate and explore data in Honeycomb, but we didn’t realize just...
Level Up With Derived Columns: Bucketing Events For Comparison
When we released derived columns last year, we already knew they were a powerful way to manipulate and explore data in Honeycomb, but we didn’t...
There And Back Again: A Honeycomb Tracing Story
In our previous post about Honeycomb Tracing, we used tracing to better understand Honeycomb's own query path. When doing this kind of investigation, you typically have...