Making Instrumentation Extensible
Observability-driven development requires both rich query capabilities and sufficient instrumentation to capture the nuances of developers' intent and useful dimensions of cardinality. When...
Reflections on Monitorama 2019
This year was my third in a row attending (and now speaking at!) Monitorama. Because the organizers do a great job of turning introverts into...
Investigating Timeouts With Tracing Using Sentry
Tracing is one of the key tools that Honeycomb offers to make sense of data. Over the last few weeks, we've made a number of...
Toward a Maturity Model for Observability
Access to observability is becoming critical to organizations shipping software and running modern infrastructure in production, and to understanding how users are experiencing their service. To...
Automating Collection of Troubleshooting Data with Triggers: a How-To Guide
Everyone wants to be more efficient -- to spend less time on the tedious things, and more time on the things that move the needle....
Welcome (to) Home
Our latest product update features an intuitive home (landing) page that orients users with a quick, real-time view into what's happening right now in your...
Honeycomb’s New APM Capabilities Give Engineering & DevOps Superior Production Insights and Faster Incident Resolution
PRESS RELEASE (read on PRNewswire): Modern Dev and SRE teams gain efficiencies for proactive and collaborative debugging as software updates deploy. SAN FRANCISCO, May 29,...
Dynamic Sampling by Example
Last week, Rachel published a guide describing the advantages of dynamic sampling. In it, we discussed varying sample rates to achieve a target collection rate...
Stop Your Database From Hating You With This One Weird Trick
Let's not bury the lede here: we use Observability-Driven Development at Honeycomb to identify and prevent DB load issues. Like every online service, we experience...
The New Rules of Sampling
One of the most common questions we get at Honeycomb is about how to control costs while still achieving the level of observability needed to...
Anatomy of a Cascading Failure
In "Caches Are Good, Except When They Are Bad," we identified four separate problems that combined to cause a cascading failure in our API...
When In Doubt, Add More Spans: A Tale of Tracing and Testing In Production
Recently, Toshok was telling a story about the kind of thing he talks about a lot—improving the performance of some endpoint or page or other....
Incident Review: Caches are Good, Except When They Are Bad
Between Wednesday, April 17th and Friday, April 26th, Honeycomb had four separate periods of downtime affecting the Honeycomb API, resulting in approximately 38 minutes of...
Metrics vs Events: A Conversation About Controlling Volume
If I'm used to metrics, how should I think about events in Honeycomb? This question cuts to the heart of how Honeycomb is different from...
A New Bee's First Oncall
I'm Honeycomb's newest engineer, now on my eighth week at Honeycomb. Excitingly, I did my first week of oncall two weeks ago! Almost every engineer...