Observability and monitoring are not about gathering different data: they share the same data but differ in their purpose.
Monitoring is focused on notification based on predefined questions, whether through dashboards people watch or push-based alerts sent to notification channels like SMS or purpose-built platforms like PagerDuty. Observability, on the other hand, is focused on understanding internal system state, with the goal of debugging and effecting change in the system, either by presenting correlated information or through exploratory data analysis.
Both of these concepts can, and mostly do, use the same data. We can use logs, metrics, and traces for either monitoring or observability, but just “having” the data, or generating it, gives you neither.
Most legacy tooling focuses on dumping all the existing signal data into a platform in three separate silos (the “pillars”), in the hope that you can build alerts and then somehow navigate your way to answers later.
When we talk about observability 2.0, or “modern” observability, we’re talking about making data context-rich before it goes into your backend, then deriving both monitoring and observability from that rich data inside the tooling. That, in my opinion, is what defines observability 2.0 tools: they focus on giving you access to that rich data at every point, for the purpose of debugging once something has happened.
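To make that concrete, here’s a minimal sketch of what “context-rich before it goes into your backend” can look like, using the OpenTelemetry Python API. The service name, the Cart type, and the app.* attribute keys are my own illustrative choices, not anything prescribed here; the point is that one wide, attribute-rich event leaves the process, and both alerts and ad-hoc queries can be derived from it later.

```python
# A minimal sketch (not a prescribed implementation) of emitting one
# wide, context-rich event per unit of work with the OpenTelemetry
# Python API. The service name, Cart type, and app.* attribute keys
# are illustrative assumptions.
from dataclasses import dataclass

from opentelemetry import trace

tracer = trace.get_tracer("checkout-service")  # hypothetical service name


@dataclass
class Cart:
    user_id: str
    plan: str
    item_count: int
    total_usd: float


def handle_checkout(cart: Cart) -> None:
    # Enrich the event *before* it leaves the process; the backend can
    # then derive both alerts (monitoring) and ad-hoc queries
    # (observability) from the same rich data.
    with tracer.start_as_current_span("handle_checkout") as span:
        span.set_attribute("app.user.id", cart.user_id)
        span.set_attribute("app.user.plan", cart.plan)
        span.set_attribute("app.cart.item_count", cart.item_count)
        span.set_attribute("app.cart.value_usd", cart.total_usd)
        # ... business logic goes here ...


handle_checkout(Cart("u-123", "pro", 3, 42.50))
```

With an exporter configured, that single span carries everything needed both to alert on checkout failures and to later slice by plan, cart size, or any other attribute you thought to record.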
I see a ton of takes framing observability 2.0 as being about AI, auto-instrumentation, or even eBPF, but those still sit firmly in the monitoring space. I’m all for monitoring 2.0 being about using AI to analyze data and generate alerts based on common, industry-wide patterns. I’m 100% behind using automatic instrumentation to gather common data that enables advanced monitoring flows. My issue is that those aren’t “debugging and understanding” use cases.
We need both observability and monitoring: we need to be notified when a system isn’t doing what we expect. We don’t pay SREs to watch TV; we pay them to help us keep systems reliable. They know which characteristics of a system should, when triggered, put them into investigation mode. And we need rich data to be able to dig in and ask new questions.
The trick that makes observability 2.0 seem almost like magic is that it uses a single source of truth, not a single pane of glass. It uses the same data source whenever it queries the same information. If that’s metrics (which it generally is for infrastructure), great: use that. If that’s structured logs or spans (which it generally is for application information), use that. Don’t capture the data twice and then jump to another signal when you want to go deeper. Instead, we correlate those sources of truth to build a bigger picture. That’s where OpenTelemetry comes in, with its methods for providing context to each type of signal data.
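To show what that correlation looks like in practice, here’s a minimal sketch using the OpenTelemetry Python SDK plus stdlib logging. The “payments” service name and the log message are illustrative assumptions; the mechanism (stamping the active trace context onto other signals) is the real OpenTelemetry one.

```python
# A minimal sketch of signal correlation: the same W3C trace context
# that identifies a span gets stamped onto a structured log record.
# The "payments" service name and the log message are illustrative.
import logging

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider

trace.set_tracer_provider(TracerProvider())  # real IDs need a configured SDK
tracer = trace.get_tracer("payments")  # hypothetical service name

logging.basicConfig(
    level=logging.INFO,
    format="%(levelname)s %(name)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s",
)
log = logging.getLogger("payments")

with tracer.start_as_current_span("charge_card") as span:
    ctx = span.get_span_context()
    # Stamp the active trace context onto the log record, so a log line
    # found during an investigation links straight back to its span
    # without re-querying a separate silo.
    log.info(
        "charge failed",
        extra={
            "trace_id": format(ctx.trace_id, "032x"),
            "span_id": format(ctx.span_id, "016x"),
        },
    )
```

The same trace context is what the metrics side of OpenTelemetry is designed to carry as exemplars, which is how each signal stays its own source of truth while still linking into the bigger picture.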
It’s all about how we use the data. Remember that. Everything else is implementation details.