How Support Uses Honeycomb to Debug Honeycomb
Our support team leverages our own tools, Canvas and the Honeycomb MCP, to navigate vast and complex internal telemetry when debugging customer issues. This shift means we spend far less time on query mechanics and detective work, and customers get answers faster.

By: Sara Cave

You’d think that working at an observability company means everyone knows exactly where to find everything in the data. It doesn’t. And yet, as support engineers, we’re in the telemetry every day finding answers to customer questions. We do that by pointing Honeycomb at itself. This post explains how that actually works, and how it has changed with the advent of Canvas and the MCP.
Everyone else’s telemetry
If you’re on a development team, you generally know your part of the system. You named the fields, you know the datasets, you’ve got a feel for where to look when something breaks.
Support is different—we’re on the front lines, and a ticket could be about literally anything and may not be very detailed. For example, someone sends a screenshot of a graph that looks off. Or they might write, “Something seems wrong with my queries,” and that’s the entire bug report.
While we definitely ask clarifying questions, it saves us a lot of time if we can take the initial information and investigate right away, even while waiting for more context from the customer.
On top of this, Honeycomb’s internal telemetry has many datasets. The system has been around for years, and naming isn’t always consistent between contexts. When we look up a team ID, it could be app.team_id, or app.team.id, or team-id. All three exist as separate fields. Multiply that across a decade of telemetry decisions, and it adds up. Each one was probably fine at the time, but when your job requires navigating all of it (not just your team’s slice), it gets heavy.
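To make the naming problem concrete, here is a minimal sketch of what "look up the team ID" can turn into when the same value lives under several field names. The three field names come from the examples above; the sample events, the `duration_ms` field, and the `team_id` helper are made up for illustration and are not how Honeycomb resolves fields internally.

```python
from typing import Any, Mapping, Optional

# Field names from the examples in this post. Everything else is hypothetical.
TEAM_ID_FIELDS = ("app.team_id", "app.team.id", "team-id")


def team_id(event: Mapping[str, Any]) -> Optional[str]:
    """Return the first team ID found under any of the known field names."""
    for field in TEAM_ID_FIELDS:
        value = event.get(field)
        if value is not None:
            # Normalize to a string so 42 and "42" compare equal downstream.
            return str(value)
    return None


# Made-up events, each using a different naming convention.
events = [
    {"app.team_id": 42, "duration_ms": 12},
    {"app.team.id": "42", "duration_ms": 30},
    {"team-id": 7, "duration_ms": 5},
    {"duration_ms": 9},  # no team ID recorded at all
]

ids = [team_id(e) for e in events]
print(ids)  # ['42', '42', '7', None]
```

Every investigation that touches more than one dataset needs some version of this mapping, whether it lives in code, on a shared board, or in a teammate's memory.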
Before Canvas, the blank page problem was constant. We’d open the query builder and not know which dataset to even start with, so we’d ping a teammate. Sometimes they’d send over a query link. Sometimes they’d walk through a query they remembered using for a similar issue or question. We’d even built shared boards so people had somewhere to start. Query URLs flew back and forth in Slack all day, and it worked well enough, but we were spending a lot of our time just figuring out the mechanics of how to query before we could even get to the why.
Canvas killed the blank page
Canvas has sped this up. We no longer need to know the exact dataset or remember the right column name. In Canvas, we describe what we’re looking for, and Canvas builds the query, runs it, and returns a visualization that we can refine as needed.
Need to BubbleUp a time window where things looked bad compared to when they were healthy? Just ask. Canvas handles the query construction.
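For readers who haven't used it, the idea behind comparing a bad time window to a healthy one can be sketched in a few lines. This is a toy illustration under stated assumptions, not how BubbleUp is implemented: the events, the `region` field, and both helper functions are hypothetical. It only shows the general approach of asking which attribute value is most over- or under-represented in the selection relative to the baseline.

```python
from collections import Counter


def value_shares(events, field):
    """Fraction of events carrying each value of `field`."""
    counts = Counter(e.get(field) for e in events)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def biggest_shift(selection, baseline, field):
    """Return (value, share difference) for the value whose share changed most."""
    sel = value_shares(selection, field)
    base = value_shares(baseline, field)
    diffs = {v: sel.get(v, 0.0) - base.get(v, 0.0) for v in set(sel) | set(base)}
    return max(diffs.items(), key=lambda kv: abs(kv[1]))


# Made-up data: a healthy baseline and a "things looked bad" selection.
baseline = (
    [{"region": "us-east"}] * 4
    + [{"region": "eu-west"}] * 4
    + [{"region": "ap-south"}] * 2
)
selection = [{"region": "eu-west"}] * 4 + [{"region": "us-east"}] * 1

value, shift = biggest_shift(selection, baseline, "region")
print(value)  # eu-west
```

In this made-up data, `eu-west` accounts for a much larger share of the bad window than of the healthy one, which is the kind of lead we would follow next.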
We’re still the ones driving things. We still decide which thread to pull, what to rule out, and where to look next. But all the energy that used to go into getting the query syntax right now goes into the actual investigation instead. We can run three or four queries across different datasets and follow where the data takes us, rather than spending all our effort on getting one query to work.
The research stack
Canvas works well when we’re already inside Honeycomb. But tickets don’t just need telemetry. There’s a whole research layer around every investigation.
This is where the MCP comes in. MCP (Model Context Protocol) lets an AI assistant pull context from Honeycomb and the other tools we use during an investigation. The interesting part isn’t the Honeycomb MCP by itself, though. It’s what happens when we wire up a bunch of them together.
Say a customer writes in about something weird with their triggers. We need to figure out if anyone internally has seen this before, whether engineering already knows about it, and how the feature actually works under the hood. That used to mean a lot of tabs: Slack open in one, Linear in another, the codebase somewhere, our docs somewhere else. Maybe Snowflake, too, if the data lives outside telemetry. We’d copy things between them and try to hold the whole thread in our head while jumping around.
Now the entire loop happens in one place. We start with, “What is this customer describing?” and end up at “OK, here’s the code path, engineering already has a Linear ticket, and the telemetry confirms it” without ever losing our train of thought.
What escalations look like now
In the past, when customers reported an issue that proved difficult to diagnose upfront, we often looped in engineers early just to determine whether to escalate and, if so, what information to share. It was a time-consuming process, and we all know that the faster a ticket is resolved, the better the customer experience.
Now, with the MCP and Canvas, we can do most of that investigation before engineering ever gets involved. On one such ticket, we searched the codebase, found the root cause, confirmed the customer’s data was fine, and even found a previous fix for the same pattern. By the time we escalated, engineering had everything they needed to go straight to a fix. No reproducing, no back and forth, no “Which dataset is this in?”
That’s the part that changed the most. Escalations used to be the beginning of an investigation. Now, they’re closer to the end of one. The customer gets an answer faster, engineering spends less time on the detective work and receives fewer escalations, and support engineers are empowered. Everyone wins.
This isn’t just a support story
If you’ve ever had to debug a service you didn’t build, or investigate an alert for a system you’re still learning, you’ve hit the same wall we did. Canvas and the MCP aren’t just for support teams. They work for anyone who needs to investigate something in unfamiliar territory. The real win is that you can start doing useful work before you’ve memorized the whole system.