Kubernetes is the gold standard for container orchestration at scale. While massive global companies like Google, Spotify, and Pinterest rely on Kubernetes to run their software in production, so do many small but mighty developer teams. (Full disclosure: Honeycomb joined the Kubernetes brigade last year, when we migrated some of our services.)
Kubernetes helps teams of all sizes run a microservices architecture by automating containerized app deployment, simplifying scaling, and streamlining operations. But Kubernetes also has a reputation for being difficult to learn and complex to manage, and when you’re new to something, it’s hard to know what you don’t know. That’s where Honeycomb observability comes in: distributed traces provide real-time visibility into container operations, so teams can tune performance and find and fix bugs faster, paving the way to a successful delivery.
But don’t take our word for it. We recently sat down for a technical session with Jeff Zellner, Director of Engineering, and Gonzalo Maldonado, Staff Engineer, at FireHydrant. They shared how they used Honeycomb to help make the move to Kubernetes and how they rely on observability to ship stable new features customers love.
From complexity to clarity with Honeycomb observability
FireHydrant (the platform) manages the process of incident detection, response, mitigation, and postmortems. FireHydrant (the company) has a relatively small engineering team charged with managing a complex technical stack made up of lots of Rails and a variety of services, some hosted and some not. “Honeycomb’s a great fit for us because it allows us to trace our way through all those various components all talking to each other,” said Jeff.
Orchestrating the move to Kubernetes
FireHydrant implemented Honeycomb as they began migrating to Kubernetes. For FireHydrant, the move to Kubernetes was mostly about scalability: being able to respond to the needs of customers, many of whom are big retailers that see spikes in demand leading up to Black Friday. In addition, as FireHydrant’s app grew over time, the team adopted more microservices, and it needed a way to orchestrate them.
“When you make the shift from more traditional architectures to container orchestration or whatever comes next, you really need good observability because the knobs that you get to change your infrastructure and the performance of your infrastructure are totally different. And so the way that you’re used to operating changes, and observability gives you the ability to know what to change and to actually watch those changes happen,” Jeff explained. “I wouldn’t make a transition like that without Honeycomb.”
For Gonzalo, Honeycomb observability was key to finding potential problems lurking in the new system before they impacted performance—and before his team even knew enough about the system to know where to look. “Honeycomb is good at unknown-unknowns, which felt essential during the migration when we didn’t know what failures to expect in our system,” he shared.
Squashing bugs with distributed tracing
Using Honeycomb’s trace view, FireHydrant relies on distributed tracing to follow each request as it flows through the system, something that is hard to do with logs or metrics alone. Because logs and metrics on their own lack request-level context, they make it difficult to pinpoint potential issues, isolate specific problems, or establish causality.
“When it comes to debugging, tracing gives us the ability to find a needle in a haystack. Otherwise, we’d be poring through lots of logs and looking at lots of different places for lots of different things,” Jeff said.
FYI, tracing is about more than debugging
FireHydrant also uses tracing for better visibility and to discover patterns before they lead to problems. “Tracing gives us a lot of extra dimensions,” Gonzalo said. “With metrics and logs, it’s not that obvious to see the timing of things. It’s not obvious to see, ‘Hey, this process that we’re executing, it spans three different things.’ With a trace, it’s super clear.”
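To make Gonzalo's point about timing concrete, here is a minimal sketch of what a trace holds: a set of spans, each with a parent, a service, a start offset, and a duration. The `Span` class, the service names, and the numbers are all hypothetical and for illustration only; this is not Honeycomb's data model or FireHydrant's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One timed unit of work inside a trace (hypothetical structure)."""
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    service: str
    name: str
    start_ms: float
    duration_ms: float


def services_in_trace(spans):
    """Return the distinct services a trace touched, ordered by first start time."""
    seen = []
    for span in sorted(spans, key=lambda s: s.start_ms):
        if span.service not in seen:
            seen.append(span.service)
    return seen


def render_waterfall(spans):
    """Render the span tree as indented lines showing start offset and duration."""
    children = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)
    lines = []

    def walk(parent_id, depth):
        for span in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
            indent = "  " * depth
            lines.append(
                f"{indent}{span.service}:{span.name} "
                f"+{span.start_ms:g}ms {span.duration_ms:g}ms"
            )
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


# Illustrative data only: one request that crosses three made-up services.
trace = [
    Span("a", None, "web", "POST /incidents", 0, 120),
    Span("b", "a", "worker", "enqueue_notification", 10, 30),
    Span("c", "a", "database", "INSERT incident", 45, 70),
]

for line in render_waterfall(trace):
    print(line)
```

Because every span carries its parent and its timing, the question "does this process span three different things?" becomes a lookup (`services_in_trace(trace)`) rather than a search across separate log streams.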
Unlike logs or metrics, which require serious people power to parse, Honeycomb distributed tracing is automatic, speeding the path from data to insight. “With Honeycomb observability, you can detect things without having to explicitly set it up. With logs, it was a very manual process where we had to instrument all along the way. With tracing, we can see what actually runs, which gives us a better sense of visibility and allows us to find emerging issues,” said Gonzalo.
At one point, FireHydrant tried traditional monitoring with Datadog. While the team thought Datadog had a good system for detecting anomalies and creating correlations, they found it less powerful than a tool like Honeycomb that uses tracing natively.
“Now, whenever we have an incident, the first thing we do is grab Honeycomb and say, ‘Okay, let’s try to find the resource and just filter by that resource to spot the issue.’ And that’s how we’re able to identify things so quickly,” Gonzalo said.
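The "filter by that resource" step Gonzalo describes can be sketched in a few lines. The example below assumes each request is recorded as an event with `resource`, `status`, and `duration_ms` fields; those field names and the sample values are hypothetical, and the code stands in for a query you would normally run in a tool like Honeycomb rather than reproducing its API.

```python
from statistics import median

# Illustrative events only: field names and values are made up.
events = [
    {"resource": "incidents", "status": 200, "duration_ms": 40},
    {"resource": "incidents", "status": 500, "duration_ms": 900},
    {"resource": "incidents", "status": 200, "duration_ms": 60},
    {"resource": "runbooks", "status": 200, "duration_ms": 35},
    {"resource": "runbooks", "status": 200, "duration_ms": 45},
]


def filter_by(events, field, value):
    """Keep only the events whose `field` equals `value`."""
    return [e for e in events if e.get(field) == value]


def summarize(events):
    """Summarize a set of events: count, share of 5xx responses, median latency."""
    total = len(events)
    if total == 0:
        return {"count": 0, "error_rate": 0.0, "median_ms": None}
    errors = sum(1 for e in events if e["status"] >= 500)
    return {
        "count": total,
        "error_rate": errors / total,
        "median_ms": median([e["duration_ms"] for e in events]),
    }


# Narrow the data to one resource, then compare it with another.
for resource in ("incidents", "runbooks"):
    print(resource, summarize(filter_by(events, "resource", resource)))
```

Comparing the summaries side by side is what makes the odd one out easy to spot: in this made-up data, only the `incidents` resource shows server errors.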
Taking the stress out of shipping new features
As FireHydrant grows, it’s committed to delivering new features that help customers manage the incident process better. For example, the team recently launched automatic incident declaration to save engineers from the burden of manually declaring that an incident is, in fact, an incident.
FireHydrant uses Honeycomb observability to help deploy those new features faster and with higher quality. Before Honeycomb, FireHydrant built a feature, evolved it, and only then identified issues, which caused problems for customers. With Honeycomb, FireHydrant tests while it’s building. “Learning these tools makes us better,” said Gonzalo.
Jeff added, “Having Honeycomb allows us to make bold changes quickly. Now we can say, ‘Hey, we’re going to create this. It’s going to have its issues, but we’ll polish it.’ And once we feel comfortable in our environment, then we can ship it to customers.”
Culture change from the bottom up
Migrations of any kind, whether to new platforms like Kubernetes or new tools like Honeycomb observability, require a commitment to cultural change. FireHydrant was careful to introduce Honeycomb as a tool for unlocking new opportunities rather than as an answer to a problem the team might not even think it has. “The way we’re trying to approach this is from the bottom up. We want to say, ‘Hey, we’re serving something for you, like a sandbox where you can debug things. We’re not giving you solutions you didn’t ask for,’” Gonzalo explained. “And now our engineers are really happy because they can do things they couldn’t do before.”
Go SLO to go far
Going forward, FireHydrant plans to expand observability into the realm of user journey service level objectives (SLOs). “We want to have a way to measure performance and reliability, and Honeycomb SLOs can help us get there,” said Jeff.
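For readers new to SLOs, the arithmetic behind one is simple. The sketch below shows the general idea of compliance and error budget for a request-based SLO; the function name, the 99% target, and the request counts are assumptions chosen for illustration, not FireHydrant's numbers or Honeycomb's implementation.

```python
def error_budget(target, total_events, good_events):
    """Compute compliance and remaining error budget for a request-based SLO.

    target: fraction of events that must be good, e.g. 0.99 for 99%.
    total_events: number of eligible events in the SLO window.
    good_events: number of those events that met the success criteria.
    """
    if not 0 < target < 1:
        raise ValueError("target must be between 0 and 1, exclusive")
    if total_events <= 0 or not 0 <= good_events <= total_events:
        raise ValueError("event counts are inconsistent")

    allowed_bad = (1 - target) * total_events  # the error budget, in events
    actual_bad = total_events - good_events
    return {
        "compliance": good_events / total_events,
        # 1.0 means the budget is untouched; 0 or below means it is spent.
        "budget_remaining": 1 - actual_bad / allowed_bad,
    }


# Hypothetical window: 10,000 requests, 9,950 of them good, 99% target.
result = error_budget(target=0.99, total_events=10_000, good_events=9_950)
print(result)
```

In this made-up window, 50 bad requests against an allowance of roughly 100 leaves about half the budget, which is the kind of signal a team can use to decide whether it is safe to keep shipping or time to focus on reliability.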
Whatever comes next for FireHydrant, Honeycomb will be part of it. As Gonzalo put it, “Tools like Honeycomb allow us to move into new worlds so much quicker.”
Want to watch the webinar or learn more about observability, SLOs, and all things Honeycomb? Check out our blog. If you want to give Honeycomb a try, sign up to get started.