Duolingo: Speaking the Language of Observability with Honeycomb 

In the world of digital language learning, Duolingo stands out as a beacon of innovation and user engagement. With millions of users worldwide, their platform is designed not only to teach languages, but also to create a fun and engaging learning experience. Running on the robust AWS cloud infrastructure, Duolingo manages vast amounts of data and user interactions daily.

As the company experienced rapid growth, Duolingo remained steadfast in their commitment to delivering a high-quality user experience. This dedication led to the launch of a reliability initiative, which included the formation of a specialized team focused on observability. The engineering team recognized that comprehensive observability was critical to their mission. 

To meet this commitment head-on, Duolingo adopted Honeycomb more than two years ago as the backbone of their observability strategy. This partnership has allowed the company to streamline their approach, leveraging Honeycomb’s unified insights to drive both engineering excellence and exceptional user experiences as the platform continues to grow.

Consolidating tools for unified observability

When Duolingo originally kicked off their 99.99% reliability initiative years ago, they faced significant challenges in their observability practices. The company relied on several solutions, including one primarily focused on logging, another dedicated to searches, and yet another for tracing requests across services. Each tool operated in isolation, which hindered their ability to effectively monitor and respond to system behavior.

As David Amin, Staff Site Reliability Engineer at Duolingo, aptly put it, “Our service graph looked like a plate of spaghetti where nobody really knew where their traffic was going. There were too many pieces and not enough insights. Without a cohesive view across all services, our engineering team was pivoting between different tools to troubleshoot issues and identify the contributing factors.” When incidents arose, the engineers had to piece together information from different sources, leading to delays in resolution. “We would have to dive into different platforms, trying to stitch together a narrative of what happened. It was a cumbersome process,” David reflected. 

Another noteworthy challenge was the financial burden of relying on multiple observability tools. Licensing fees and operational overhead for managing the solutions added up. Each tool not only came with its own set of expenses but also required personnel to manage and analyze data across platforms. 

Duolingo’s adoption of Honeycomb marked a significant turning point in their observability strategy. With Honeycomb, Duolingo streamlined their toolkit into a single, powerful observability solution, giving them end-to-end insights into their application codebase. As David explained, “With our Honeycomb adoption, we were able to fully sunset three tools and consolidate all of our logging.” By streamlining their observability practices, the team experienced firsthand how Honeycomb effectively answered many of the same questions posed by the previous tools, cutting through noise and amplifying valuable insights across their systems.

The transition to Honeycomb also led to significant cost savings. “By moving to Honeycomb and consolidating several tools, we saved 16% in total observability tooling costs and gained a substantially better user experience for our engineers in the process,” David noted. 

This financial victory underscored the value of Honeycomb’s unified observability approach, freeing the engineering team to focus on improving their core product offerings and enhancing their development practices.

Faster investigations—by orders of magnitude

When it comes to incident response, speed is everything. Duolingo recognized the immense value that Honeycomb brought with the platform’s ability to deliver real-time insights. Coupled with its responsiveness and efficiency, it empowered the engineering team to troubleshoot issues like never before.

One of the standout features of Honeycomb is its lightning-fast query engine, which delivers results without delay, regardless of the amount of data stored. As David noted, “I can run lots of queries very quickly. Honeycomb’s performance is phenomenal.” This rapid access to insights means that engineers can address potential issues before they impact customers or become real problems.

“We use BubbleUp a lot and are big fans. Everybody’s favorite is the incident response dashboard—it’s so fast and easy to investigate. More often than not, it’s the go-to resource that lets us click straight into the trace we need, and we have our answer,” David explained. This capability to visualize complex queries and data relationships has transformed their approach to incident management. The team can now quickly identify hidden patterns and anomalies, significantly accelerating the time it takes to resolve issues. 

Additionally, by implementing enhanced data tracing and monitoring techniques with Honeycomb, Duolingo streamlined their metrics. “We were able to cut down the number of time series by 30 million, especially the high-cardinality ones, thanks to the good trace data from Honeycomb.” This reduction in unnecessary metrics not only improved storage cost efficiency but also streamlined Duolingo’s monitoring efforts, helping the team concentrate on the data that truly matters.
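To see why high-cardinality metrics add up so quickly, it helps to look at the arithmetic. The sketch below is illustrative only: the label names and counts are hypothetical, not Duolingo’s actual metrics. It assumes the common model in which one metric produces, at most, one time series per unique combination of label values.

```python
from math import prod

def series_count(label_cardinalities):
    """Upper bound on time series for one metric: the product of the
    number of distinct values each label can take."""
    return prod(label_cardinalities.values())

# Hypothetical label sets, chosen only to show the multiplication effect.
low_cardinality = {"service": 200, "status_class": 5}
high_cardinality = {"service": 200, "status_class": 5, "user_bucket": 10_000}

low_count = series_count(low_cardinality)    # 200 * 5
high_count = series_count(high_cardinality)  # 200 * 5 * 10_000

print(low_count, high_count)
```

Adding a single high-cardinality label multiplies the series count, whereas the same field attached to a trace span is just one more attribute on an event. That is one plausible reading of how good trace data can replace large numbers of metric series.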

Enhancing release cycles with real-time feedback

Beyond incident response, Duolingo has harnessed Honeycomb’s capabilities for proactive observability. By incorporating Honeycomb into its build and release cycles, Duolingo has enhanced its deployment processes. With Honeycomb’s real-time visibility into system behavior, developers get rapid feedback loops on how their code performs—before, during, and after deployment—which is essential for informed, timely adjustments.

One of Honeycomb’s key advantages for Duolingo’s DevOps process is enabling the engineering team to understand exactly what their code is doing in real time, offering insights on latency, computational demand, and impact on user experience. “It’s been exciting to see our engineers come up with new ways to use the distributed tracing telemetry to deploy stable releases and analyze their impact. For example, by measuring latency, our team can determine if a new release branch affects application performance, cloud costs, or the learner’s experience,” shared David.
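As a rough illustration of the release comparison David describes, the sketch below compares 95th-percentile latency between two sets of span durations. It is not Duolingo’s tooling: the function names, the 10% tolerance, and the sample durations are all assumptions made for the example, and in practice the durations would come from trace data grouped by release version.

```python
from statistics import quantiles

def p95(durations_ms):
    """95th-percentile latency. quantiles(n=20) returns 19 cut points,
    and the last one marks the 95th percentile."""
    return quantiles(durations_ms, n=20)[-1]

def is_latency_regression(baseline_ms, candidate_ms, tolerance=0.10):
    """True if the candidate release's p95 exceeds the baseline's p95
    by more than the tolerance (10% by default, an arbitrary choice)."""
    return p95(candidate_ms) > p95(baseline_ms) * (1 + tolerance)

# Hypothetical span durations in milliseconds for two releases.
baseline = list(range(100, 200))
candidate = [d * 1.5 for d in baseline]

print(is_latency_regression(baseline, candidate))
```

Comparing a high percentile rather than the mean keeps the check sensitive to the slow requests that learners actually notice.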

These insights empower Duolingo to make data-driven decisions about feature releases, ensuring that each update enhances the user experience without incurring unnecessary costs or performance degradation. As a result, the team can deliver innovation at a faster pace, guided by insights that support long-term growth and stability.

Win-win with Honeycomb and OpenTelemetry

When Duolingo set out to enhance their observability, adopting OpenTelemetry alongside Honeycomb was a natural fit. OpenTelemetry generates high-quality, flexible telemetry data, and Honeycomb’s native support for it makes the two a winning combination. David explained, “We combined the onboarding of Honeycomb as our observability platform with the move to OpenTelemetry as a single project. With Honeycomb’s built-in support for OpenTelemetry, it made sense to do both at the same time, and for our engineering team, it delivered a ton of value all at once.”

David highlighted how OpenTelemetry’s auto-instrumentation significantly raised the bar for the team’s observability efforts. “OpenTelemetry’s auto-instrumentation was far and away better than what we had before. It allowed us to quickly establish strong telemetry coverage since full customization wasn’t necessary for our initial needs. That approach helped us get up and running very quickly with something that was pretty good,” he explained.

Honeycomb’s expertise in OpenTelemetry played a crucial role in Duolingo’s successful adoption, from initial setup to optimization. “We had people at Honeycomb to go to, always. They provided hands-on support, guiding us through the complexities of tracing, instrumentation, and defining a strategic sampling approach to reduce data noise and focus on critical insights,” David noted. “They helped us define a sampling strategy and understand what’s reasonable for head and tail sampling,” he added, which ensured that the data they kept was truly impactful.
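For readers unfamiliar with the two sampling approaches David mentions, here is a minimal sketch of the difference. It is a simplified illustration, not Duolingo’s or Honeycomb’s implementation: the span dictionary shape, the thresholds, and the function names are assumptions. Head sampling decides from the trace ID alone before any work is observed, while tail sampling decides once the whole trace is available.

```python
import hashlib

def head_sample(trace_id, sample_rate):
    """Head sampling: keep roughly 1 in sample_rate traces, decided from
    the trace ID alone so every service reaches the same verdict."""
    digest = hashlib.sha256(trace_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % sample_rate == 0

def tail_sample(spans, slow_ms=1000, baseline_rate=100):
    """Tail sampling: decide after the trace completes. Keep every trace
    with an error or a slow span, and a hashed fraction of the rest."""
    if any(span["error"] for span in spans):
        return True
    if max(span["duration_ms"] for span in spans) >= slow_ms:
        return True
    return head_sample(spans[0]["trace_id"], baseline_rate)

# Hypothetical traces, each a list of span dictionaries.
failed = [{"trace_id": "t1", "duration_ms": 5, "error": True}]
slow = [{"trace_id": "t2", "duration_ms": 2500, "error": False}]

print(tail_sample(failed), tail_sample(slow))
```

The trade-off is that head sampling is cheap but blind to outcomes, while tail sampling keeps the interesting traces at the cost of buffering spans until the trace finishes.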

With this combined support and technology, Duolingo’s engineers could dive deeper into application performance, confidently deploy stable releases, and leverage real-time insights to fine-tune their systems, knowing they were backed by Honeycomb’s expertise and robust OpenTelemetry integration.

David also praised the collaborative support from Honeycomb’s account managers and the wider user community known as the Pollinators. “Our account team has been phenomenal. Whenever we had questions, they were always ready to help. Plus, having access to the Pollinators community made it easy to connect with others, fostering a sense of collaboration and shared learning,” he said. This combination of expert guidance and community engagement has empowered Duolingo to effectively leverage both Honeycomb and OpenTelemetry, enhancing their observability and driving their engineering success.

Deepening team collaboration

One notable outcome of adopting Honeycomb has been the enhanced collaboration among the engineering groups at Duolingo. Team members can easily access insights and metrics, fostering productive discussions across various projects. “Now, when engineers talk to each other, everyone is on the same page. We’re drawing from the same Honeycomb data and speaking the same language, which simplifies collaboration and drives improvements,” David shared.

This cultural shift towards data-driven conversations has not only streamlined communication but also empowered teams to tackle challenges collectively. As David pointed out, “We’re not all experts on every service—we have hundreds of them. Honeycomb helps us identify who we need to talk to and makes it easier to investigate and resolve issues.”

The ability to leverage Honeycomb’s robust observability features has notably enhanced how teams interact. “For instance, when a product developer discovers an infrastructure dependency that’s not performing as expected, having Honeycomb’s trace data at our fingertips allows for more objective conversations about issues. This access to data eliminates ambiguity, and helps us align on the underlying problem and coordinate on the necessary solution,” David shared. 

By bridging the gap between product-facing engineers and platform teams, Honeycomb has become an invaluable resource for helping to cultivate a culture of shared ownership and accountability, where every team member feels invested in the overall reliability and performance of Duolingo’s systems.

Looking ahead to Honeycomb SLOs

Looking to the future, Duolingo is excited about expanding their use of Honeycomb Service Level Objectives (SLOs) to monitor key metrics that directly impact user experience. “We want to dig in with Honeycomb SLOs to ensure that the features our customers care about—like their cherished streak that tracks their uninterrupted days of language training—are consistently reliable.” By establishing Honeycomb SLOs that align with user needs, Duolingo is committed to upholding its reputation for delivering exceptional experiences while continually innovating within the language learning space.
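To make the idea of an SLO concrete, the sketch below works through the error-budget arithmetic for a 99.99% target, the figure named in Duolingo’s reliability initiative. The event counts are invented for illustration, and the sketch assumes an event-based SLO in which each qualifying request either succeeds or fails. Exact fractions are used so the results are not blurred by floating-point rounding.

```python
from fractions import Fraction

def allowed_failures(target_percent, total_events):
    """Number of failed events the SLO tolerates over the period.
    target_percent is a string such as "99.99" to keep the math exact."""
    target = Fraction(target_percent) / 100
    return total_events * (1 - target)

def budget_remaining(target_percent, total_events, failed_events):
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed = allowed_failures(target_percent, total_events)
    return 1 - Fraction(failed_events) / allowed

# Hypothetical month: 1,000,000 streak-update requests, 25 of them failed.
print(allowed_failures("99.99", 1_000_000))               # 100
print(float(budget_remaining("99.99", 1_000_000, 25)))    # 0.75
```

At 99.99%, only one request in ten thousand may fail, which is why a team would want the budget tied to the specific features learners care about rather than to system-wide averages.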

As Duolingo looks to the future, harnessing Honeycomb’s advanced observability capabilities alongside a steadfast commitment to creating delightful user experiences will play a pivotal role in shaping their next chapter of growth and innovation. This synergy will empower Duolingo to not only meet but exceed user expectations, ensuring that their language-learning platform continues to evolve in exciting ways. With a clear focus on enhancing the user journey, Duolingo is well-positioned to lead the way in delivering impactful, engaging, and memorable learning experiences.



At a glance

About

Duolingo is the leading mobile learning platform globally. Its flagship app has organically become the world’s most popular way to learn languages and the top-grossing app in the Education category on both Google Play and the Apple App Store. With technology at the core of everything it does, Duolingo has consistently invested to provide learners a fun, engaging, and effective learning experience while remaining committed to its mission to develop the best education in the world and make it universally available.

Industry

Digital education

Products

Honeycomb platform

Use cases

Incident debugging 

Application performance monitoring

Results

  • Retired three siloed tools, empowered by Honeycomb’s unified observability
  • Saved 16% in total observability tooling costs through tool consolidation
  • Accelerated investigation times by orders of magnitude, enabling quicker resolution of issues and enhancing overall efficiency
  • Reduced the number of time series by 30 million, simplifying data management and analysis
  • Fostered collaboration among engineering groups supported by data-driven discussions 
  • Successfully adopted OpenTelemetry with dedicated, white-glove support from Honeycomb experts

Brian Chang

Senior Customer Marketing Manager

Brian Chang is a Senior Customer Marketing Manager with over 12 years of hands-on customer and product marketing experience. He previously worked at Pivotal / VMware and Red Hat in this capacity, where he built and launched two successful customer advocacy programs that are still active and profitable today. Before that, Brian worked at Snapfish, Sun Micro, AT&T Broadband, and CNN.
