Imagine a universe in which a massively multiplayer online role-playing game (MMORPG) sets Guinness World Records for the size of its online space battles, and that game is built on 20-year-old code. Well, imagine no more. Welcome to the world of EVE Online, where hundreds of thousands of players interact across 7,800+ star systems and participate in more than one million daily market transactions. As you might guess, updating and maintaining this codebase without interrupting gameplay could pose quite a challenge.
Nick Herring, Technical Director of Infrastructure at CCP Games—the company behind EVE Online—joined us for a technical session. He shared how Honeycomb observability is critical to EVE Online’s continued evolution, their migration from monolith to microservices with Quasar, and from on-premises to cloud.
Finding needles in the monolith haystack
“In the beginning, EVE Online was a Microsoft SQL Server, what we called a sol node, which is the location node that deals with the actual simulation,” Nick explained. “Then there were proxy nodes, various other versions of sol nodes and other services, and proxies that clients connected to.”
From there, EVE Online evolved to include OAuth2, federated logins, and a vast ecosystem of APIs used to connect the mechanisms that manage player-created alliances, corporations, and killboards to the game. “The way we built up these pieces was more around the predictability of the topology. That’s how the original network ecosystem evolved itself,” said Nick. This network shaped EVE Online to function more like a traditional application, with most API interactions based on requests and responses, instead of streaming.
Observability in those days involved using Prometheus to monitor EVE Online’s Stackless Python architecture. “Devs would basically pry into the error processing and say, ‘This database call is taking too long,’ or ‘This certain Python function is taking too long.’ If you were lucky, you could figure out where that was from. There was a lot of institutional knowledge around how to read those chicken bones,” said Nick.
That’s when CCP Games decided to bring in Honeycomb for its high-cardinality observability that enables granularity at scale. “We applied Honeycomb and right away we could see what was going on,” said Nick. “Instead of just wondering why a call was taking 30 seconds, the team could find the answer with Honeycomb.”
Honeycomb’s deeper level of tracing enabled the team to understand how quickly messages were being processed and where they were getting stuck. “Before, we basically got into a neighborhood—but tracing allowed us to pick a house and a room,” said Nick.
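To make the "house and room" idea concrete, here is a minimal sketch of how span data can narrow a slow request down to one operation. It is not CCP Games' or Honeycomb's implementation: the `Span` type, the span names, and the durations are all hypothetical, and the calculation assumes child spans run one after another rather than concurrently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    """One timed unit of work in a trace (hypothetical, simplified shape)."""
    span_id: str
    parent_id: Optional[str]
    name: str
    duration_ms: float


def self_times(spans: List[Span]) -> Dict[str, float]:
    """Time spent in each span itself, excluding its direct children.

    Assumes children execute sequentially, so their durations can be summed.
    """
    child_total: Dict[str, float] = {}
    for s in spans:
        if s.parent_id is not None:
            child_total[s.parent_id] = child_total.get(s.parent_id, 0.0) + s.duration_ms
    return {s.span_id: s.duration_ms - child_total.get(s.span_id, 0.0) for s in spans}


def slowest_span(spans: List[Span]) -> Span:
    """Return the span with the largest self time."""
    times = self_times(spans)
    return max(spans, key=lambda s: times[s.span_id])


# Illustrative trace for a single 30-second request; all values are made up.
trace = [
    Span("a", None, "handle_request", 30000.0),
    Span("b", "a", "load_inventory", 29500.0),
    Span("c", "b", "db.query", 29000.0),
    Span("d", "a", "serialize_response", 200.0),
]

culprit = slowest_span(trace)
print(culprit.name)
```

The root span only says the request was slow (the neighborhood); subtracting child durations at each level points at the specific call where the time went (the room).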
Adventures in microservices
The next chapter in EVE Online’s evolution was a move from monolith to microservices with the creation of Quasar. “Ultimately what we call Quasar is a collection of different technologies. The beginning of this is when we removed the XML and hypermedia RESTful APIs and replaced them with Swagger spec and OpenAPI-based,” explained Nick. “We landed on, ‘We need to do something where we have more control over how we’re deploying, how we control traffic, what resources we use, and how quickly we can react.’” Using RabbitMQ for communications between microservices and gRPC as a framework for building APIs, Quasar offers a more flexible mechanism for communicating that enables easier expansion of the player ecosystem.
Not surprisingly, that expansion led to the need for more traffic management. “It got to the point where we couldn’t possibly reason or see what we needed to see to understand the ingress into the system and the effect it was having,” Nick said. “We realized we can’t use a counter and Prometheus for this because our message types keep growing and growing and growing—and that just falls over almost immediately.”
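A rough sketch of why a labeled counter struggles as message types keep growing: in a label-based metrics store, each distinct combination of label values becomes its own time series. The label names and sizes below are hypothetical, and the product is an upper bound rather than a measurement of any real system.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def max_series(label_values: Dict[str, List[str]]) -> int:
    """Upper bound on time series for one counter: the product of the
    number of distinct values per label."""
    total = 1
    for values in label_values.values():
        total *= len(values)
    return total


def count_by(events: Iterable[dict], *keys: str) -> Counter:
    """Group raw events by any attributes chosen at query time."""
    return Counter(tuple(e[k] for k in keys) for e in events)


# Hypothetical label sets for a single message counter.
labels = {
    "message_type": [f"type_{i}" for i in range(500)],
    "service": [f"svc_{i}" for i in range(40)],
    "status": ["ok", "error", "timeout"],
}
print(max_series(labels))

# Adding 100 more message types multiplies through every other label.
labels["message_type"] += [f"type_{i}" for i in range(500, 600)]
print(max_series(labels))

# By contrast, storing one event per message defers grouping to query time.
events = [
    {"message_type": "type_1", "service": "svc_0", "status": "ok"},
    {"message_type": "type_1", "service": "svc_0", "status": "error"},
    {"message_type": "type_2", "service": "svc_3", "status": "ok"},
]
by_type: Dict[Tuple[str, ...], int] = dict(count_by(events, "message_type"))
print(by_type)
```

With event-style storage, a new message type is just a new attribute value on a record, not a new set of pre-aggregated series to maintain.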
Honeycomb’s tools make modern systems built on microservices easier to observe because its datastore and query engine enable users to detect patterns across billions of requests in under three seconds. That holds even for highly unique, granular data, where problems can lurk behind any arbitrary combination of attributes. “When you get your brain around tracing, it all works out once you follow some simple rules. It makes complex systems straightforward very quickly,” Nick noted.
Honeycomb makes cloud deployments clear
Along with the move from monolith to microservices, EVE Online moved from on-premises infrastructure and Google Pub/Sub to Amazon Web Services (AWS). The ability to use Honeycomb tracing to visualize both issues and microservices boundaries during a cloud migration was instrumental in helping the team iterate quickly and deploy new features more gracefully. “Having people understand that they can deploy now is huge to what we’re doing,” shared Nick.
Change the culture, change the future
Of course, successful migrations are also about culture change. Honeycomb has played a huge role in shifting CCP Games’ dev culture. For example, with Quasar, fixes can now be deployed in 30 minutes rather than 24 or more hours. “When you can fix things quickly, discovering them quickly becomes vital to that feedback loop,” said Nick. This is where Honeycomb tracing comes into play.
Honeycomb is also helping CCP Games’ devs become more comfortable with owning only their part of the microservices environment. “A big part of the cultural piece is the ability to build teams that can rely on one another by understanding where those boundaries are. When you can visualize those with tracing, that’s huge.”
With two successful migrations under its belt, it’s anyone’s guess where EVE Online will go next. Wherever the future takes it, Honeycomb will be along for the journey. “People ask, ‘Where’s Honeycomb in the new architecture?’” concluded Nick. “The answer is it’s everywhere. It’s built in.”
Want to explore the Honeycomb universe? Check out our blog page. If you want to give Honeycomb a try, sign up to get started.