Building Queries in Honeycomb

When working with a dataset, the primary control for constructing queries is the Query Builder. Below is an example of the Builder, ready for you to modify the query’s clauses and apply those changes to run the query over your data.

The Query Builder

A query in Honeycomb consists of five clauses: Break Down, Calculate, Filter, Order, and Limit.

The default output for most queries is a time series graph and a summary table, though the precise composition of the output depends on your query’s clauses.

Let’s take a closer look at how to use each of these clauses, and at the cases in which each is particularly useful. In the discussion below, events are the inputs to a query (the series of raw payloads you sent that match a set of criteria), while results are the output of a query after any applicable processing or aggregation.

Break Down

Being able to separate a series of events into groups by attribute is a powerful way to compare segments of your dataset against each other.

For example, say you’ve collected the following events from your web server logs:

Timestamp         uri     status_code
2016-08-01 07:30  /about  500
2016-08-01 07:45  /about  200
2016-08-01 07:57  /docs   200

You might want to analyze your web traffic in groups based on the uri (“/about” vs “/docs”) or the status_code (500 vs 200). Choosing to break down by uri would return two result rows: one representing events in which uri="/about" and another representing events in which uri="/docs". Each of these grouped result rows is represented by a single line on a graph.

Breaking down by more than one attribute will consider each unique combination of values as a single group. Here, choosing to break down by both uri and status_code will return three groups: /about+500, /about+200, and /docs+200.
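
To make the grouping semantics concrete, here’s a minimal Python sketch (an illustration only, not Honeycomb’s implementation) that reproduces the three groups from the sample events above:

```python
from collections import defaultdict

# Sample web server events from the table above.
events = [
    {"uri": "/about", "status_code": 500},
    {"uri": "/about", "status_code": 200},
    {"uri": "/docs",  "status_code": 200},
]

# Break Down by uri and status_code: each unique combination
# of values becomes its own group.
groups = defaultdict(list)
for event in events:
    key = (event["uri"], event["status_code"])
    groups[key].append(event)

for key, members in groups.items():
    print(key, len(members))
# ('/about', 500) 1
# ('/about', 200) 1
# ('/docs', 200) 1
```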

Breakdowns, paired with calculations, can often reveal interesting patterns in your underlying events—breaking down by uri, for example, and calculating response time stats will show you the slowest (or fastest) uris.

TIP: Honeycomb supports breaking down your data based on any attribute in an event, though you’ll likely receive the clearest results by choosing an attribute with an uneven distribution within your data.

Calculate

Honeycomb supports a wide range of calculations to provide insight into your events. When a Break Down is specified, calculations occur within each group; otherwise, they are performed over all matching events.

For example, say you’ve collected the following events from your web server logs:

Timestamp         uri     status_code  response_time_ms
2016-08-01 07:30  /about  500          126
2016-08-01 07:45  /about  200          57
2016-08-01 07:57  /docs   200          82
2016-08-01 08:03  /docs   200          23

Specifying a calculation for a particular attribute (e.g. P95(response_time_ms)) applies the aggregation function (in this case P95, the 95th percentile) to that attribute’s values (response_time_ms) across all input events.

Defining multiple “calculate” clauses is common and can be useful, especially when comparing the outputs of each of the calculations (e.g. comparing the AVG to the P95 of some value).

Calculate: Basic Case

Scenario: we want to capture overall statistics for our web server. Given our four-event dataset described above, consider a query which contains:

Calculate: COUNT
Calculate: AVG(response_time_ms)
Calculate: P95(response_time_ms)

These calculations would return statistics across the entirety of our dataset:

COUNT  AVG(response_time_ms)  P95(response_time_ms)
4      72                     119.4
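
For the curious, here’s a short Python sketch that reproduces these numbers from the sample data. It assumes linear interpolation between closest ranks for the percentile, which reproduces the 119.4 above; Honeycomb’s internal percentile algorithm may differ:

```python
# The four response_time_ms values from the sample events above.
times = [126, 57, 82, 23]

def p95(values):
    # Percentile via linear interpolation between closest ranks --
    # one common convention, assumed here for illustration.
    ordered = sorted(values)
    pos = 0.95 * (len(ordered) - 1)
    lower = int(pos)
    frac = pos - lower
    if lower + 1 == len(ordered):
        return ordered[lower]
    return ordered[lower] + frac * (ordered[lower + 1] - ordered[lower])

print(len(times))                    # COUNT: 4
print(sum(times) / len(times))       # AVG:   72.0
print(round(p95(times), 2))          # P95:   119.4
```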

Calculate: Adding a Break Down

Scenario: we want to examine performance of our web server by endpoint. Given our four-event dataset described above, consider a query which contains:

Break Down: uri
Calculate: COUNT
Calculate: AVG(response_time_ms)

Pairing a Break Down clause with Calculate clauses means events are first grouped by uri, and statistics are then calculated within each group:

uri     COUNT  AVG(response_time_ms)
/about  2      91.5
/docs   2      52.5
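
A minimal Python sketch of the same group-then-aggregate behavior, again purely illustrative:

```python
from collections import defaultdict

# The four sample events from the table above.
events = [
    {"uri": "/about", "response_time_ms": 126},
    {"uri": "/about", "response_time_ms": 57},
    {"uri": "/docs",  "response_time_ms": 82},
    {"uri": "/docs",  "response_time_ms": 23},
]

# Break Down by uri, then Calculate COUNT and AVG within each group.
groups = defaultdict(list)
for event in events:
    groups[event["uri"]].append(event["response_time_ms"])

for uri, times in groups.items():
    print(uri, len(times), sum(times) / len(times))
# /about 2 91.5
# /docs 2 52.5
```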

This is particularly powerful when paired with an Order and a Limit to return “Top K”-style results.

Filter

Sometimes you want to constrain the events by some attribute besides time: ignoring an outlier case, for example, or isolating events triggered by a particular actor or circumstance.

For example, say you’ve collected the following events from your web server logs:

Timestamp         uri     status_code
2016-08-01 08:15  /about  500
2016-08-01 08:22  /about  200
2016-08-01 08:27  /docs   403

You can define any number of arbitrary constraints based on event values. Filters work in concert with the specified time range to define the events that are ultimately considered by any Break Down or Calculate clauses.

Filter: Basic Case

Scenario: we want to understand the frequency of unsuccessful web requests. Given our three-event dataset described above, consider a query which contains:

Filter: status_code != 200
Break Down: uri
Calculate: COUNT

The Filter clause removes the successful event (our /about web request returning a 200) from consideration, so only the first and third events count towards our Break Down and Calculate clauses:

uri     COUNT
/about  1
/docs   1
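
Here’s an illustrative Python sketch of the same filter-then-count pipeline over the sample events (not Honeycomb’s implementation):

```python
# The three sample events from the table above.
events = [
    {"uri": "/about", "status_code": 500},
    {"uri": "/about", "status_code": 200},
    {"uri": "/docs",  "status_code": 403},
]

# Filter: status_code != 200, then Break Down by uri and COUNT.
counts = {}
for event in events:
    if event["status_code"] != 200:  # the Filter clause
        uri = event["uri"]
        counts[uri] = counts.get(uri, 0) + 1

print(counts)  # {'/about': 1, '/docs': 1}
```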

Filter: Multiple Clauses

Scenario: we want to refine our constraints further, to span multiple attributes for each event. Stacking Filter clauses returns events that satisfy the intersection of all specified Filters. Given our three-event dataset described above, consider a query which contains:

Filter: uri = /about
Filter: status_code = 500

All three events are evaluated against the Filter clauses, but only the first satisfies both:

Timestamp         uri     status_code
2016-08-01 08:15  /about  500

Note that we return full-resolution events as the result when no Break Down or Calculate clauses are specified.
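
A small Python sketch of this intersection behavior, returning the full events since no Break Down or Calculate is involved (illustrative only):

```python
# The three sample events from the table above.
events = [
    {"timestamp": "2016-08-01 08:15", "uri": "/about", "status_code": 500},
    {"timestamp": "2016-08-01 08:22", "uri": "/about", "status_code": 200},
    {"timestamp": "2016-08-01 08:27", "uri": "/docs",  "status_code": 403},
]

# Stacked Filters: an event must satisfy ALL of them (intersection).
matches = [
    e for e in events
    if e["uri"] == "/about" and e["status_code"] == 500
]

# With no Break Down or Calculate, full-resolution events are returned.
print(matches)
# [{'timestamp': '2016-08-01 08:15', 'uri': '/about', 'status_code': 500}]
```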

Order and Limit

Order clauses define an ordering on result rows, while Limit clauses cap the total number of result rows returned. They can be used independently but are most powerful together, capturing the “Top K” of some set of results.

For example, say you’ve collected the following events from your web server logs:

Timestamp         uri     status_code  response_time_ms
2016-08-01 09:17  /about  200          57
2016-08-01 09:18  /about  500          234
2016-08-01 09:20  /404    200          12
2016-08-01 09:25  /docs   200          82

You can define any number of Order clauses in a query and they will be respected in the order they’re specified.

The Order clauses available to you for a particular query are influenced by whether any Break Down or Calculate clauses are also specified. If none are, you may order by any of the attributes contained in the dataset. However, once a Break Down or Calculate clause exists, you may only order by the values generated by those clauses.
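
To illustrate how stacked Order clauses compose, here’s a Python sketch using a composite sort key; the specific ordering shown (uri ascending, then response_time_ms descending) is a hypothetical example, not one of the queries below:

```python
# Earlier Order clauses take precedence over later ones, like the
# elements of a composite sort key.
events = [
    {"uri": "/about", "response_time_ms": 57},
    {"uri": "/about", "response_time_ms": 234},
    {"uri": "/404",   "response_time_ms": 12},
    {"uri": "/docs",  "response_time_ms": 82},
]

# Order: uri ascending, then response_time_ms descending.
ordered = sorted(events, key=lambda e: (e["uri"], -e["response_time_ms"]))
for e in ordered:
    print(e["uri"], e["response_time_ms"])
# /404 12
# /about 234
# /about 57
# /docs 82
```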

Order: Basic Case

Scenario: we just want to get a sense of the slowest requests hitting our web server. Given our four-event dataset described above, consider a query which contains:

Order: response_time_ms descending
Limit: 1

Remember that when no Break Down or Calculate clauses are defined, we simply return full-resolution events as the result rows:

Timestamp         uri     status_code  response_time_ms
2016-08-01 09:18  /about  500          234

The web UI also supports reordering results in the browser.
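
An illustrative Python sketch of this order-then-limit behavior over the sample events:

```python
# The four sample events from the table above.
events = [
    {"timestamp": "2016-08-01 09:17", "uri": "/about", "response_time_ms": 57},
    {"timestamp": "2016-08-01 09:18", "uri": "/about", "response_time_ms": 234},
    {"timestamp": "2016-08-01 09:20", "uri": "/404",   "response_time_ms": 12},
    {"timestamp": "2016-08-01 09:25", "uri": "/docs",  "response_time_ms": 82},
]

# Order: response_time_ms descending; Limit: 1.
slowest = sorted(events, key=lambda e: e["response_time_ms"], reverse=True)[:1]
print(slowest[0]["uri"], slowest[0]["response_time_ms"])  # /about 234
```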

Order: Paired with Break Down and Calculate clauses

Scenario: we want to capture statistics for our web server and know what we’re looking for (long response_time_ms values). Given our four-event dataset described above, consider a query which contains:

Break Down: uri
Calculate: P95(response_time_ms)
Order: P95(response_time_ms) descending
Limit: 2

Our Break Down and Calculate clauses determine what is returned as result rows (the uri and the P95(response_time_ms) for events within each distinct uri group), while the Order clause determines the sort order of those rows (longest P95(response_time_ms) first) and the Limit discards any results beyond the top 2:

uri     P95(response_time_ms)
/about  225.15
/docs   82

As you can see, the row for uri="/404" was excluded from our result set because of its relatively low response_time_ms.

This sort of Top K query is particularly valuable when working with high-cardinality data sets, where a Break Down clause might split your dataset into a very large number of groups.
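
Putting it all together, here’s an illustrative Python sketch of the full Top K pipeline (break down, calculate, order, limit); as before, the linear-interpolation percentile is an assumption and may differ from Honeycomb’s internals:

```python
from collections import defaultdict

def p95(values):
    # Linear interpolation between closest ranks (assumed convention).
    ordered = sorted(values)
    pos = 0.95 * (len(ordered) - 1)
    lower = int(pos)
    frac = pos - lower
    if lower + 1 == len(ordered):
        return ordered[lower]
    return ordered[lower] + frac * (ordered[lower + 1] - ordered[lower])

events = [
    {"uri": "/about", "response_time_ms": 57},
    {"uri": "/about", "response_time_ms": 234},
    {"uri": "/404",   "response_time_ms": 12},
    {"uri": "/docs",  "response_time_ms": 82},
]

# Break Down by uri, Calculate P95, Order descending, Limit 2.
groups = defaultdict(list)
for event in events:
    groups[event["uri"]].append(event["response_time_ms"])

rows = [(uri, p95(times)) for uri, times in groups.items()]
rows.sort(key=lambda row: row[1], reverse=True)
for uri, value in rows[:2]:
    print(uri, round(value, 2))
# /about 225.15
# /docs 82
```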


Want more examples? Ask! We’re happy to help.