Ben Explains Things: how your nginx logs can magically turn query strings into columns

If you’ve ever corresponded with us over email or Intercom, chances are you’ve chatted with Ben Hartshorne. Sometimes we see him send off pearls of wisdom and we think “damn! why is this buried in email, it should really be a blog post!”. Now it is.

SCENE: did you know that if you JUST stream your logs into Honeycomb from whatever load balancer you are running at the edge (haproxy/nginx/etc), and pass some timing information back from your app via HTTP headers, you basically have a functional full stack observability view with very little work??!? IT’S TRUE.

But most people don’t know the crazy fun tricks you can do with edge proxies like nginx. For example, a poor man’s structured data pipeline looks like this:

To: Celebrated Honeycomb Customer
From: Ben
Subject: Re: Your nginx config

Ooh, if I may suggest, as a first step, add some values to the RequestQueryKeys option in the config file (or --request_query_keys flag) to enumerate some of the query parameters in your URL structure. Or, if you're brave, change the RequestParseQuery (--request_parse_query flag) from whitelist to all (only do this on a dataset you can delete - it can quickly blow up the number of columns to something unusable).

This change will pull fields out of the query strings in your URL structure and create columns for them, allowing you to do all the filters, breakdowns, and calculations on them as you would other fields. For example, adding lang to the RequestQueryKeys parameter would create a column named request_query_lang with the value 'en' when it sees a URL like /foo/bar?lang=en. Once there, you could break down by language.

This is a super cheap way to get all sorts of information about your application into Honeycomb without making any changes to your existing configuration. For more details on the URL parsing bits of honeytail, see https://honeycomb.io/docs/connect/agent/#parsing-url-patterns

Let us know how it goes!
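
To make the mechanics concrete, here is a minimal Python sketch of the idea Ben describes: pull selected keys out of a URL's query string and emit them as request_query_<key> columns. This is an illustration only, not honeytail's actual code. The function name query_columns and its parameters are hypothetical, and keeping the first value when a key repeats is an assumption of this sketch.

```python
from urllib.parse import urlsplit, parse_qs


def query_columns(url, keys=None, prefix="request_query_"):
    """Return {column_name: value} for the query parameters found in url.

    keys: an iterable of parameter names to keep (the "whitelist" mode).
          None means keep every parameter (the "all" mode).
    """
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    wanted = None if keys is None else set(keys)
    columns = {}
    for key, values in params.items():
        if wanted is not None and key not in wanted:
            continue  # not listed, so no column is created for it
        # Assumption for this sketch: keep the first value if a key repeats.
        columns[prefix + key] = values[0]
    return columns


print(query_columns("/foo/bar?lang=en&debug=1", keys=["lang"]))
print(query_columns("/foo/bar?lang=en&debug=1"))
```

With keys=["lang"], only the lang parameter becomes a column. With keys=None, every parameter does, which is where the column count can get out of hand.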

… and a followup, explaining more of the landscape:

To: Most Celebrated Honeycomb Customer #1
From: Ben
Subject: Re: Re: Your nginx config

Hooray for good timing!! Yeah, I was really hopeful for the 'all' setting that just says "take every query parameter you see and make it a column". Then the terrible chaff of the internet hits your website with every webserver exploit known, in hopes that one sticks, and suddenly you have a hundred extra columns with names like ';cat../../../../../../etc/passwd'. That doesn't help anybody. So the alternative is to just list out the query parameter keys that you want and go from there. More annoying to configure, but it protects your data. I will say though, a really easy way to get a list of the most common query parameter fields is to enable 'all' and send the data to a temporary dataset, look at the data, choose the right keys, and then delete the temp dataset.
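
If you'd rather eyeball the candidates before touching any dataset, a rough offline analogue of the temp-dataset trick is to count which query parameter keys show up most often in a sample of request URLs. The sketch below assumes you already have the request URLs in a Python list. The function name common_query_keys is hypothetical, and none of this is part of honeytail.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def common_query_keys(urls, top=10):
    """Return the `top` most frequent query parameter keys as (key, count) pairs."""
    counts = Counter()
    for url in urls:
        pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
        # Count each key at most once per request so a single URL with a
        # repeated parameter does not skew the ranking.
        counts.update({key for key, _ in pairs})
    return counts.most_common(top)


sample = [
    "/a?lang=en&page=2",
    "/b?lang=fr",
    "/c?page=1&lang=de",
    "/d?x=1",
]
print(common_query_keys(sample, top=2))
```

The keys near the top of that list are the ones worth adding to RequestQueryKeys. The long tail of one-off junk keys is exactly what the whitelist keeps out.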

Thanks Ben!!

Sign up and try Honeycomb today!

Have thoughts on this post? Let us know via Twitter @honeycombio.