Getting JSON Logs into Honeycomb

JSON is one of the most flexible formats in the data landscape we have today, and our JSON connector is perfect for your application’s custom log data.

Unstructured text logs are so 2009; whether you’re primarily using Honeycomb, JSON over Logstash, or some other JSON-friendly service, pointing your existing logs at Honeycomb is simple.

Data Expectations

Our JSON connector expects to find one JSON object per line. By default, any structure deeper than the top-level keys will be flattened, and a string representation of its content will be used in the field.
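As a rough illustration of that input format, the Python sketch below writes events one JSON object per line and then mimics the default handling of nested values by turning them into strings. The function names (write_events, stringify_nested) and the sample events are hypothetical, and using json.dumps for the string representation is an assumption; the exact string Honeycomb stores may differ.

```python
import io
import json


def write_events(events, stream):
    """Write each event as a single-line JSON object (one object per line)."""
    for event in events:
        stream.write(json.dumps(event) + "\n")


def stringify_nested(event):
    """Approximate the default: keep top-level primitives, stringify the rest.

    Assumption: json.dumps stands in for whatever string form Honeycomb uses.
    """
    return {
        key: json.dumps(value) if isinstance(value, (dict, list)) else value
        for key, value in event.items()
    }


# Hypothetical sample events; the first one carries a nested object.
events = [
    {"color": "orange", "size": 3, "request": {"path": "/api", "status": 200}},
    {"color": "blue", "size": 4},
]

buffer = io.StringIO()  # stands in for an application log file
write_events(events, buffer)
lines = buffer.getvalue().splitlines()

# The nested "request" object ends up as a single string field.
flat = stringify_nested(json.loads(lines[0]))
print(flat["request"])
```

Writing to a real log file instead of the in-memory buffer only requires passing an open file object as the stream.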

If you want Honeycomb to unfurl nested JSON objects and flatten them into unique columns, you can enable this behavior in the Honeycomb UI under the dataset schema page. Using this setting:

Numbers will be interpreted as floats, and any non-primitive types will be serialized and stored as a JSON structure. This may change in the near future; please contact us to make your case!
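To make the unfurling behavior more concrete, here is a minimal Python sketch of how a nested event might be flattened into unique columns. It is not Honeycomb's implementation: the unfurl function, the dot used to join parent and child keys, and the treatment of lists as serialized JSON are all assumptions made for illustration.

```python
import json


def unfurl(event, prefix="", sep="."):
    """Flatten nested dicts into a single level of columns.

    Assumptions: column names join parent and child keys with `sep`, and
    lists count as non-primitive values that stay serialized as JSON.
    """
    columns = {}
    for key, value in event.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: recurse so each leaf becomes its own column.
            columns.update(unfurl(value, name, sep))
        elif isinstance(value, (bool, str)) or value is None:
            # bool is checked before numbers because bool subclasses int.
            columns[name] = value
        elif isinstance(value, (int, float)):
            # Numbers are interpreted as floats.
            columns[name] = float(value)
        else:
            # Any other non-primitive value is stored as serialized JSON.
            columns[name] = json.dumps(value)
    return columns


# Hypothetical event with one level of nesting and a list value.
event = {
    "color": "orange",
    "size": 3,
    "request": {"path": "/api", "status": 200, "tags": ["a", "b"]},
}
columns = unfurl(event)
for name in sorted(columns):
    print(name, repr(columns[name]))
```

The actual column names depend on the dataset's settings, so check the dataset schema page after enabling the option.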

Installation

Download and install the latest honeytail by running:

wget -q https://honeycomb.io/download/honeytail/linux/honeytail_1.462_amd64.deb && \
      echo '251b5fc2a249a4c20e466c928ab62d999eba7057c35a44ae3e1ba76acedcc0b5  honeytail_1.462_amd64.deb' | sha256sum -c && \
      sudo dpkg -i honeytail_1.462_amd64.deb

The packages install honeytail, its config file /etc/honeytail/honeytail.conf, and some start scripts. The binary is just honeytail, available if you need it in an unpackaged form or for ad-hoc use.

You should modify the config file and uncomment and set:

Launch the agent

Start up a honeytail process using upstart or systemd or by launching the process by hand. This will tail the log file specified in the config and leave the process running as a daemon.

$ sudo initctl start honeytail

Backfilling Archived Logs

To backfill existing data, run honeytail with --backfill the first time:

honeytail -c /etc/honeytail/honeytail.conf \
  --file /var/log/myapp/log12.json \
  --backfill

This command can also be used at any point to backfill from older, rotated log files. You can read more about our backfill behavior here.

Note: If you’ve chosen to backfill from old JSON logs, don’t forget to transition back to the default streaming behavior afterward, so that live logs are streamed to Honeycomb!

Timestamp parsing

Honeycomb expects all events to contain a timestamp field; if one is not provided, the server will associate the current time of ingest with the given payload.
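Because events without a timestamp silently take on the time of ingest, it can help to check a log file before sending it. The Python sketch below reports which lines carry none of a set of candidate timestamp fields. The function name lines_missing_timestamp is hypothetical, and the candidate list holds only the two field names this page mentions ("timestamp" and "time"), not the full list honeytail tries.

```python
import json

# Assumption: only the two candidate names mentioned on this page.
CANDIDATE_FIELDS = ("timestamp", "time")


def lines_missing_timestamp(lines, fields=CANDIDATE_FIELDS):
    """Return 1-based line numbers of events with none of the candidate fields."""
    missing = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        event = json.loads(line)
        if not any(field in event for field in fields):
            missing.append(number)
    return missing


# Hypothetical sample: the second event uses a custom field name.
sample = [
    '{"color": "orange", "time": "2016-08-12T23:12:06Z"}',
    '{"color": "blue", "server_time": "Sep 01 2016, 06:10:32 -0800"}',
]
print(lines_missing_timestamp(sample))
```

Lines flagged this way either need a recognized timestamp field or an explicit --json.timefield setting.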

By default, we look for a few candidate fields based on name (e.g. "timestamp", "time", etc) and handle the following time formats:

If your timestamps aren’t correctly handled by the above formats, use the --json.timefield and --json.format flags to help honeytail understand where and how to extract the event’s timestamp.

For example, given a JSON log file with events like the following:

{"color":"orange","size":3,"server_time":"Aug 12 2016, 15:12:06 -0800"}
{"color":"blue","server_time":"Sep 01 2016, 06:10:32 -0800","size":4}

The command to consume those log lines (while retaining the "server_time" field as the event’s timestamp) would look something like:

honeytail --writekey=YOUR_WRITE_KEY --dataset="API Server Logs" --parser=json \
  --file=/var/log/api_server.log \
  --json.timefield="server_time" --json.format="%b %d %Y, %k:%M:%S %z"

The --json.timefield="server_time" argument tells honeytail to consider the "server_time" value to be the canonical timestamp for the events in the specified file.

The --json.format argument specifies the timestamp format to be used while parsing. (It understands common strftime formats.)

Ultimately, the above command would produce events with the following fields (note that the times below are represented in UTC; Honeycomb parses time zone information if provided):

time                  color   size
2016-08-12T23:12:06Z  orange  3
2016-09-01T14:10:32Z  blue    4
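The conversion from the logged server_time values to the UTC times shown above can be reproduced with a short Python sketch. This only illustrates the arithmetic and is not how honeytail parses timestamps. Python's strptime does not support %k, so the sketch substitutes %H for the hour; that swap and the to_utc helper name are assumptions made here.

```python
from datetime import datetime, timezone

# Assumption: %H replaces %k, which Python's strptime does not support.
TIME_FORMAT = "%b %d %Y, %H:%M:%S %z"


def to_utc(raw, fmt=TIME_FORMAT):
    """Parse a timestamp with a UTC offset and render it in UTC."""
    parsed = datetime.strptime(raw, fmt)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_utc("Aug 12 2016, 15:12:06 -0800"))
print(to_utc("Sep 01 2016, 06:10:32 -0800"))
```

Trying a format string against a few sample values like this is a quick way to catch a mismatch before pointing honeytail at the file.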