Ingesting JSON Logs From Containers With the OpenTelemetry Collector


It’s very common for applications to write structured logs to their console output (often referred to as stdout). Although a push-based approach like OTLP over gRPC/HTTP is preferred and has more benefits, many legacy systems still log this way, and they typically emit each log line as JSON.
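For example, an application might write a line like this to stdout (a trimmed-down version of the log we’ll look at in full later in this post):

{"Timestamp": "2024-10-30T17:45:30.0300402+00:00", "Level": "Information", "Properties": {"StatusCode": 404}}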

So, how do we get these JSON logs into a backend analysis system like Honeycomb that primarily accepts OTLP data? There are a few ways to achieve this; in this post, we’ll cover how to use the filelog receiver component in the OpenTelemetry Collector to parse JSON log lines from log files.

Transform processor

In OpenTelemetry Collector terms, we could “just” pull the log lines as a string of text, then use the Transform processor to parse the string and add that JSON as flattened fields. That would look something like this:

receivers:
  filelog:
    include:
    - /var/log/myapp/*.log
    include_file_name: false
    include_file_path: true

processors:
    transform/parse_json_body:
        error_mode: ignore
        log_statements:
          - context: log
            conditions:
              - body != nil and Substring(body, 0, 2) == "{\""
            statements:
              - set(cache, ParseJSON(body))
              - flatten(cache, "")
              - merge_maps(attributes, cache, "upsert")
              - set(time, Time(attributes["Timestamp"], "%Y-%m-%dT%H:%M:%S%j"))
              - set(severity_text, "TRACE") where attributes["Level"] == "Trace"
              - set(severity_number, 1) where attributes["Level"] == "Trace"
              - set(severity_text, "DEBUG") where attributes["Level"] == "Debug"
              - set(severity_number, 5) where attributes["Level"] == "Debug"

              - set(severity_text, "INFO") where attributes["Level"] == "Information"
              - set(severity_number, 9) where attributes["Level"] == "Information"

              - set(severity_text, "WARN") where attributes["Level"] == "Warning"
              - set(severity_number, 13) where attributes["Level"] == "Warning"

              - set(severity_text, "ERROR") where attributes["Level"] == "Error"
              - set(severity_number, 17) where attributes["Level"] == "Error"

              - set(severity_text, "FATAL") where attributes["Level"] == "Fatal"
              - set(severity_number, 21) where attributes["Level"] == "Fatal"

# pipelines omitted

Let’s break this down a little.

The first thing we want to do is ensure that we only run this transform on logs that actually contain JSON. This involves adding a condition that tests that the body isn’t nil and that the first two characters are {".

            conditions:
              - body != nil and Substring(body, 0, 2) == "{\""

Then we need to convert the JSON body into a Map object so that each of the properties in the JSON log body is parsed and becomes its own attribute in the payload. When we make the Map object, we need to store it somewhere. We could write it to another attribute right away, but because we need to do more transformations, we can use the built-in cache field, which behaves like an attribute but is discarded when the Transform processor finishes.

              - set(cache, ParseJSON(body))

Now, we take each property (even nested ones) from the Map of our body and move it into our main attributes. We do this with the flatten function, which takes the nested properties and places them as top-level keys in the cache. We could add a prefix to each attribute using the second parameter of the function, for instance something like “body_attributes” to distinguish them from other attributes, but that isn’t something we need in this example. Then, we can merge our cache of attributes into the main attributes of the log record.

              - flatten(cache, "")
              - merge_maps(attributes, cache, "upsert")
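If we did want that prefix, it’s passed as the second argument to flatten; for example (the prefix name here is just the illustration from the paragraph above):

              - flatten(cache, "body_attributes")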

In Honeycomb, we simplify this by performing the flatten and merge_maps automatically as flattened attributes at the root when you send us an attribute that’s a Map (regardless of the original name of the Map attribute).

Finally, we have to do some post-processing to set some of the known fields on a log record. Specifically, we need the severity fields severity_text and severity_number since, by default, the filelog receiver won’t set either of them. Since I know that the Level field in my JSON log contains that information, it’s possible to write statements that populate both fields.
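For example, the Information level is mapped to INFO like this:

              - set(severity_text, "INFO") where attributes["Level"] == "Information"
              - set(severity_number, 9) where attributes["Level"] == "Information"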

That’s it! You’ve written some Transform processor configuration using OTTL functions that converts JSON strings into Maps, and then converts those Maps into attributes. This solution is agnostic of whether the data comes in through the filelog receiver or some other mechanism. However, the Transform processor isn’t the only way to do this parsing and translation of the log data, and it might not be the right way.
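For completeness, a minimal service section wiring this processor into a logs pipeline might look something like the following (the otlp exporter here is just a placeholder for whatever you use to send data to your backend):

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [transform/parse_json_body]
      exporters: [otlp]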

Filelog receiver operators

Back in 2022, observIQ donated their Stanza log agent to the OpenTelemetry project in an effort to push forward the OpenTelemetry logging specification. This became part of the filelog receiver, providing what we now call operators. These operators can perform interesting functions on the filelog data before it makes it to the processors in the pipeline, but for this post, we’re going to focus on the json_parser and its timestamp and severity parsing options.

receivers:
  filelog:
    include:
    - /var/log/myapp/*.log
    include_file_name: false
    include_file_path: true
    operators:
    - type: json_parser
      if: body matches "^{.*}$"
      timestamp:
        parse_from: attributes.Timestamp
        layout: '%Y-%m-%dT%H:%M:%S%j'
      severity:
        parse_from: attributes.Level
        mapping: 
          info: Information
          warn: Warning

This is going to do essentially the same thing as the Transform processor configuration did, but more succinctly and with less performance overhead.

Let’s break this down. 

First, we have the same filelog settings as in the Transform example, but we follow them with an operators section containing the json_parser. We limit the parser so it only applies when the body starts and ends with a curly brace, indicating that it’s a stringified JSON object.

  filelog:
    include:
    - /var/log/myapp/*.log
    include_file_name: false
    include_file_path: true
    operators:
    - type: json_parser
      if: body matches "^{.*}$"

Next, we tell the parser which field on the JSON object contains the timestamp. At this point, the JSON has been moved into the attributes, so we can reference the field directly and tell the parser the exact format of the timestamp. The layout here uses strptime-style directives; you can also set layout_type to gotime to use Go’s native time layout format instead.

      timestamp:
        parse_from: attributes.Timestamp
        layout: '%Y-%m-%dT%H:%M:%S%j'
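A sketch of the gotime alternative, assuming the timestamp format shown in the example log later in this post:

      timestamp:
        parse_from: attributes.Timestamp
        layout_type: gotime
        layout: '2006-01-02T15:04:05.0000000-07:00'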

The most useful part of using the filelog operators is the severity parsing. It provides a convenient mapping syntax that lets you list the different strings that should match each severity level, and it then applies severity_text and severity_number accordingly, so you don’t have to keep the text and number values in sync yourself.
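For example, if your logs sometimes used Warning and sometimes Warn (the second value here is hypothetical), you could map both to the same severity:

      severity:
        parse_from: attributes.Level
        mapping:
          info: Information
          warn:
            - Warning
            - Warn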

Chaining filelog receiver operators

The final thing that’s really cool about the filelog operators approach is that you can chain them together, making it really efficient as you’re not rechecking conditions. 

In the example JSON log below, there is a property called Host. It actually contains both the hostname and the port, but wouldn’t it be better if those were separate properties? By chaining in the regex_parser operator, we can use a regular expression to create the new properties.

Let’s look at the example log line:

{
    "Timestamp": "2024-10-30T17:45:30.0300402+00:00",
    "Level": "Information",
    "MessageTemplate": "Request reached the end of the middleware pipeline without being handled by application code. Request path: {Method} {Scheme}://{Host}{PathBase}{Path}, Response status code: {StatusCode}",
    "TraceId": "81a1668f3bd064740af5e974242c3f39",
    "SpanId": "c30c15cd8637f6e6",
    "Properties": {
        "Method": "GET",
        "Scheme": "http",
        "Host": "localhost:5295",
        "PathBase": "",
        "Path": "/",
        "StatusCode": 404,
        "EventId": {
            "Id": 16
        },
        "SourceContext": "Microsoft.AspNetCore.Hosting.Diagnostics",
        "RequestId": "0HN7P1AIL6K64:00000001",
        "RequestPath": "/",
        "ConnectionId": "0HN7P1AIL6K64"
    }
}

We can add an additional operator to our filelog receiver’s operators array, and then tell the json_parser to send its output to the new regex_parser using its id. From there, we tell it to only trigger the regex if there’s a Properties.Host attribute, and if there is, run the regex against that attribute.

    operators:
    - type: json_parser
      # other config
      output: parse_hostport
    - type: regex_parser
      id: parse_hostport
      if: attributes.Properties.Host != nil
      regex: '(?P<hostname>[^,]+):(?P<port>.*)$'
      parse_from: attributes.Properties.Host
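With the regex_parser’s default parse_to, the named capture groups land as new top-level attributes, so for the example log above you should end up with something like:

hostname: localhost
port: "5295"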

Conclusion

That’s it! The filelog receiver is a lot more powerful than it first appears. The operators provide great ways to make our telemetry useful, and can do it in a more efficient, readable way.

Check out the filelog receiver and push the data to Honeycomb. You can then use this structured data with all the Honeycomb features like BubbleUp Anomaly Detection, advanced SLOs using high-cardinality data, and much more.

If you want to learn more about OpenTelemetry, I wrote a best practices series that you can find here: 

OpenTelemetry Best Practices #1: Naming

OpenTelemetry Best Practices #2: Agents, Sidecars, Collectors, Coded Instrumentation

OpenTelemetry Best Practices #3: Data Prep and Cleansing



