As our software complexity increases, so does our telemetry—and as our telemetry increases, it needs more and more tweaking en route to its final destination. You’ve likely needed to change an attribute, parse a log body, or touch up a metric before it landed in your backend of choice.
At Honeycomb, we think the OpenTelemetry Collector is the perfect tool to handle data transformation in flight. The Collector can receive data, process it, and then export it wherever it needs to go. If you’re unfamiliar with the Collector, you can quickly review its architecture.
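As a rough sketch of that flow, here’s a minimal (hypothetical) Collector configuration that wires a receiver, the Transform processor, and an exporter into a traces pipeline. The component choices and the endpoint are placeholders, not recommendations:

receivers:
  otlp:
    protocols:
      grpc:

processors:
  transform:
    # OTTL statements go here (see the scenarios below)

exporters:
  otlp:
    endpoint: api.example.com:4317  # placeholder backend endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform]
      exporters: [otlp]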
To transform data in the Collector, there is no better solution than the Transform processor. This processor leverages the OpenTelemetry Transformation Language (OTTL) to express telemetry transformations. In this blog, we’ll share some real-life situations that we resolved with the power of OTTL.
The scenarios, and how to solve them
Set an attribute if HTTP status code exists and does not equal 200
In this situation, the user wanted to set an attribute named otel.http.status_code to ERROR if the attribute http.response.status_code existed and did not equal 200. This is a pretty simple situation, but expressing it in a data pipeline isn’t always possible. Luckily, OTTL allows you to specify conditions that need to match before executing the transformation. In this scenario, the solution is:
transform:
  error_mode: ignore
  trace_statements:
    - context: span
      statements:
        - set(attributes["otel.http.status_code"], "ERROR") where attributes["http.response.status_code"] != nil && attributes["http.response.status_code"] != 200
Replace all . with _
Here, the user wanted to replace all the . characters in their attribute keys with _. While it sounds simple, modifying key names in a data pipeline normally requires you to know the name of the key you want to change so you can set a new attribute, with the new name, using the old value. Doing this renaming dynamically is harder, but OTTL provides a simple solution:
transform:
  error_mode: ignore
  trace_statements:
    - context: span
      statements:
        - replace_all_patterns(attributes, "key", "\\.", "_")
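A note on the arguments: the second parameter selects whether the pattern applies to the map’s keys or its values, so passing "value" instead of "key" would rewrite the attribute values and leave the key names alone.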
Formatting OTTL
Sometimes, OTTL statements can get long and would greatly benefit from some formatting. Since OTTL statements are strings, you can take advantage of YAML’s | string format:
transform:
  error_mode: ignore
  trace_statements:
    - context: resource
      statements:
        - |
          keep_keys(attributes,
            [
              "http.method",
              "http.route",
              "http.url"
            ]
          )
Combine attribute values
In this scenario, the user wanted to set an attribute using the value from another attribute, but only when a different attribute matches a regex pattern. There are a lot of requirements here, but OTTL lets you handle them all in one statement:
transform:
  error_mode: ignore
  trace_statements:
    - context: span
      statements:
        - set(attributes["index"], Concat(["audit", attributes["k8s.namespace.name"]], "-")) where IsMatch(attributes["k8s.container.name"], "audit.*")
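To make that concrete with a made-up value: for a span where k8s.container.name is audit-logger and k8s.namespace.name is payments, this sets index to audit-payments. Spans whose container name doesn’t match the audit.* pattern are left untouched.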
Dropping metrics
OTTL conditions are useful in other processors as well, such as the Filter processor, where they determine when data should be dropped. In this scenario, the user wanted to drop an entire metric based on its name and whether the metric contained any datapoint with the attribute rule_result set to pass. Here’s how we solved it:
filter:
  error_mode: ignore
  metrics:
    metric:
      - 'name == "specific-name" and HasAttrOnDatapoint("rule_result", "pass")'
Remove a span’s parent span ID
In this scenario, the user had the unusual goal of removing the parent span ID from spans that came from a specific instrumentation. We don’t fully understand why this was necessary, since it’s a pretty risky transformation (you’d likely end up with multiple root spans in the same trace ID), but the Transform processor is all about freedom:
transform:
  trace_statements:
    - context: span
      statements:
        - set(parent_span_id, SpanID(0x0000000000000000)) where instrumentation_scope.name == "my-instrumentation-scope"
Parse JSON body into attributes
This is one of the most common scenarios users ask about. They have a log body that is a JSON string and they want to move the fields into the log attributes. The solution is:
transform:
  error_mode: ignore
  log_statements:
    - context: log
      statements:
        - merge_maps(cache, ParseJSON(body), "upsert") where IsMatch(body, "^\\{")
        - flatten(cache)
        - merge_maps(attributes, cache, "upsert")
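To illustrate with a hypothetical payload: a log whose body is the JSON string {"level": "info", "user": {"id": 42}} would come out with the attributes level set to info and user.id set to 42, since flatten collapses the nested map into dot-delimited keys.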
Reuse a condition
OpenTelemetry data is structured, and there is a lot of value you can get from that structure. For example, every span has a span kind, which describes the relationship between the span, its parents, and its children in a trace. In this scenario, the user wanted to rename an attribute when the span kind was SERVER. The Transform processor has a conditions option that lets you define conditions that apply to all statements in a group, which saves you from duplicating the same condition across multiple statements. Here’s the solution:
transform:
  error_mode: ignore
  trace_statements:
    - context: span
      # only run the statements for the span if the span passes this condition
      conditions:
        - kind == SPAN_KIND_SERVER
      statements:
        - set(attributes["http.route.server"], attributes["http.route"])
        - delete_key(attributes, "http.route")
Set a resource attribute using a datapoint attribute
The user had Kubernetes IP information on the datapoint attributes that they wanted to move to the resource attributes so that the k8sattributes processor would work correctly. The Transform processor is able to do this (although there are some caveats that have only been solved for logs) using a simple set command:
transform:
  metric_statements:
    - context: datapoint
      statements:
        - set(resource.attributes["k8s.pod.ip"], attributes["k8s_pod_ip"])
Use resource attributes in a log condition
In this scenario, the user wanted to drop a log if the body contained “info” anywhere in the string and if the log was from a specific Kubernetes namespace and app. They already associated the Kubernetes data to their log via the k8sattributes processor, which meant they could use the Filter processor to drop the data. The solution was:
filter:
  error_mode: ignore
  logs:
    log_record:
      - IsMatch(body, ".*info.*") and resource.attributes["namespace"] == "my-system" and resource.attributes["app"] == "my-app"
Drop specific datapoints
Here, the user wanted to drop datapoints from a metric named sample if the datapoint had an attribute named test whose value did not equal fail. The Filter processor allows you to access the metric name in the same way the log statement in the previous scenario could access the resource attributes:
filter:
  error_mode: ignore
  metrics:
    datapoint:
      - metric.name == "sample" and attributes["test"] != "fail"
Index a map and slice
The user needed to access a value within a JSON map they had parsed. OTTL’s grammar allows you to index a map or slice, assuming the underlying datatype actually is a map or a slice. In this case, the solution was:
transform:
  error_mode: ignore
  log_statements:
    - context: log
      statements:
        - merge_maps(cache, ParseJSON(body), "upsert") where IsMatch(body, "^\\{")
        - set(attributes["test"], "pass") where cache["body"] != nil and cache["body"]["keywords"] != nil and cache["body"]["keywords"][0] == "success"
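The nil guards are what keep this condition safe: indexing a key that doesn’t exist returns nil, so the checks let the condition fall through cleanly on logs that don’t carry the expected structure.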
Convert all resource attributes to attributes
The user wanted to move all their resource attributes to their log attributes. Unlike the resource-setting scenario above, there are no caveats when moving resource attributes “down” onto individual log records. The solution was:
transform:
  error_mode: ignore
  log_statements:
    - context: log
      statements:
        - merge_maps(attributes, resource.attributes, "upsert")
    - context: resource
      statements:
        - delete_matching_keys(attributes, ".*")
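Order matters in this config: the log-context statements run first, so every log record gets its copy of the resource attributes before the resource-context statement wipes them out.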
Create a time_bucket attribute for slow and fast spans
The user wanted to use a span’s duration in another pipeline stage. A span’s fields include its start and end timestamps in nanoseconds, but the goal was to classify each span’s speed as fast, mid, or slow.
transform:
  trace_statements:
    - context: span
      statements:
        - set(attributes["tp_duration_ns"], end_time_unix_nano - start_time_unix_nano)
        - set(attributes["time_bucket"], "fast") where attributes["tp_duration_ns"] < 10000000000
        - set(attributes["time_bucket"], "mid") where attributes["tp_duration_ns"] >= 10000000000 and attributes["tp_duration_ns"] < 120000000000
        - set(attributes["time_bucket"], "slow") where attributes["tp_duration_ns"] >= 120000000000
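Since span timestamps are in nanoseconds, those thresholds work out to under 10 seconds for fast, 10 to 120 seconds for mid, and over 120 seconds for slow; adjust the cutoffs to whatever makes sense for your workloads.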
OpenTelemetry + Honeycomb = ♥️
If you’d like to read more about the power of OpenTelemetry combined with Honeycomb, we have three great resources for you:
Read our whitepaper: The Director’s Guide to Observability: Leveraging OpenTelemetry in Complex Systems
Download our guide: Honeycomb & OpenTelemetry for In-Depth Observability
Aaand another whitepaper: How OpenTelemetry and Semantic Telemetry Will Reshape Observability