Building a Secure OpenTelemetry Collector

The OpenTelemetry Collector is a core part of telemetry pipelines, which makes it one of the parts of your infrastructure that must be as secure as possible. The general advice from the OpenTelemetry teams is to build a custom Collector executable instead of using the supplied ones when you’re using it in a production scenario. However, that isn’t an easy task, and that prompted me to build something.

In this post, we’ll go through how to build a custom Collector, including the new way I created using the standard OpenTelemetry Collector configuration.

If you just want to see the new stuff, take a look at the repository, or read the last section of this blog.

What does OpenTelemetry provide?

The OpenTelemetry team provides Docker images you can use:

  • OpenTelemetry core
    This is a limited image and includes components that are maintained by the core OpenTelemetry Collector team. The manifest includes all the components from the base OpenTelemetry Collector repo, but also includes some of the most commonly used components from the contrib repo, like the filter and attributes processors, and common exporters like Jaeger and Zipkin.
  • OpenTelemetry contrib
    This is the kitchen sink version. The manifest includes almost everything from the core and contrib repos, with some omissions where the components are in development.

Why aren’t these enough?

These images include too much of an attack surface. It’s as simple as that, in my eyes. Remember, code is a liability: aim for less, not more.

These images, even the core one, have more components than anyone requires. As an example, I’ve not yet come across a user (and I talk to a lot of you) that uses the OTLP, Zipkin, and Jaeger exporters in the same Collector. Including those would be a bad security practice as you’d include potential attack vectors that you don’t need.

For example, there was a long-standing vulnerability with the Jaeger receiver that couldn’t be fixed until a Go upgrade was done. However, removing the Jaeger receiver wasn’t an option as it would break people’s environments expecting it. So, it had to be shipped.

You could use the argument that unless a component is in a pipeline it’s not executed, so it’s not a vulnerability. I’d love to see you convince security teams of that! Regardless, unless you do a code review and understand the vulnerability, you can’t definitively say that’s the case. And that’s not something the OpenTelemetry Collector teams can do at scale.

What is the OpenTelemetry Collector builder?

The OpenTelemetry Collector team (specifically Juraci, I believe) decided that creating OpenTelemetry Collector images should be easier, and people should be able to choose their components. Therefore, they created a tool that takes a manifest.yaml file that specifies the Go modules to include, and uses that to build a targeted distribution with a limited set of components.

A manifest looks like this:

dist:
  name: otelcol-custom
  description: OpenTelemetry Collector
  version: 0.91.0
  otelcol_version: 0.91.0
 
receivers:
  - gomod: go.opentelemetry.io/collector/receiver/otlpreceiver v0.91.0
 
exporters:
  - gomod: go.opentelemetry.io/collector/exporter/debugexporter v0.91.0
  - gomod: go.opentelemetry.io/collector/exporter/loggingexporter v0.91.0
  - gomod: go.opentelemetry.io/collector/exporter/otlpexporter v0.91.0
  - gomod: go.opentelemetry.io/collector/exporter/otlphttpexporter v0.91.0
 
extensions:
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/extension/healthcheckextension v0.91.0
 
processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.91.0
  - gomod: go.opentelemetry.io/collector/processor/memorylimiterprocessor v0.91.0
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/processor/attributesprocessor v0.91.0
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/processor/filterprocessor v0.91.0

This generates a very minimal Collector which, honestly, is what most people use. However, the module paths aren’t uniform: some components come from the core repo (go.opentelemetry.io/collector) and others from the contrib repo (github.com/open-telemetry/opentelemetry-collector-contrib). You need to know where each component lives, you need to know the exact path syntax, and you need some Go knowledge to know what gomod means.
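To make the difference between the two module paths concrete, here is a minimal Python sketch that formats a gomod line for a component. It is not part of the builder: the helper name `gomod_line`, the `CONTRIB` set, and the naming rule (strip underscores from the config type, then append the component kind) are assumptions for illustration. The rule holds for the components in the manifest above, but may not hold for every component.

```python
# Hypothetical helper for illustration; not part of the OpenTelemetry Collector builder.
CORE_PREFIX = "go.opentelemetry.io/collector"
CONTRIB_PREFIX = "github.com/open-telemetry/opentelemetry-collector-contrib"

# Assumption: the (kind, type) pairs from the manifest above that live in contrib.
CONTRIB = {
    ("extension", "health_check"),
    ("processor", "attributes"),
    ("processor", "filter"),
}


def gomod_line(kind: str, component_type: str, version: str) -> str:
    """Format one manifest entry for a component.

    kind is "receiver", "processor", "exporter" or "extension";
    component_type is the name used in the Collector config, e.g. "otlp".
    """
    prefix = CONTRIB_PREFIX if (kind, component_type) in CONTRIB else CORE_PREFIX
    # Assumed naming rule: "memory_limiter" + "processor" -> "memorylimiterprocessor".
    package = component_type.replace("_", "") + kind
    return f"- gomod: {prefix}/{kind}/{package} v{version}"


if __name__ == "__main__":
    print(gomod_line("receiver", "otlp", "0.91.0"))
    print(gomod_line("extension", "health_check", "0.91.0"))
```

The point of the sketch is that the person writing the manifest has to carry this lookup table in their head: which repo a component lives in, and how its config name turns into a Go package name.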

The builder itself is solid, and I use it all the time to build custom images. However, I’ve been striving for a way to make it easier in pipelines and more accessible to people.

Building a custom Collector with a two-stage build

One of the ways to make using the Collector builder easier is to use a two-stage build and run everything inside the first container.

FROM golang:1.21 as build
ARG OTEL_VERSION=0.91.0
WORKDIR /app
RUN go install go.opentelemetry.io/collector/cmd/builder@v${OTEL_VERSION}
COPY . .
RUN CGO_ENABLED=0 builder --config=manifest.yaml --output-path=/app
 
FROM cgr.dev/chainguard/static:latest
COPY --from=build /app/otelcol-custom /
COPY config.yaml /
EXPOSE 4317/tcp 4318/tcp 13133/tcp
 
CMD ["/otelcol-custom", "--config=/config.yaml"]

With this, you still need to manually generate the manifest.yaml. However, you don’t need to install the builder or the Go SDK locally. This is really useful if you’re not interested in writing Go, or if you want to do things in a release pipeline. But we can do better!

Building a custom Collector with ocb-config-builder

I thought about how we could build the manifest file automatically, remove the gomod problems, and let developers pick the components they want rather than the Go modules.

I played with some custom YAML formats, but it all felt a little weird. Too many abstractions? Then, chatting with Tyler Helmuth about issues with the Collector builder, we had an epiphany… What if we just use the Collector config itself and map it?

This clicked with me. It’s DevEx 101 really: don’t leak your internal decisions to the user; speak their language and make it simple for them. Thus, the “OpenTelemetry Collector Builder Config Builder” was born! Yes, the name is bad. Naming is hard.
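As a rough illustration of that mapping idea, the Python sketch below walks an already-parsed Collector config and collects the component types it declares. It is not the code from ocb-config-builder: the function name `component_types`, the sample config, and its endpoints are hypothetical. It relies only on the Collector convention that a component ID is written as `type` or `type/name`.

```python
# Hypothetical sketch of the mapping idea; not taken from ocb-config-builder.
SECTIONS = ("receivers", "processors", "exporters", "extensions")


def component_types(config: dict) -> dict:
    """Return the sorted, de-duplicated component types for each section.

    Collector component IDs have the form "type" or "type/name", so
    "otlp" and "otlp/backup" both need the same otlp module.
    """
    result = {}
    for section in SECTIONS:
        ids = config.get(section) or {}
        result[section] = sorted({component_id.split("/", 1)[0] for component_id in ids})
    return result


# Sample config as a dict, i.e. what a YAML parser would return for config.yaml.
# The endpoints are placeholders.
sample_config = {
    "receivers": {"otlp": {"protocols": {"grpc": None, "http": None}}},
    "processors": {"batch": None},
    "exporters": {
        "otlp": {"endpoint": "backend.example:4317"},
        "otlp/backup": {"endpoint": "backup.example:4317"},
    },
    "extensions": {"health_check": None},
    "service": {
        "extensions": ["health_check"],
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "processors": ["batch"],
                "exporters": ["otlp", "otlp/backup"],
            }
        },
    },
}

if __name__ == "__main__":
    print(component_types(sample_config))
```

Each type found this way would then need to be resolved to a Go module, which is exactly the knowledge the user no longer has to supply. A stricter variant could read only the IDs referenced under `service`, so that components declared but never wired into a pipeline are skipped too.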

We still need to do a two-stage Docker build. However, we skip the step of writing the manifest.yaml. As a side benefit, the Collector executable will never contain components that the config doesn’t use.

What does it look like?

FROM ghcr.io/martinjt/ocb-config-builder:latest as build
COPY config.yaml /config/config.yaml
RUN /builder/build-collector.sh /config/config.yaml
 
FROM cgr.dev/chainguard/static:latest
COPY --from=build /app/otelcol-custom /
COPY config.yaml /
EXPOSE 4317/tcp 4318/tcp 13133/tcp
 
CMD ["/otelcol-custom", "--config=/config.yaml"]

Just make sure that your Collector config file is called config.yaml (or change the name in the Dockerfile), and you’ll have a tightly scoped Collector executable in a secure container, using the Chainguard base image, ready to run in production.

Conclusion

Having a custom OpenTelemetry Collector build doesn’t have to be complicated anymore. You don’t need to understand Go, or write builder config files. You can include this in your pipelines as a drop-in replacement for the collector-contrib image you’re likely using in production right now.

Hopefully this helps!

Happy collecting.
