When you want to route your observability data in a uniform fashion, run an OpenTelemetry collector. If you have a Kubernetes cluster handy, that’s a useful place to run it. Helm is a quick way to get it running in Kubernetes; it encapsulates all the YAML object definitions that you need. OpenTelemetry publishes a Helm chart for the collector.
When you install the OpenTelemetry collector with Helm, you’ll give it some configuration: override the defaults and add your specific needs. To get your configuration right, play around in a test Kubernetes cluster until the collector works the way you want.
This post details how to start with a default installation of the OpenTelemetry Collector, and iterate the configuration until it works the way you want it to.
Prerequisites: a Kubernetes cluster you can experiment in, with kubectl and Helm installed.
Do this once: get the Helm chart working
Get access to the official Helm chart (as instructed in the README):
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
You need a name for the installation. I’m using “collectron.” It wants a lowercase RFC 1123 subdomain, so stick with lowercase letters, numbers, and dashes. If the name doesn’t include “opentelemetry-collector” the chart will append that for you.
Now, try installing for the first time:
helm install collectron open-telemetry/opentelemetry-collector
If you see an error like this:
Error: execution error at (opentelemetry-collector/templates/NOTES.txt:14:3): [ERROR] 'mode' must be set. See https://github.com/open-telemetry/opentelemetry-helm-charts/blob/main/charts/opentelemetry-collector/UPGRADING.md for instructions.
That’s a good sign. We need to give it some configuration.
Create a file to contain the chart configuration. I’m going to call it `values.yaml`.
In it, put `mode: deployment` (if you’re aiming to spin up a collector for front-end traces) or `mode: daemonset` (if the collector is going to process traces from other applications running in Kubernetes).
values.yaml:
mode: deployment
Having tried once, and having created a values.yaml for configuration, commence iteration.
Iterate on the Helm chart
Change values.yaml and save the file.
Update the installation:
helm upgrade collectron open-telemetry/opentelemetry-collector --values values.yaml
Check that exactly one is running:
kubectl get pods
Tail its log:
kubectl get pods -o name | grep collectron | sed 's#pod/##' | xargs kubectl logs -f
Send a test span. If you’re not sure how to do that, here’s a reference.
More explanation
Each time you change values.yaml, update the installation like this:
helm upgrade collectron open-telemetry/opentelemetry-collector --values values.yaml
Successful output looks like:
Release "collectron" has been upgraded. Happy Helming! NAME: collectron LAST DEPLOYED: Fri Jul 8 13:16:07 2022 NAMESPACE: default STATUS: deployed REVISION: 19 TEST SUITE: None NOTES:
Check that the collector is running
We expect Kubernetes to run a pod with a name that starts with the installation name, collectron. The Helm chart appends “opentelemetry-collector” if your name doesn’t already contain this.
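If the combined name gets unwieldy, most Helm charts (I believe this one included) let you pin the generated name with `fullnameOverride` in values.yaml. Verify the key against the chart’s own values.yaml before relying on it:

# Hypothetical override: give the generated objects a short, fixed name
# instead of <release-name>-opentelemetry-collector.
fullnameOverride: collectron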
Check what’s running with:
kubectl get pods
I see this line:
NAME                                                  READY   STATUS    RESTARTS   AGE
collectron-opentelemetry-collector-766b88bbf8-gr482   1/1     Running   0          2m18s
See the pod
Check that there is exactly one of them. Check the last column (AGE: how long ago the pod started) to see whether this one started up after your last helm upgrade.
Troubleshooting: My pod didn’t restart after the upgrade.
If your upgrade did not modify the collector config, then maybe it didn’t need to restart the pod. For instance, adding `service: LoadBalancer` to values.yaml doesn’t need a pod restart.
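As a sketch of what that looks like in values.yaml (I’m assuming the chart nests the type under service.type, the way most charts do; check the chart’s own values.yaml to confirm):

# Expose the collector through a cloud load balancer instead of the default ClusterIP service.
service:
  type: LoadBalancer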
For everything else: check the output of `helm upgrade`. Maybe there is an error message.
See the pod status
Check that the status is “Running.”
Troubleshooting: My pod stays in PENDING status forever.
Try:
kubectl describe pod <pod name>
This prints a lot more information, including why the pod is still pending. In my case, the output included:
Warning FailedScheduling 16s (x105 over 105m) default-scheduler 0/2 nodes are available: 2 Insufficient cpu, 2 Insufficient memory
All my nodes were full, so I added another one. Poof! Pod started!
Troubleshooting: My pod status is CrashLoopBackOff.
Something’s going wrong. Use `kubectl logs` to find out what.
Check on open ports
By default, a lot of ports are open on the collector container. If extra ports are open, that can confuse health checks and stop a load balancer from seeing your collector pods. You can check on this with:
kubectl describe pod <pod name>
Here’s a one-liner that will list the open ports with their names:
kubectl get pods -o name | grep opentelemetry-collector | sed 's#pod/##' | xargs kubectl get pod -o jsonpath='{range .spec.containers[].ports[*]}{.containerPort}{"\t"}{.name}{"\n"}{end}'
Look at the collector’s logs
The full name of the pod lets you request its logs. Copy that from the output of kubectl get pods and then pass it to kubectl logs (your pod name will be different):
kubectl logs collectron-opentelemetry-collector-766b88bbf8-gr482
Here’s a one-liner that keeps working even after the full name of the pod changes:
kubectl get pods -o name | grep opentelemetry-collector | sed 's#pod/##' | xargs kubectl logs
Hurray, logs! Now we have a feedback loop.
If the startup logs look OK, try sending a test span.
Is the collector doing what you want? If not, change values.yaml and repeat.
What to change next?
All your options are listed in the chart repository’s values.yaml.
You want a `config:` section in values.yaml. The Helm chart will take what you put here, combine it with its defaults, and produce the collector’s configuration file. You’ll definitely want to define some pipelines and exporters. The chart’s README has some examples.
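As a rough sketch of the shape: the `otlp` exporter is a real collector component, but the endpoint and header below are placeholders for whatever backend you send telemetry to, and the `otlp` receiver and `batch` processor referenced in the pipeline come from the chart’s defaults.

config:
  exporters:
    otlp:
      # Placeholder backend; substitute your own endpoint and credentials.
      endpoint: "my-telemetry-backend.example.com:4317"
      headers:
        "x-example-api-key": "YOUR_API_KEY"
  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [batch]
        exporters: [otlp]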
Note that when you turn off receivers you don’t need, you’ll also want to close the ports on the container. For instance, at the top level of values.yaml:
ports:
  jaeger-compact:
    enabled: false
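If you’ve turned off several of the default receivers, the same pattern repeats for each of the chart’s named ports. The names below come from the chart’s default values.yaml, so verify them against your chart version:

ports:
  jaeger-compact:
    enabled: false
  jaeger-thrift:
    enabled: false
  jaeger-grpc:
    enabled: false
  zipkin:
    enabled: false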
For examples of collectors that process metrics, there are some docs over at Lightstep.
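For a taste of what a metrics pipeline looks like in the same `config:` block, here’s a sketch. The prometheus receiver is a standard collector component, but the scrape job and target are placeholders, and the pipeline assumes an otlp exporter like the one sketched earlier:

config:
  receivers:
    prometheus:
      config:
        scrape_configs:
          # Placeholder scrape job; point this at whatever exposes Prometheus metrics.
          - job_name: "example-app"
            static_configs:
              - targets: ["example-app:9090"]
  service:
    pipelines:
      metrics:
        receivers: [prometheus]
        processors: [batch]
        exporters: [otlp]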
For a full example, here’s the configuration I use to send traces from my client-side apps to Honeycomb.
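As a rough sketch of the pieces that setup involves: a collector receiving browser traces usually needs CORS enabled on the OTLP HTTP receiver, plus an exporter pointed at Honeycomb. The endpoint and header name below come from Honeycomb’s documentation; the allowed origin and API key are placeholders.

config:
  receivers:
    otlp:
      protocols:
        http:
          cors:
            # Browsers only send spans if their page's origin is allowed here.
            allowed_origins:
              - "https://my-frontend.example.com"
  exporters:
    otlp:
      endpoint: "api.honeycomb.io:443"
      headers:
        "x-honeycomb-team": "YOUR_HONEYCOMB_API_KEY"
  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [batch]
        exporters: [otlp]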
Get started today
If you’re interested in what Honeycomb has to offer, create a free account today. You get state-of-the-art observability on up to 20 million events (spans) per month—a very useful free tier!
If you want to tell me about your particular experience or if you need more help with this tutorial, sign up for office hours. I’ll happily spend some 1:1 time to hear about your experience and walk you through the OpenTelemetry collector.