Rescue Struggling Pods from Scratch


Containers are an amazing technology. They provide huge benefits and create useful constraints for distributing software. Golang-based software doesn’t need a container the way Ruby or Python software does, where the container bundles the runtime and dependencies. For a statically compiled Go application, the container doesn’t need much beyond the binary itself.

Since the software is intended to run in a Kubernetes cluster, the container is the release and distribution mechanism, and the Helm chart refers to these images. It also allows each supported processor architecture to get its own image. For general troubleshooting, some pretty good resources exist for both Refinery and the OpenTelemetry Collector.

One type of troubleshooting is unfortunately absent from distroless containers. The kind that requires getting into a shell.

% kubectl exec -it -n otel opentelemetry-collector-56469989d-q74cg -- /bin/sh
error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "07c8620b9e9707ac9d4be9695c527cf02a8374fff4be52be3bb04db3fd73af05": OCI runtime exec failed: exec failed: unable to start container process: exec: "/bin/sh": stat /bin/sh: no such file or directory: unknown

You can’t run /bin/sh or /bin/bash or any other shell because they’re not in the image. 

Images FROM scratch

To reduce the risk exposure of the released binaries, both Refinery and the OpenTelemetry Collector use scratch as the base image. The scratch image has absolutely nothing in it. This means no security issues from bundled dependencies or other operating system components.

The container image build step during release does get fresh certificate authority certificates, but nothing else. This is generally seen as the right approach for releasing a statically compiled binary because of the very low risk.
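As a rough sketch (paths and names here are illustrative, not the projects’ actual release Dockerfiles), a scratch-based image for a static Go binary usually looks something like this:

# Build stage: compile a fully static binary so it needs no libc in the final image
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app ./cmd/app

# Release stage: start from nothing, adding only CA certificates and the binary
FROM scratch
COPY --from=build /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]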

Troubleshooting a struggling Collector pod

Sometimes, a Kubernetes pod (or any software) will struggle to get started or continue running. We often need to answer questions like:

  • Can the container see DNS and make other network connections?
  • Are the configmap and secrets mounted and properly formatted?
  • Does the binary throw any errors that aren’t showing in the pod logs?
  • Are file permissions set right?
  • Does the pod’s security context block anything needed to run?

A lot of these questions can be answered by checking logs, errors, and Kubernetes events. Sometimes, it requires tweaking the deployment or pod spec to see if that change has an effect, and then passively monitoring for new errors. 
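The usual starting points for gathering those signals look something like this (namespace and pod name from the earlier example):

kubectl describe pod -n otel opentelemetry-collector-56469989d-q74cg
kubectl logs -n otel opentelemetry-collector-56469989d-q74cg --previous
kubectl get events -n otel --sort-by=.lastTimestamp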

These passive signals can be slow. It’s also difficult to know whether a change introduced confounding issues (or other side effects) when rescheduling. Maybe it only works on one node and not the other, so every time a configuration change happens—whether it’s good or bad—the pod starts or stops working based on the scheduler’s decisions, rather than the configuration itself. 

One way to get to the bottom of the answer quickly is to shell into the container inside the failing pod and see what’s happening!

Shelling into a distroless pod

It doesn’t work. There are no shells in the container.

unable to start container process: exec: "/bin/sh": stat /bin/sh: no such file or directory: unknown

Here are a few approaches people may take in this situation:

  • Add a sidecar container to the pod which can be used for troubleshooting
    1. The easy way: if you have a 1.25 (or later) cluster, use the kubectl debug command
      • kubectl debug -n otel -it opentelemetry-collector-56469989d-q74cg --image=busybox:1.28 --target=opentelemetry-collector
      • Can see the process, but not the config file
        1. There’s a workaround and open issue to resolve this 
    2. The hard way: modify the deployment to add the sidecar with mounts (a rough sketch follows this list)
      • Can validate some of the pod spec and some connectivity concerns
      • Doesn’t validate container configurations, such as container SecurityContext
      • Doesn’t validate other node-level items that are still hard to answer
      • Significant change to pod spec
    3. Either way: Can’t run the otel-collector-contrib binary from the other container
  • Run a completely different container in its place for troubleshooting
    1. Can test connectivity, mounted configurations, and environment variables
    2. Can’t run the otel-collector-contrib binary from the other container
    3. Can’t see if there’s a filesystem conflict with the container
    4. Mild change to deployment spec
      • Image and command need to be changed
  • Wrap the existing image contents with a thicker image
    1. Image includes troubleshooting commands and the binary from the latest release
    2. Shows all mounts, configurations, security contexts, etc
    3. Minimal deployment spec change
      • Just the image
      • Change command to sleep 20000 if container won’t start
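For “the hard way” above, the sidecar added to the deployment’s pod spec might look roughly like this; the container name, image, and volume name are illustrative and need to match whatever the chart already mounts:

containers:
  - name: debug-sidecar
    image: busybox:1.28
    command: ["sleep", "86400"]
    volumeMounts:
      - name: opentelemetry-collector-configmap
        mountPath: /etc/otel
        readOnly: true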

All suggestions listed above are temporary since they reintroduce the risk that was avoided by using a distroless image.

Quite a few prebuilt debug, sidecar, and replacement images exist out in the world for various kinds of troubleshooting. I’d suggest building your own with the tools you need if you plan to take this route. The pre-bundled troubleshooting images typically run as root and may carry old or dangerous applications, or even scripts which you don’t want in your cluster.
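If you go that route, a minimal, non-root image of your own might look like this (the tool list is just an example):

FROM ubuntu:latest
# Only the tools you actually want available in the cluster
RUN apt-get update && apt-get install -y --no-install-recommends \
      ca-certificates curl dnsutils netcat-openbsd \
    && rm -rf /var/lib/apt/lists/*
# Run as a non-root UID rather than the default root
USER 10001
CMD ["sleep", "infinity"]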

Thickening the application image

In container image terminology, a slim image has less in it than the default image. In this case, we want to go the other way and add to a scratch (slimmest possible) image.

To do this, you can use a Dockerfile (example below) that takes two images and pulls the contents of one into the other.

The idea is to use Docker’s multi-stage build with these steps:

  1. Bring in the release image, which is just the binary
  2. Copy the binary into the ubuntu:latest image
  3. Use original image’s paths and environment variables

OpenTelemetry Collector thick image

This example takes the latest OpenTelemetry Collector contrib image and pulls the contents into an ubuntu:latest image. Check tags for alternative processor architectures if you need them.

# Stage 1: the release image, which contains only the collector binary
FROM otel/opentelemetry-collector-contrib:latest AS binary

# Stage 2: a full Ubuntu userland to hold that binary plus troubleshooting tools
FROM ubuntu:latest

# Run as a non-root UID rather than root
ARG USER_UID=10001
USER ${USER_UID}

# Pull everything from the release image's root into the Ubuntu filesystem
COPY --from=binary /* /

# Exposed ports and the default entrypoint/command
EXPOSE 4317 55680 55679
ENTRYPOINT ["/otelcol-contrib"]
CMD ["--config", "/etc/otel/config.yaml"]

Refinery thick image

To create a Refinery image for troubleshooting, you can follow a similar pattern:

# Stage 1: the release image, which keeps the binary under /ko-app
FROM honeycombio/refinery:2.1.0 AS binary

# Stage 2: a full Ubuntu userland for troubleshooting
FROM ubuntu:latest

# Copy the binary to the same path the release image uses
COPY --from=binary /ko-app/refinery /ko-app/refinery

EXPOSE 4317 8080 8081

# Match the entrypoint, PATH, and environment of the 2.1.0 release
ENTRYPOINT ["refinery", "-c", "/etc/refinery/config.yaml", "-r", "/etc/refinery/rules.yaml"]
ENV PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/ko-app \
    SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt \
    KO_DATA_PATH=/var/run/ko

VOLUME /etc/refinery/config.yaml /etc/refinery/rules.yaml

Note that the Refinery GitHub repository’s Dockerfile doesn’t match how the image is built. You can use docker cp to copy the file contents of the image into a temporary directory to find the binary if it moves. The /ko-app directory and PATH are set in the 2.1.0 release, so I put them in the thick image example. This way, it can act as a drop-in replacement.
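For example, one way to dump the image’s filesystem into a local directory and look for the binary (the container name here is arbitrary):

docker create --name refinery-peek honeycombio/refinery:2.1.0
docker cp refinery-peek:/ ./refinery-rootfs
docker rm refinery-peek
find ./refinery-rootfs -maxdepth 2 -type f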

Building the thick image

Once you’ve created your Dockerfile, create the image and tag it by running one of these commands:

docker build -f Dockerfile.ubuntu -t otelcolcontrib:thick .
docker build -f Dockerfile.ubuntu -t refinery:thick .

If you want to use a different, thicker image, change the FROM lines. Your options are centos, arch, debian, busybox, and plenty of others.

You can test all this locally by creating configuration files and starting the containers with docker run:

docker run -it -v $PWD/otel-collector-defaults.yaml:/etc/otel/config.yaml --entrypoint /otelcol-contrib otelcolcontrib:thick --config /etc/otel/config.yaml
docker run -it -v $(pwd)/refinery-config.yaml:/etc/refinery/config.yaml -v $(pwd)/refinery-rules.yaml:/etc/refinery/rules.yaml refinery:thick
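Because the thick image has a real userland, you can also confirm locally that a shell now works:

docker run -it --entrypoint /bin/bash otelcolcontrib:thick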

The real use is to push the image to a container image registry, then reference it in your Kubernetes deployment.

docker tag otelcolcontrib:thick my-local-registry/otel/opentelemetry-collector-contrib:thick
docker push my-local-registry/otel/opentelemetry-collector-contrib:thick

Helm values

Make a branch in your infrastructure-as-code repo that can be discarded once troubleshooting is done. If these changes can only reach the cluster by merging to the main branch, create a Git tag first so you have a reference point for where the configuration diverged for troubleshooting.
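For example (the branch and tag names are just suggestions):

git tag config-before-troubleshooting           # reference point before any troubleshooting changes
git switch -c troubleshooting/thick-collector   # throwaway branch for the image and command changes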

Then, you can make adjustments to the OpenTelemetry Collector values.yaml file, such as:

image:
  repository: my-local-registry/otel/opentelemetry-collector-contrib
  pullPolicy: Always
  tag: "thick"
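Apply the change with a Helm upgrade against your release (the release name, chart, namespace, and values file here are assumptions; use whatever your deployment already uses):

helm upgrade opentelemetry-collector open-telemetry/opentelemetry-collector -n otel -f values.yaml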

If the new image won’t start because of an invalid configuration and the container immediately exits, you won’t be able to shell in. To fix this, you can override the command:

command:
  name: sleep
  extraArgs: 
    - 30m
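Once the pod is up (even if it’s only sleeping), the exec from the beginning of this post finally works, and you can run the collector by hand to watch its output directly. Your pod name will differ, and the config path depends on how your chart mounts it:

kubectl exec -it -n otel opentelemetry-collector-56469989d-q74cg -- /bin/bash
# then, inside the container:
/otelcol-contrib --config /etc/otel/config.yaml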

For Refinery, the same image block can be used, but with a reference to the Refinery image rather than the Collector in the example. The chart doesn’t expose the command as a value, so you’ll need to run kubectl edit deploy -n refinery refinery and change the command to sleep, with the arguments updated to 30m.
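After the edit, the relevant part of the container spec would end up looking something like this (the image reference is an example):

containers:
  - name: refinery
    image: my-local-registry/honeycombio/refinery:thick
    command: ["sleep"]
    args: ["30m"]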

If you’re using your own image repository and a tag other than latest, be sure to set the pull policy to Always (pullPolicy in the chart values, imagePullPolicy in a raw pod spec) so it will pick up any changes made to the image as you add more tools or change your image.

After figuring out the issue and fixing it, either set the image back, revert to the tag (but preserve the fix), or discard the branch and fix main.

For future visitors

Check the kubectl debug documentation to see if it’s become more capable. As of 1.27, it can get close to the capabilities of a sidecar, but you have to manually patch in the volumes, and it’s not as clean and predictable as opening a shell in the same pod.

As Kubernetes gets better, the tools to support it also get better. In fact, you can use Honeycomb to keep an eye on Kubernetes. See how we handle crashlooping pods with our own product.

Mike Terhar

Senior Customer Architect

Mike enjoys solving problems and has made a career chasing that reward trigger. By being a generalist and working with many industries, Mike brings empathy for an organization’s struggles and blends their old ways with newer, better ways of working. By focusing on learned helplessness and reinterpretation of policies from first principles, legacy companies can take huge strides forward. Outside of work, funny things are the best things. Mike is always looking for shows, movies, books, plays, and pet and kid activities that bring amusement.
