With more and more people adopting OpenTelemetry, and specifically the tracing signal, I’ve seen an uptick in people wanting to add the entire request and response body as an attribute. This isn’t ideal, just as it wasn’t when people were logging the body in text logs. In this blog post, I’ll explain why this is a bad idea, what the pitfalls are, and, more importantly, what you should do instead.
Choosing the right signal for the job
The first thing I want to address is that traces (or spans, as the data object) don’t have to be the only signal you use when understanding your system. Telemetry signals are designed (in OpenTelemetry) to be correlated; your job is to choose the right signal for the job. For example:
- If it’s about understanding the system, traces/spans are likely the best signal since they have more context that’s then queryable and graphable.
- If you know what visualization you want to create ahead of time (like a count of requests by HTTP route template), then metrics might be a better option; see the sketch just after this list.
- If you have no calling context, wide structured logs would be better.
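To make that metrics point concrete, here’s a minimal sketch using the OpenTelemetry API from Kotlin. The meter name, counter name, and the /users/{id} route template are illustrative assumptions, not anything prescribed by OpenTelemetry or this post.

```kotlin
import io.opentelemetry.api.GlobalOpenTelemetry
import io.opentelemetry.api.common.AttributeKey
import io.opentelemetry.api.common.Attributes

// Hypothetical instrumentation scope name; any stable name works.
val meter = GlobalOpenTelemetry.getMeter("example.http.server")

// A counter keyed by the low-cardinality route template you know you want to graph.
val requestCounter = meter.counterBuilder("example.http.server.requests").build()

fun recordRequest(routeTemplate: String) {
    // e.g. routeTemplate = "/users/{id}" rather than the concrete URL.
    requestCounter.add(1L, Attributes.of(AttributeKey.stringKey("http.route"), routeTemplate))
}
```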
However, all of these signals have something called resource context, which allows you to correlate the different signals around the thing that served them. Additionally, for logs and traces, there’s a further correlation you can use in traceId and spanId.
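As a rough illustration of that log-to-trace correlation (many OpenTelemetry logging integrations do this for you automatically), here’s a hand-rolled sketch in Kotlin using the OpenTelemetry API and SLF4J’s MDC. The trace_id and span_id key names and the checkout logger are just conventions I’ve picked for the example.

```kotlin
import io.opentelemetry.api.trace.Span
import org.slf4j.LoggerFactory
import org.slf4j.MDC

private val logger = LoggerFactory.getLogger("checkout")

fun logWithTraceContext(message: String) {
    val spanContext = Span.current().spanContext
    if (spanContext.isValid) {
        // Stamp the active trace/span IDs onto the log record so the backend
        // can join this log line to the span that emitted it.
        MDC.put("trace_id", spanContext.traceId)
        MDC.put("span_id", spanContext.spanId)
    }
    try {
        logger.info(message)
    } finally {
        MDC.remove("trace_id")
        MDC.remove("span_id")
    }
}
```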
There is one big advantage of storing all your data in a single signal type: arbitrary investigation—or, put simply, debugging production. When all your data is in a single signal and that signal is a single event, you can do way more interesting investigations to find correlations and causations in order to ultimately find anomalies.
We’ll keep that in mind as we talk about intentionality in building our software.
In my experience, logging the request body is basically a catch-all. It’s a way to access the request data without doing any work in the code to understand what the system is actually working with. It’s a way of saying, “Look, you have the data, go work it out.”
Why is it bad?
In my opinion, there are three reasons this is a bad idea. I’ll caveat this with the fact that it’s possible, just mostly inefficient.
More data than you need
You’re likely sending far more data than you actually need. Allocating that data in memory twice (once for the application and once to send it onwards) is inefficient from both an overall memory perspective and from a response time/latency perspective. This could be seen as necessary overhead, but there are better ways to achieve that goal.
Sending personal data
You’ll likely end up sending data that shouldn’t be persisted anywhere. The main example of this is Personally Identifiable Information (PII). As soon as you start adding the full request (or response) body, you have no ability to control whether your observability backend is in scope for GDPR, CPRA, etc. Worse still, you could reveal and store sensitive data such as plaintext passwords from form POSTs, or financial data. Even if you limit this to certain inputs, you’re one misconfiguration away from storing all of that data.
You may come to the conclusion that the answer is to restrict access to that observability backend. However, that simply creates a bottleneck for resolving issues, since only a few users will now have access. You’ll also still have the issue that your observability backend is now in scope for things like PCI DSS if you take payments.
It’s always about money
Think about the cost multiplier Charity spoke about in her cost crisis piece. This comes in multiple forms, from the cost of storage in your observability backend to egress costs—and a potential increase in compute costs caused by the issues outlined in the first point.
So, what should we do?
Observability is about understanding and answering questions about how your production system is functioning. The important thing to consider is who can ask those questions, from both an access side (do they even have access to the telemetry data?) and an ability side. Labeling the data well, and working on the understanding that the whole team has of it, is crucial.
So what does that mean in practice? It means intentionally looking at your code and adding attributes with explicit names when they’re useful and risk-free.
Add extension methods or helper classes that take your request and response body and explicitly extract just the data that’s important, giving each piece a name. Writing code at this level lets you hand the knowledge you have of the system at the time you’re writing it to your future debugging self. You can provide everything from consistent naming to filtering the available data to ensure you’re not exposing too much. All of this will help you (and anyone else) who needs to debug the system in the future.
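Here’s a rough sketch of what such a helper can look like in Kotlin. The CheckoutRequest type, the app.checkout.* attribute names, and the fields extracted are all hypothetical, chosen only to show the shape of the idea: an explicit, reviewable allowlist of attributes rather than whatever happens to be in the body.

```kotlin
import io.opentelemetry.api.trace.Span

// Hypothetical request body type for illustration.
data class CheckoutRequest(
    val items: List<String>,
    val currency: String,
    val couponCode: String?,
    val cardNumber: String, // deliberately never extracted into telemetry
)

// Extension method that pulls out only the fields that are useful and safe,
// under explicit, consistent attribute names.
fun Span.setCheckoutAttributes(request: CheckoutRequest) {
    setAttribute("app.checkout.item_count", request.items.size.toLong())
    setAttribute("app.checkout.currency", request.currency)
    setAttribute("app.checkout.has_coupon", request.couponCode != null)
}

fun handleCheckout(request: CheckoutRequest) {
    // At the point where you already understand the data, record just what matters.
    Span.current().setCheckoutAttributes(request)
    // ... process the checkout ...
}
```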