Dynamic Sampling by Example

[Figure: query count and sampling chance plotted]


Putting it all together: head and tail per-key target rate sampling

If you want head sampling to automatically instrument everything downstream, make sure you pass the head sampling decision and the corresponding rate from parent to child span (e.g. via an HTTP header), forcing sampling even when dynamic sampling in the child's own context would not have chosen to instrument the request. A sketch of the propagating side follows the handler below.

var headCounts, tailCounts map[interface{}]int
var headSampleRates, tailSampleRates map[interface{}]float64

// Boilerplate main() and goroutine init to overwrite the maps and roll them over
// every interval goes here, along with checkHeadSampleRate(), checkTailSampleRate(),
// floatFromHexBytes(), etc. from above.

func handler(resp http.ResponseWriter, req *http.Request) {
	// Reuse the upstream sampling ID if one was propagated, so every tier makes
	// the same random choice; otherwise roll our own.
	r, err := floatFromHexBytes(req.Header.Get("Sampling-ID"))
	if err != nil {
		r = rand.Float64()
	}

	headSampleRate := -1.0
	// Check if we have a valid upstream sample rate (at least 1); if so, honor it.
	if upstreamSampleRate, err := floatFromHexBytes(req.Header.Get("Upstream-Sample-Rate")); err == nil && upstreamSampleRate >= 1.0 {
		headSampleRate = upstreamSampleRate
	} else {
		headSampleRate = checkHeadSampleRate(req, headSampleRates, headCounts)
		if headSampleRate > 0 && r < 1.0/headSampleRate {
			// We'll sample this when recording the event below; propagate the decision downstream though.
		} else {
			// Clear out headSampleRate, as this event didn't qualify for head sampling.
			headSampleRate = -1.0
		}
	}

	start := time.Now()
	i, err := callAnotherService(r, headSampleRate)
	resp.Write(i)

	if headSampleRate > 0 {
		RecordEvent(req, headSampleRate, start, err)
	} else {
		// Same as for head sampling, except here we make a tail sampling decision
		// that we can't propagate downstream.
		tailSampleRate := checkTailSampleRate(resp, start, err, tailSampleRates, tailCounts)
		if tailSampleRate > 0 && r < 1.0/tailSampleRate {
			RecordEvent(req, tailSampleRate, start, err)
		}
	}
}
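
The callAnotherService() used above is where the propagation happens. Here's a minimal sketch of what it might look like, assuming a floatToHexBytes() helper that is the hex-encoding counterpart of floatFromHexBytes(), and a placeholder downstream URL:

// Forward the sampling ID and, when head sampling chose this request, the
// sample rate, so the downstream handler can honor the same decision.
func callAnotherService(r float64, headSampleRate float64) ([]byte, error) {
	req, err := http.NewRequest("GET", "http://downstream.internal/work", nil)
	if err != nil {
		return nil, err
	}
	// Always pass the sampling ID so every tier rolls the same dice.
	req.Header.Set("Sampling-ID", floatToHexBytes(r))
	// Only pass a rate if head sampling chose this request; that forces the
	// downstream handler to record its span at the same rate.
	if headSampleRate > 0 {
		req.Header.Set("Upstream-Sample-Rate", floatToHexBytes(headSampleRate))
	}
	res, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer res.Body.Close()
	return ioutil.ReadAll(res.Body)
}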

That was complicated, but it's extremely powerful for capturing all the context we need to effectively debug our modern, high-throughput systems. There are even more interesting ways to combine head- and tail-based trace sampling, such as temporarily increasing the probability of head sampling for a request's head sampling key when a tail heuristic sees an error in the response.
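
One hypothetical way to wire that up: a small helper that cuts a key's target rate in the shared headSampleRates map whenever the tail path sees a 5xx, making that key more likely to be head sampled until the maps roll over at the next interval (real code would need the same locking the rollover goroutine uses):

func boostHeadSamplingOnError(key interface{}, status int) {
	if status < 500 {
		return
	}
	if rate, ok := headSampleRates[key]; ok && rate > 2 {
		// Halving the target rate doubles the chance of head sampling this key
		// until the maps are overwritten at the next interval.
		headSampleRates[key] = rate / 2
	} else {
		// Keep everything for this key for now.
		headSampleRates[key] = 1
	}
}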

And, of course, collector-side buffered sampling allows deferring sampling decisions until after an entire trace has been buffered, bringing the advantages of head sampling to properties known at the tail.
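
As a simplified illustration of the idea (not how Honeycomb's feature is implemented), a collector could buffer spans by trace ID and only apply a tail rule such as "always keep traces containing an error" once the whole trace has arrived; RecordSampledSpan() here is a hypothetical emit function:

type Span struct {
	TraceID string
	IsRoot  bool
	Error   bool
}

var buffered = map[string][]Span{}

func collect(s Span) {
	buffered[s.TraceID] = append(buffered[s.TraceID], s)
	// In practice a timer would also flush traces whose root span never arrives.
	if s.IsRoot {
		decide(s.TraceID)
	}
}

func decide(traceID string) {
	spans := buffered[traceID]
	delete(buffered, traceID)
	keep := false
	for _, s := range spans {
		if s.Error {
			// Tail heuristic: always keep traces containing an error.
			keep = true
		}
	}
	if keep {
		for _, s := range spans {
			RecordSampledSpan(s, 1) // kept deterministically
		}
	} else if rand.Float64() < 0.01 {
		for _, s := range spans {
			RecordSampledSpan(s, 100) // kept at a baseline 1-in-100
		}
	}
}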

Conclusion

Hopefully this practical, iterative set of code examples has inspired you to get started with dynamic sampling in your own code. And if you're interested in overcoming the limitation of per-process sampling decisions and making tail-based sampling decisions based on buffered execution traces, Honeycomb has an upcoming buffered sampling feature. Email solutions@honeycomb.io to request early access.

For more information, read the Honeycomb documentation on sampling, or look at our sample code in Go or JavaScript, and Travis-CI’s Ruby port! Our friends at Cribl have also written a post on dynamic sampling of log data, with no new code needed! Write to me at lizf@honeycomb.io if you have comments or questions!

Liz Fong-Jones

Field CTO

Liz is a developer advocate, labor and ethics organizer, and Site Reliability Engineer (SRE) with over two decades of experience. She is currently the Field CTO at Honeycomb, and previously was an SRE working on products ranging from the Google Cloud Load Balancer to Google Flights.
