
Establish pattern for before-the-fact, trace-scoped sampling

Open codefromthecrypt opened this issue 8 years ago • 2 comments

In B3 (usually Zipkin), sample-once, before-the-fact tracing is the status quo. It involves three possible decisions:

  • a yes decision: ensures you get the full trace always
  • a no decision: often used for capacity, but sometimes for policy like "don't trace /health"
  • a deferred (null) decision: often used when IDs are pre-provisioned, implies the caller didn't export data yet
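
As a rough sketch of the three states above: B3 carries the decision in the `X-B3-Sampled` header, where `1` means yes, `0` means no, and an absent header means deferred. The function names, the `local_rate` parameter, and the probabilistic fallback below are assumptions for illustration, not any library's API.

```python
import random
from typing import Callable, Mapping, Optional


def upstream_decision(headers: Mapping[str, str]) -> Optional[bool]:
    """Return True (yes), False (no), or None (deferred) from B3 headers."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("x-b3-sampled")
    if value is None:
        return None  # deferred: the caller made no decision
    # Some older instrumentation sent "true"/"false" instead of "1"/"0".
    return value in ("1", "true")


def resolve(
    headers: Mapping[str, str],
    local_rate: float,
    rng: Callable[[], float] = random.random,
) -> bool:
    """Trust the upstream decision; decide locally only when it is deferred."""
    decision = upstream_decision(headers)
    if decision is None:
        # Hypothetical fallback: a simple probabilistic local decision.
        return rng() < local_rate
    return decision
```

The point of the sketch is that a yes or no is passed through untouched, and only the deferred case triggers a local decision, which is then made once for the whole trace.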

There are cases where trace-tier decisions aren't great, and alternatives are being explored:

  • a proxy traces everything, either by accident or because it has no other option, and you want to re-evaluate the decision
  • a component like SQL has a bug and creates 1000 spans, of which you'd like to drop 900
  • a service just doesn't want to be traced (never asked this one personally)

Span-scoped decisions can cause problems, as they can create possibly unresolvable gaps in a trace if a yes that became a no later turns back into a yes.
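
To make the gap concrete, here is a minimal sketch of a three-span chain where the middle span is dropped. The `Span` type and the `orphans` helper are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    reported: bool  # the span-scoped decision for this span


def orphans(spans: List[Span]) -> List[str]:
    """IDs of reported spans whose parent was never reported."""
    reported_ids = {s.span_id for s in spans if s.reported}
    return [
        s.span_id
        for s in spans
        if s.reported
        and s.parent_id is not None
        and s.parent_id not in reported_ids
    ]


# yes -> no -> yes: "c" is reported, but its parent "b" is not.
chain = [
    Span("a", None, True),
    Span("b", "a", False),
    Span("c", "b", True),
]
print(orphans(chain))  # ['c']
```

Span "c" arrives at the backend pointing at a parent that never shows up, so nothing in the reported data says where it attaches relative to "a".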

IOW, I can see cases for both trace-tier and span-tier decisions. However, brownfield really relies on trace-tier (trusting a decision downstream), so it would be nice to figure out a way libraries can facilitate this generically and safely.

codefromthecrypt, Sep 27 '17 11:09

FYI, Amazon calls trusting upstream "pass-through": http://docs.aws.amazon.com/lambda/latest/dg/lambda-x-ray.html This is also the same default behavior as Spring Cloud Sleuth.

codefromthecrypt, Sep 28 '17 06:09

If we have a good story for rate-limiting (per #9) then we should turn it on by default IMO. Otherwise you're opening yourself up to a DoS unless you have DoS protection that understands tracing (not guaranteed).
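
For illustration only, one way a rate limit could cap accepted traces is a token bucket. This is a sketch under assumed names (`RateLimitedSampler`, `traces_per_second`), not the design discussed in #9.

```python
import time
from typing import Callable


class RateLimitedSampler:
    """Token bucket: admits at most roughly `traces_per_second` new traces."""

    def __init__(
        self,
        traces_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = float(traces_per_second)
        # Capacity of at least one token so rates below 1/s can still sample.
        self.capacity = max(1.0, self.rate)
        self.tokens = self.capacity
        self.clock = clock
        self.last = clock()

    def should_sample(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With something like this in front of an upstream yes, a flood of requests carrying a sampled flag is capped at the configured rate instead of being recorded in full.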

semistrict, Jan 10 '18 01:01