opentelemetry-rust
Infinite trace generator
When I use the opentelemetry_datadog library with reqwest as the client, each batch of spans will generate an additional 2 spans. These spans look like this:
[[{'service': 'test', 'name': 'opentelemetry-datadog', 'resource': 'encode_headers', 'trace_id': 9313706354471130378, 'span_id': 2181458161362850175, 'parent_id': 0, 'start': 1647298645595811251, 'duration': 156890, 'error': 0, 'meta': {'busy_ns': '66163', 'code.namespace': 'hyper::proto::h1::role', 'code.filepath': '/home/hamiltop/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.17/src/proto/h1/role.rs', 'code.lineno': '1061', 'idle_ns': '90678'}}], [{'service': 'test', 'name': 'opentelemetry-datadog', 'resource': 'parse_headers', 'trace_id': 11785853278508097401, 'span_id': 9996945950193117364, 'parent_id': 0, 'start': 1647298645596617264, 'duration': 159966, 'error': 0, 'meta': {'code.namespace': 'hyper::proto::h1::role', 'code.lineno': '1183', 'busy_ns': '119852', 'code.filepath': '/home/hamiltop/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.17/src/proto/h1/role.rs', 'idle_ns': '36137'}}]]
I believe these come from the HTTP request made to the Datadog agent to export the original spans. The request that sends these 2 spans then generates 2 new spans, which generate 2 new spans, and so on.
To reproduce:
- Run this script as a mock Datadog agent: https://gist.github.com/kongo2002/a4d560dca89e24bf27c8d59dc63f4b1e It echoes all incoming traces to stdout for easy inspection.
- Run this code:
use opentelemetry_datadog::{new_pipeline, ApiVersion};
use tracing::trace_span;
use tracing_subscriber::layer::SubscriberExt;

#[tokio::main]
async fn main() {
    // Build the Datadog exporter pipeline and install it as a batch exporter.
    let tracer = new_pipeline()
        .with_agent_endpoint("http://localhost:8126")
        .with_service_name("test")
        .with_version(ApiVersion::Version03)
        .install_batch(opentelemetry::runtime::TokioCurrentThread)
        .unwrap();

    // Bridge `tracing` spans into OpenTelemetry via tracing-opentelemetry.
    let telemetry = tracing_opentelemetry::layer().with_tracer(tracer);
    let subscriber = tracing_subscriber::Registry::default().with(telemetry);
    tracing::subscriber::set_global_default(subscriber)
        .expect("Failed to install `tracing` subscriber.");

    // Create a single span; the batch exporter will later ship it to the agent.
    trace_span!("test span").in_scope(|| {
        println!("Hello, world!");
    });

    // Give the batch exporter time to run, then flush and shut down.
    tokio::time::sleep(tokio::time::Duration::from_secs(60)).await;
    opentelemetry::global::shutdown_tracer_provider();
}
- Watch the output of the mock agent script from step 1 to see the encode_headers/parse_headers traces being generated over and over again.
I would expect there to be some way to disable tracing of the tracer itself. Is this a bug? Do I just need to configure trace levels more precisely?
I think the answer here is that you're seeing a lot of spam in the output because you haven't included any filtering in the registry stack. The default behavior is to include everything, even items that are tagged with trace (here meaning the logging level that's one step more verbose than debug, not to be confused with service traces).
In this case the hyper crate being used for transport itself uses the tracing facade and can produce events. This might be equivalent to configuring Java logging for the Apache HTTP package to spit out logs for all headers.
A typical application will have some filtering mechanism baked into the registry stack and would normally pick INFO or WARN as the default logging level rather than TRACE, only increasing verbosity for specific crates as needed. See below for an example that respects the RUST_LOG environment variable (the convention used by the log/env_logger ecosystem), or, if it is not set, defaults to excluding all output below the INFO level:
let subscriber = tracing_subscriber::Registry::default()
    .with(telemetry)
    // Requires the "env-filter" feature of tracing-subscriber.
    .with(
        tracing_subscriber::EnvFilter::try_from_default_env()
            .unwrap_or_else(|_| tracing_subscriber::EnvFilter::new("info")),
    );
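For this specific issue, a coarser default can be combined with per-crate directives so that only the HTTP client internals used by the exporter are silenced. A minimal sketch, assuming tracing-subscriber 0.3 with the env-filter feature; the directive string is illustrative and the `telemetry` variable is the layer built earlier:

use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::EnvFilter;

// Keep application spans at TRACE, but silence the crates used by the exporter's
// HTTP transport so their spans are never fed back into the export pipeline.
let filter = EnvFilter::new("trace,hyper=off,h2=off,reqwest=off");

let subscriber = tracing_subscriber::Registry::default()
    .with(telemetry)
    .with(filter);

The same directive string can also be supplied at runtime via the RUST_LOG environment variable instead of being hard-coded.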
Just found this issue after I opened https://github.com/open-telemetry/opentelemetry-rust/issues/1171. The same solution should work for both, I think.
Sounds like this might be related to https://github.com/open-telemetry/opentelemetry-rust/issues/473
There was an attempt to fix this as part of #1330, if someone wants to look into it further. Otherwise, I will get back to this issue shortly. The approach there was to suppress the logs emitted in the context of the (async) task doing the export.
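To illustrate the suppression idea (this is not the actual #1330 implementation, just a hypothetical sketch): the exporter marks itself as "exporting" before making its HTTP call, and anything recorded while that mark is set is dropped instead of being turned into new spans.

use std::cell::Cell;

thread_local! {
    // Hypothetical suppression flag; the real fix would live inside the SDK.
    static SUPPRESS_TELEMETRY: Cell<bool> = Cell::new(false);
}

/// RAII guard that marks the current thread as "exporting", so spans/events
/// produced by the export's own HTTP client can be ignored.
struct SuppressGuard;

impl SuppressGuard {
    fn new() -> Self {
        SUPPRESS_TELEMETRY.with(|f| f.set(true));
        SuppressGuard
    }
}

impl Drop for SuppressGuard {
    fn drop(&mut self) {
        SUPPRESS_TELEMETRY.with(|f| f.set(false));
    }
}

fn is_suppressed() -> bool {
    SUPPRESS_TELEMETRY.with(|f| f.get())
}

// The exporter would wrap its HTTP call roughly like:
//     let _guard = SuppressGuard::new();
//     send_batch_over_http(batch);
// and the span processor would check `is_suppressed()` before recording.

Note that a plain thread-local is only illustrative: since the export runs on an async runtime and may move between threads, a real implementation needs a task-scoped (or context-propagated) mechanism, which is what makes this tricky.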
@jherico, filtering is one way (in fact, it's what I'm having to do right now, e.g. h2=off, to stop this problem occurring), but that means I miss out on h2 traces from my own application.
Hopefully something like #1330 can fix this properly?
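For anyone hitting the same trade-off before a proper fix lands: one workaround is to filter only the OpenTelemetry layer rather than the whole registry, so a console/fmt layer still sees hyper/h2 output while the exporter never re-ingests it. A rough sketch, assuming tracing-subscriber 0.3 with the registry and fmt features, and reusing the `tracer` from the earlier snippet:

use tracing_subscriber::filter::{LevelFilter, Targets};
use tracing_subscriber::layer::{Layer, SubscriberExt};

// Drop hyper/h2 spans only for the export layer, so they never reach Datadog
// (and can't feed back into the exporter), while keeping them everywhere else.
let otel_only_filter = Targets::new()
    .with_default(LevelFilter::TRACE)
    .with_target("hyper", LevelFilter::OFF)
    .with_target("h2", LevelFilter::OFF);

let telemetry = tracing_opentelemetry::layer()
    .with_tracer(tracer)
    .with_filter(otel_only_filter);

let subscriber = tracing_subscriber::Registry::default()
    .with(telemetry)
    // The fmt layer is unfiltered, so local logs still include hyper/h2.
    .with(tracing_subscriber::fmt::layer());

tracing::subscriber::set_global_default(subscriber)
    .expect("Failed to install `tracing` subscriber.");

This only sidesteps the feedback loop from the application side; a fix inside the SDK along the lines of #1330 would still be the proper solution.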