opentelemetry-rust icon indicating copy to clipboard operation
opentelemetry-rust copied to clipboard

Tracing is not working for google stackdriver

Open cqx820 opened this issue 4 years ago • 24 comments

Hi,

I'm trying to send trace data to Google Cloud Trace, I tried a python script with a credential file and it's working for me, but I'm unable to do the same with rust. The Cargo.toml looks like

opentelemetry = { version = "0.16.0", default_features = false, features = ["rt-tokio", "rt-async-std", "rt-tokio-current-thread", "trace", "metrics"] }
opentelemetry-stackdriver = { version = "0.12.0", default_features = false, features = ["gcp_auth", "yup-authorizer"] }
tracing = "0.1.28"
tracing-futures = "0.2.5"
tracing-opentelemetry = "0.15.0"
tracing-subscriber = "0.2.24"

My workflow is like this:

  1. Get exporter
let handle = tokio::runtime::Handle::current();
let spawner = tokio_adapter::TokioSpawner::new(handle);
let exporter = StackDriverExporter::connect(
    GcpAuthorizer::new().await.unwrap(), &spawner, None, None).await.unwrap();  // Tried YupAuthorizer as well, didn't work too.
  1. Get tracer
let tracer = sdk::trace::TracerProvider::builder()
      .with_simple_exporter(exporter)
      .build()
      .tracer("tracer_name", Some(VERSION));
  1. Get subscriber
let subscriber = tracing_subscriber::registry().with(tracing_opentelemetry::layer().with_tracer(tracer));
  1. Set global default
tracing::subscriber::set_global_default(subscriber);
  1. Call tracing::span!(tracing::Level::TRACE, span_name, ...)

Above is my entire workflow, after running I can't find any traces on Google Cloud Trace, I'm sure the cred file has enough permission. I've also tried raising the trace levels but no lucks. There were no runtime errors found. I wonder if I'm doing wrong with getting the exporter and the tracer.

I also wrote some sample codes below just for testing, but they didn't work for me too

let exporter = StackDriverExporter::connect(
    GcpAuthorizer::new().await.unwrap(),
    &spawner, Some(Duration::from_secs(5)), Some(5usize),
  ).await.unwrap();

let tracer = sdk::trace::TracerProvider::builder()
      .with_simple_exporter(exporter)
      .build()
      .tracer("tracer_name", Some(VERSION));

let attributes =
      vec![KeyValue::new("version", "0.0.0")];

let _span = tracer
        .span_builder("my_test_span")
        .with_attributes(attributes)
        .start(&tracer);

tracer.with_span(_span, |cx| {
      sleep(Duration::from_millis(500));
    });

Any suggestions would be appreciated, thank you very much!!

cqx820 avatar Sep 22 '21 23:09 cqx820

Could you try adding opentelemetry::global::shutdown_tracer_provider() at the end of your main function?

TommyCpp avatar Sep 23 '21 01:09 TommyCpp

Could you try adding opentelemetry::global::shutdown_tracer_provider() at the end of your main function?

Thanks for your reply sir, I tried your suggestion but unfortunately it didn't work, I was still not seeing anything on Google Trace. I also found another issue about the YupAuthorizer, I provided a path for persistent_token_file, but I found it's not persisting the token to disk too. I was wondering if the authorization failed, but I didn't see any panic so I was assuming the auth worked.

cqx820 avatar Sep 23 '21 02:09 cqx820

We set it up like this:

        let authorizer = GcpAuthorizer::new().await.unwrap();
        let spawner = TokioSpawner::new(Handle::current());
        let exporter = StackDriverExporter::connect(authorizer, &spawner, None, None)
            .await
            .unwrap();
        let provider = TracerProvider::builder()
            .with_batch_exporter(exporter, Tokio)
            .with_config(Config {
                sampler: Box::new(Sampler::TraceIdRatioBased(1.0)),
                ..Default::default()
            })
            .build();

        tracing_subscriber::registry()
            .with(tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing", None)))
            .try_init()
            .unwrap();

This has been proven to work quite recently.

djc avatar Sep 23 '21 13:09 djc

We set it up like this:

        let authorizer = GcpAuthorizer::new().await.unwrap();
        let spawner = TokioSpawner::new(Handle::current());
        let exporter = StackDriverExporter::connect(authorizer, &spawner, None, None)
            .await
            .unwrap();
        let provider = TracerProvider::builder()
            .with_batch_exporter(exporter, Tokio)
            .with_config(Config {
                sampler: Box::new(Sampler::TraceIdRatioBased(1.0)),
                ..Default::default()
            })
            .build();

        tracing_subscriber::registry()
            .with(tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing", None)))
            .try_init()
            .unwrap();

This has been proven to work quite recently.

Thank you very much for your reply, I really appreciate your help. I tried your setup but I was still not able to see any traces. I'm sure the environment variable was properly set and the credential file was valid too. What did I miss?


#[tokio::main]
async fn main() {
    let name = "GOOGLE_APPLICATION_CREDENTIALS";
    match env::var(name) {
      Ok(v) => println!("{}: {}", name, v),
      Err(e) => panic!("${} is not set ({})", name, e)
    }
    init_tracing().await.unwrap();
    let span = tracing::trace_span!("my_span_test", version="0.0.0", uuid="0000");
    span.in_scope(|| {
      println!("Hello, world!");
      sleep(Duration::from_millis(500));
    });
    opentelemetry::global::shutdown_tracer_provider();
}

async fn init_tracing() -> Result<(), anyhow::Error> {
  let authorizer = GcpAuthorizer::new().await.unwrap();
  let spawner = TokioSpawner::new(tokio::runtime::Handle::current());
  let exporter = StackDriverExporter::connect(
    authorizer, &spawner, None, None,
  ).await.unwrap();
  let provider = sdk::trace::TracerProvider::builder()
    .with_batch_exporter(exporter, opentelemetry::runtime::Tokio)
    .with_config(Config {
      sampler: Box::new(Sampler::TraceIdRatioBased(1.0)),
      ..Default::default()
    })
    .build();
  tracing_subscriber::registry()
    .with(tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing", None))).try_init()
    .unwrap();
  Ok(())
}

cqx820 avatar Sep 23 '21 15:09 cqx820

There are a few log::error() calls in the exporter. Maybe set up a simple logger (env_logger usually suffices) and see if any of the existing log calls trigger?

djc avatar Sep 29 '21 06:09 djc

I am having the same problem. No traces show up on the Google Cloud Trace panel. My init_telemetry function looks like this

async fn init_telemetry() {
    let app_name = "app";
    // Spans are exported in batch - recommended setup for a production application.
    global::set_text_map_propagator(TraceContextPropagator::new());
    let spawner = TokioSpawner::new(tokio::runtime::Handle::current());
    let authorizer = opentelemetry_stackdriver::GcpAuthorizer::new()
        .await
        .unwrap();
    let exporter = opentelemetry_stackdriver::StackDriverExporter::connect(
        authorizer,
        &spawner,
        None,
        Some(4),
    )
    .await
    .unwrap();
    let provider = TProvider::builder()
        .with_batch_exporter(exporter, opentelemetry::runtime::Tokio)
        .with_config(Config {
            sampler: Box::new(Sampler::TraceIdRatioBased(1.0)),
            ..Default::default()
        })
        .build();

    // Filter based on level - trace, debug, info, warn, error
    // Tunable via `RUST_LOG` env variable
    let env_filter = EnvFilter::try_from_default_env().unwrap_or(EnvFilter::new("debug"));
    // Create a `tracing` layer using the Jaeger tracer
    // let telemetry = tracing_opentelemetry::layer().with_tracer(tracer);
    // let telemetry = tracing_opentelemetry::layer().with_tracer(tracer);
    let telemetry = tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing", None));
    // Create a `tracing` layer to emit spans as structured logs to stdout
    let formatting_layer = BunyanFormattingLayer::new(app_name.into(), std::io::stdout);
    // Combined them all together in a `tracing` subscriber
    tracing_subscriber::registry()
        .with(env_filter)
        .with(telemetry)
        .with(JsonStorageLayer)
        .with(formatting_layer)
        .try_init()
        .unwrap();
    // tracing::subscriber::set_global_default(subscriber)
    //     .expect("Failed to install `tracing` subscriber.")
}

the same setup works with jaeger. I am suspecting of my TokioSpawner struct which looks like this

struct TokioSpawner {
    handle: tokio::runtime::Handle,
}

impl TokioSpawner {
    fn new(handle: tokio::runtime::Handle) -> Self {
        Self { handle }
    }
}

impl Spawn for TokioSpawner {
    fn spawn_obj(&self, future: FutureObj<'static, ()>) -> std::result::Result<(), SpawnError> {
        self.handle.spawn(future);

        Ok(())
    }

    fn status(&self) -> std::result::Result<(), SpawnError> {
        Ok(())
    }
}

dvtkrlbs avatar Jan 18 '22 18:01 dvtkrlbs

You don't need to define the TokioSpawner -- opentelemetry-stackdriver provides it, gated by an optional feature.

Did you already try my hint about enabling logging? You'll want to set the tracing-subscriber features to default-features = false to make sure it doesn't catch logging stuff.

You'll also need to give the associated service account permissions to write traces. Anyway, I've been testing opentelementry-stackdriver myself for the past week or so, and I can definitely get it to work.

Current test setup:

    env_logger::init();
    let authorizer = GcpAuthorizer::new().await.unwrap();
    let spawner = TokioSpawner::new(Handle::current());
    let exporter = StackDriverExporter::connect(authorizer, &spawner, None, None)
        .await
        .unwrap();
    let provider = TracerProvider::builder()
        .with_batch_exporter(exporter.clone(), Tokio)
        .with_config(Config {
            sampler: Box::new(Sampler::TraceIdRatioBased(1.0)),
            ..Default::default()
        })
        .build();

    tracing_subscriber::registry()
        .with(tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing", None)))
        .try_init()
        .unwrap();

djc avatar Jan 19 '22 08:01 djc

There is no errors printed. It just prints bunch of zeros without any other information. The service account I am using have admin access to the trace api.

dvtkrlbs avatar Jan 19 '22 15:01 dvtkrlbs

Well, I'm confident the code above works, because I've recently tested it. If it doesn't work, I suspect it's something about your environment. You could clone this project, change opentelemetry-stackdriver's dependency on opentelemetry to skip the path dependency (and fix up the occurrences of to_bytes() -> to_byte_array()), then add some dbg/println statements to see what's going on.

djc avatar Jan 19 '22 15:01 djc

Hi!

Is there any documentation/example on how to use the most recent version of this library? The above code snippets seem to be a bit outdated with the most recent code not having a TokioSpawner and the introduction of a builder that returns a tuple containing a future :)

etiescu avatar Apr 03 '22 19:04 etiescu

Here's some current code from the same test project:

    env_logger::init();

    let authentication_manager = AuthenticationManager::new().await.unwrap();
    let project_id = authentication_manager.project_id().await.unwrap();
    let log_context = LogContext {
        log_id: "cloud-trace-test".into(),
        resource: MonitoredResource::GenericNode {
            project_id,
            namespace: Some("test".to_owned()),
            location: None,
            node_id: None,
        },
    };

    let authorizer = GcpAuthorizer::new().await.unwrap();
    let (exporter, driver) = StackDriverExporter::builder()
        .log_context(log_context)
        .build(authorizer)
        .await
        .unwrap();

    tokio::spawn(driver);
    let provider = TracerProvider::builder()
        .with_batch_exporter(exporter.clone(), Tokio)
        .with_config(Config {
            sampler: Box::new(Sampler::TraceIdRatioBased(CLOUD_TRACE_RATE)),
            ..Default::default()
        })
        .build();

    tracing_subscriber::registry()
        .with(tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing")))
        .try_init()
        .unwrap();

djc avatar Apr 04 '22 11:04 djc

I'm also having this problem. My code is essentially identical to @djc's latest snippet, and I get the following output:

OpenTelemetry trace error occurred. Exporter stackdriver encountered the following error(s): authorizer error: authorizer error: Could not establish connection with OAuth server
OpenTelemetry trace error occurred. Exporter stackdriver encountered the following error(s): authorizer error: Could not establish connection with OAuth server

I've confirmed using curl manually that I can successfully fetch the bearer token from https://oauth2.googleapis.com/token so I don't think it's anything wrong with my service_account json.

richardbrodie avatar Apr 19 '22 12:04 richardbrodie

Try using gcp_auth to get a token for the correct authz scope without opentelemetry-stackdriver, does that work?

https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-stackdriver/src/lib.rs#L191

djc avatar Apr 19 '22 12:04 djc

Yep, that works fine. Using scopes=["https://www.googleapis.com/auth/trace.append", "https://www.googleapis.com/auth/logging.write"] I successfully get a Token:

Token { 
    access_token: "****", 
    expires_at: Some(OffsetDateTime { 
        utc_datetime: PrimitiveDateTime { 
            date: Date { year: 2022, ordinal: 109 }, 
            time: Time { hour: 12, minute: 43, second: 58, nanosecond: 457575700 } 
        }, 
        offset: UtcOffset { hours: 0, minutes: 0, seconds: 0 } 
    }) 
}

Assuming you meant gcp_auth::AuthenticationManager::get_token(scopes) right?

richardbrodie avatar Apr 19 '22 12:04 richardbrodie

Yes, that's what I meant.

Did you configure either of the TLS roots Cargo features for opentelemetry-stackdriver?

djc avatar Apr 19 '22 12:04 djc

I have this in my Cargo.toml opentelemetry-stackdriver = { version = "0.14.0", default_features = false, features = ["tls-native-roots", "gcp_auth"] }

richardbrodie avatar Apr 19 '22 12:04 richardbrodie

If you run with debug-level logging enabled, which kind of authentication manager is selected?

djc avatar Apr 19 '22 13:04 djc

It's tracing not logging, but I get a couple of these: 2022-04-19T13:04:26.227593Z DEBUG refresh_token{self=CustomServiceAccount { credentials: ApplicationCredentials { type: Some("service_account"), project_id: Some(...), private_key_id: Some(...), private_key: ..., client_email: ..., client_id: Some(...), auth_uri: Some("https://accounts.google.com/o/oauth2/auth"), token_uri: "https://oauth2.googleapis.com/token", auth_provider_x509_cert_url: Some("https://www.googleapis.com/oauth2/v1/certs"), client_x509_cert_url: Some("https://www.googleapis.com/robot/v1/metadata/x509/...") }, signer: Signer, tokens: RwLock { data: {}, poisoned: false, .. } } client=Client scopes=["https://www.googleapis.com/auth/trace.append", "https://www.googleapis.com/auth/logging.write"]}: gcp_auth::custom_service_account: requesting token from service account: Request { method: POST, uri: https://oauth2.googleapis.com/token, version: HTTP/1.1, headers: {"content-type": "application/x-www-form-urlencoded"}, body: Body(Full(b"grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Ajwt-bearer&assertion=...")) }

richardbrodie avatar Apr 19 '22 13:04 richardbrodie

That seems okay. I'm afraid you'll have to debug for yourself where stuff is going wrong inside opentelemetry-stackdriver. There's not a lot of code there, I'd recommend cloning a local git repo and patching your test setup to use the local version and add some debug statements there.

djc avatar Apr 19 '22 13:04 djc

Thought so, but it's good to get a pair of fresh eyes on it before I start. Thanks for the help :)

richardbrodie avatar Apr 19 '22 13:04 richardbrodie

Update: after a lot of digging and debugging I must embarrassingly report that my issue was simply that in my test program the tokio runtime was being terminated before the gcp_auth thread requesting a new token was finished. I added a 1000ms sleep just before opentelemetry::global::shutdown_tracer_provider() and it works properly now.

I now get tonic error: status: PermissionDenied, message: "The caller does not have permission" but that seems out of the scope of this crate :)

richardbrodie avatar Apr 20 '22 11:04 richardbrodie

You'll want some of these:

cloudtrace.traces.patch logging.logEntries.create monitoring.timeSeries.create

djc avatar Apr 20 '22 12:04 djc

Screenshot from 2022-09-19 21-02-47 Screenshot from 2022-09-19 21-03-30 Hi please if possible Show a working example at the moment, I tried everything that is here, I can not compile. In the last example from April 4th, it is not possible to get trace and provider Thanks.

AlexandrBurak avatar Sep 19 '22 17:09 AlexandrBurak

It sounds like you might have a version mismatch going on between some of the crates you're using.

djc avatar Sep 20 '22 09:09 djc