opentelemetry-rust icon indicating copy to clipboard operation
opentelemetry-rust copied to clipboard

[Feature]: Implement error recording on opentelemetry-appender-tracing

Open Yamakaky opened this issue 1 year ago • 6 comments
trafficstars

Related Problems?

No response

Describe the solution you'd like:

Currently, opentelemetry-appender-tracing records errors using their Debug implementation. The cause chain has to be added manually, which is a pain to do at every log point. I believe this would be done in https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-appender-tracing/src/layer.rs#L72.

Otel conventions for errors: https://opentelemetry.io/docs/specs/semconv/exceptions/exceptions-logs/ tracing-opentelemetry implementation for reference: https://github.com/tokio-rs/tracing-opentelemetry/blob/v0.1.x/src/layer.rs#L297

Considered Alternatives

No response

Additional Context

No response

Yamakaky avatar Sep 26 '24 09:09 Yamakaky

@Yamakaky That's a fair ask. Would you like to contribute?

lalitb avatar Sep 26 '24 12:09 lalitb

Sure. Some things I'm not sure about:

let error = ...;
error!(error);
error!(error, "some error happened");
error!(error, "some error happened: {}", error);
  • Which option do we recommend?
    • 1 would not have a log message, maybe have error.to_string() as default message value?
    • 3 makes it more verbose.
  • New exception.* fields are added, cf semconv. Do we keep the error field as duplicate? I'd say to remove it
  • What about if there are multiple errors fields, or if exception.* are manually added? I'd say to raise an error.

Yamakaky avatar Sep 26 '24 15:09 Yamakaky

@Yamakaky . Regarding:

let error = ...;
error!(error);
error!(error, "some error happened");
error!(error, "some error happened: {}", error);

I am confused as none of the above you shared are the valid error calls. They won't compile.

I can think of below scenarios:

error!("An error occurred: {}", e); -> Uses Display, not triggering record_error. error!(error = ?e, "An error occurred"); -> Uses Debug, not triggering record_error. error!(error = &e as &dyn Error, "An error occurred"); -> This would trigger record_error (3) error!(error = &e as &dyn Error); -> This would trigger record_error (4)

And in 4th case above, we don't need any default message. Just the error attribute is enough.

Also, we don't need to support semconv for exception for now, as they are still experimental.

lalitb avatar Sep 30 '24 20:09 lalitb

I was doing something like that:

    let e = anyhow::anyhow!("pote").context("prout");
    error!(error = e.as_ref() as &dyn std::error::Error);

It works, but its a bit verbose to have to cast at each error point (to be fair, not help by anyhow::Error not implementing Error). I believe error attributes are Stable ? https://opentelemetry.io/docs/specs/semconv/exceptions/exceptions-logs/

Yamakaky avatar Oct 01 '24 09:10 Yamakaky

I believe error attributes are Stable ?

Yes, it is but the otel semantic conventions crate will take a while to get stable, and till then we would like to avoid using it in the crates which will get stable sooner. Can we stick to implementing record_error for handling attribute values?

lalitb avatar Oct 06 '24 01:10 lalitb

@Yamakaky wondering if you have any further questions on this?

lalitb avatar Oct 22 '24 15:10 lalitb

Sorry, priorities changed at work, I haven't had the chance to look at it for now.

Yamakaky avatar Oct 31 '24 09:10 Yamakaky

OK, I believe we need to close this before stable release.

I was doing something like that:

    let e = anyhow::anyhow!("pote").context("prout");
    error!(error = e.as_ref() as &dyn std::error::Error);

It works, but its a bit verbose to have to cast at each error point (to be fair, not help by anyhow::Error not implementing Error).

Do you have any recommendation to avoid this obscure cast?

Sure. Some things I'm not sure about:

let error = ...;
error!(error);
error!(error, "some error happened");
error!(error, "some error happened: {}", error);
  • Which option do we recommend?

    • 1 would not have a log message, maybe have error.to_string() as default message value?
    • 3 makes it more verbose.
  • New exception.* fields are added, cf semconv. Do we keep the error field as duplicate? I'd say to remove it

  • What about if there are multiple errors fields, or if exception.* are manually added? I'd say to raise an error.

thinking out loud - how about ?

field_name.exception.message => value.to_string() field_name.exception.stacktrace => value_chain_str

this will allow us to handle multiple errors, as well as exception-* if any provided by user. But this then becomes too custom.

lalitb avatar Nov 20 '24 01:11 lalitb

Hum, there is two part to this, one of it in tracing itself.

Error recording

This is tracing specific, cf https://github.com/tokio-rs/tracing/issues/3067. I'm not sure what we can't do apart from doing an MR in tracing. Short term, I think I'll use something like error_as_dyn in the first message.

Note that #[instrument(err)] seems to record the error using Display and not visit_error.

Otel fields

Do we really need to support multiple errors? Maybe in a first version only one error can be recorded, and maybe a toggle in OpenTelemetryTracingBridge to enable multiple error recording.

As for the field names:

  • if only one error is supported, we can just use exception.*
  • if not, maybe prefix field_name. if field name is not error/err/e.

Field values:

  • message -> to_string() seems fine. Maybe also write it as the event message if not provided? ex in error!(error = error_as_dyn(&error)).
    • note: I don't think this is possible with the current otel api, there is no get_body/has_body.
  • for the stacktrace, maybe something like that for a start. Note that this doesn't support RUST_BACKTRACE with anyhow for example.
     fn record_error(&mut self, field: &tracing::field::Field, mut value: &(dyn std::error::Error + 'static)) {
        self.log_record
            .add_attribute(Key::new("exception.message"), AnyValue::from(format!("{value}")));
         let mut trace = format!("{value}\n");
         while let Some(e) = value.source() {
             writeln!(trace, "Caused by {e}").expect("writeln on string can't fail");
             value = e;
         }
        self.log_record
            .add_attribute(Key::new("exception.stacktrace"), AnyValue::from(trace));
     }

Yamakaky avatar Dec 03 '24 14:12 Yamakaky

@Yamakaky will you be continuing this ? Let us know if you have questions regarding implementation.

cijothomas avatar Feb 20 '25 19:02 cijothomas

Hey, sorry I missed the issue update. I'm not working on this for now.

Yamakaky avatar Mar 03 '25 08:03 Yamakaky