opentelemetry-rust
opentelemetry-rust copied to clipboard
[Feature]: Implement error recording on opentelemetry-appender-tracing
Related Problems?
No response
Describe the solution you'd like:
Currently, opentelemetry-appender-tracing records errors using their Debug implementation. The cause chain has to be added manually, which is a pain to do at every log point.
I believe this would be done in https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-appender-tracing/src/layer.rs#L72.
Otel conventions for errors: https://opentelemetry.io/docs/specs/semconv/exceptions/exceptions-logs/
tracing-opentelemetry implementation for reference: https://github.com/tokio-rs/tracing-opentelemetry/blob/v0.1.x/src/layer.rs#L297
Considered Alternatives
No response
Additional Context
No response
@Yamakaky That's a fair ask. Would you like to contribute?
Sure. Some things I'm not sure about:
let error = ...;
error!(error);
error!(error, "some error happened");
error!(error, "some error happened: {}", error);
- Which option do we recommend?
- 1 would not have a log message, maybe have
error.to_string()as default message value? - 3 makes it more verbose.
- 1 would not have a log message, maybe have
- New
exception.*fields are added, cf semconv. Do we keep theerrorfield as duplicate? I'd say to remove it - What about if there are multiple errors fields, or if
exception.*are manually added? I'd say to raise an error.
@Yamakaky . Regarding:
let error = ...;
error!(error);
error!(error, "some error happened");
error!(error, "some error happened: {}", error);
I am confused as none of the above you shared are the valid error calls. They won't compile.
I can think of below scenarios:
error!("An error occurred: {}", e); -> Uses Display, not triggering record_error.
error!(error = ?e, "An error occurred"); -> Uses Debug, not triggering record_error.
error!(error = &e as &dyn Error, "An error occurred"); -> This would trigger record_error (3)
error!(error = &e as &dyn Error); -> This would trigger record_error (4)
And in 4th case above, we don't need any default message. Just the error attribute is enough.
Also, we don't need to support semconv for exception for now, as they are still experimental.
I was doing something like that:
let e = anyhow::anyhow!("pote").context("prout");
error!(error = e.as_ref() as &dyn std::error::Error);
It works, but its a bit verbose to have to cast at each error point (to be fair, not help by anyhow::Error not implementing Error). I believe error attributes are Stable ? https://opentelemetry.io/docs/specs/semconv/exceptions/exceptions-logs/
I believe error attributes are Stable ?
Yes, it is but the otel semantic conventions crate will take a while to get stable, and till then we would like to avoid using it in the crates which will get stable sooner. Can we stick to implementing record_error for handling attribute values?
@Yamakaky wondering if you have any further questions on this?
Sorry, priorities changed at work, I haven't had the chance to look at it for now.
OK, I believe we need to close this before stable release.
I was doing something like that:
let e = anyhow::anyhow!("pote").context("prout"); error!(error = e.as_ref() as &dyn std::error::Error);It works, but its a bit verbose to have to cast at each error point (to be fair, not help by anyhow::Error not implementing Error).
Do you have any recommendation to avoid this obscure cast?
Sure. Some things I'm not sure about:
let error = ...; error!(error); error!(error, "some error happened"); error!(error, "some error happened: {}", error);
Which option do we recommend?
- 1 would not have a log message, maybe have
error.to_string()as default message value?- 3 makes it more verbose.
New
exception.*fields are added, cf semconv. Do we keep theerrorfield as duplicate? I'd say to remove itWhat about if there are multiple errors fields, or if
exception.*are manually added? I'd say to raise an error.
thinking out loud - how about ?
field_name.exception.message => value.to_string()
field_name.exception.stacktrace => value_chain_str
this will allow us to handle multiple errors, as well as exception-* if any provided by user. But this then becomes too custom.
Hum, there is two part to this, one of it in tracing itself.
Error recording
This is tracing specific, cf https://github.com/tokio-rs/tracing/issues/3067. I'm not sure what we can't do apart from doing an MR in tracing. Short term, I think I'll use something like error_as_dyn in the first message.
Note that #[instrument(err)] seems to record the error using Display and not visit_error.
Otel fields
Do we really need to support multiple errors? Maybe in a first version only one error can be recorded, and maybe a toggle in OpenTelemetryTracingBridge to enable multiple error recording.
As for the field names:
- if only one error is supported, we can just use
exception.* - if not, maybe prefix
field_name.if field name is noterror/err/e.
Field values:
- message -> to_string() seems fine. Maybe also write it as the event message if not provided? ex in
error!(error = error_as_dyn(&error)).- note: I don't think this is possible with the current otel api, there is no
get_body/has_body.
- note: I don't think this is possible with the current otel api, there is no
- for the stacktrace, maybe something like that for a start. Note that this doesn't support RUST_BACKTRACE with anyhow for example.
fn record_error(&mut self, field: &tracing::field::Field, mut value: &(dyn std::error::Error + 'static)) {
self.log_record
.add_attribute(Key::new("exception.message"), AnyValue::from(format!("{value}")));
let mut trace = format!("{value}\n");
while let Some(e) = value.source() {
writeln!(trace, "Caused by {e}").expect("writeln on string can't fail");
value = e;
}
self.log_record
.add_attribute(Key::new("exception.stacktrace"), AnyValue::from(trace));
}
@Yamakaky will you be continuing this ? Let us know if you have questions regarding implementation.
Hey, sorry I missed the issue update. I'm not working on this for now.