datafusion-ballista
Support trace id for each query
Is your feature request related to a problem or challenge? Please describe what you are trying to do. In a distributed query engine, many SQL queries are submitted to the scheduler, and the scheduler and executor servers log messages for each query.
Currently, we cannot find the logs that correspond to a specific query; we have to search the entire log to find the information relevant to it.
If each query had a trace ID or query ID, we could include it in the log event format, for example:
[time trace_id code_path] : log info
With this format, we could find all log entries for a specified query.
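As a rough illustration of the proposed format, here is a minimal std-only sketch. The helper names `format_log_line` and `lines_for_trace`, the `q-1`-style IDs, and the use of Unix seconds for the time field are all assumptions made for this example; none of them exist in Ballista.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: builds a line in the proposed
/// `[time trace_id code_path] : log info` format.
fn format_log_line(unix_secs: u64, trace_id: &str, code_path: &str, message: &str) -> String {
    format!("[{unix_secs} {trace_id} {code_path}] : {message}")
}

/// Hypothetical helper: keeps only the lines that carry the given trace ID.
/// Matching on " <id> " relies on the ID being space-delimited in the header.
fn lines_for_trace<'a>(log: &'a [String], trace_id: &str) -> Vec<&'a str> {
    let needle = format!(" {trace_id} ");
    log.iter()
        .map(String::as_str)
        .filter(|line| line.contains(&needle))
        .collect()
}

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before the Unix epoch")
        .as_secs();

    // Interleaved log entries from two queries, as a scheduler might emit them.
    let log = vec![
        format_log_line(now, "q-1", "scheduler::submit", "job queued"),
        format_log_line(now, "q-2", "scheduler::submit", "job queued"),
        format_log_line(now, "q-1", "executor::run", "stage 0 finished"),
    ];

    // Only the two entries for query q-1 are printed.
    for line in lines_for_trace(&log, "q-1") {
        println!("{line}");
    }
}
```

With the ID in a fixed position, the same filtering can be done with any text search tool over the scheduler and executor logs.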
@andygrove @alamb
Do you have other ideas?
@liukun4515 I think adding trace_id and query_id to the log entries would be very valuable.
I think such information exists in the SessionContext, though to use it, all calls to the info!, warn!, etc. log macros would need to have a ctx parameter threaded through. I think the tracing library has some way to avoid this threading by using a thread-local variable, but I am not sure if it works with async 🤔
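To make the thread-local idea concrete, here is a std-only sketch. It is not how the tracing library is implemented; `CURRENT_QUERY`, `QueryScope`, and `log_line` are hypothetical names invented for this example. A guard sets the query ID for the current thread, and the logging helper reads it without any ctx parameter.

```rust
use std::cell::RefCell;

thread_local! {
    // Query ID for whatever work is currently running on this thread.
    static CURRENT_QUERY: RefCell<Option<String>> = RefCell::new(None);
}

/// Hypothetical guard: sets the query ID on entry and restores the
/// previous value when dropped.
struct QueryScope {
    previous: Option<String>,
}

impl QueryScope {
    fn enter(query_id: &str) -> Self {
        // `Option::replace` stores the new ID and returns the old one.
        let previous =
            CURRENT_QUERY.with(|q| q.borrow_mut().replace(query_id.to_string()));
        QueryScope { previous }
    }
}

impl Drop for QueryScope {
    fn drop(&mut self) {
        let previous = self.previous.take();
        CURRENT_QUERY.with(|q| *q.borrow_mut() = previous);
    }
}

/// Hypothetical logging helper: reads the ID from the thread-local,
/// so callers do not pass a context argument.
fn log_line(message: &str) -> String {
    let id = CURRENT_QUERY
        .with(|q| q.borrow().clone())
        .unwrap_or_else(|| "-".to_string());
    format!("[{id}] : {message}")
}

fn main() {
    println!("{}", log_line("idle"));
    {
        let _scope = QueryScope::enter("q-42");
        println!("{}", log_line("planning"));
    }
    println!("{}", log_line("idle again"));
}
```

The async concern is real for a plain thread-local like this one: on a multi-threaded runtime a task can resume on a different worker thread after an `.await`, and the ID would not follow it. For that case the tracing crate provides the `Instrument` trait, which attaches a span to a future so the span is entered each time the future is polled.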
In IOx we went with a somewhat more sophisticated "distributed tracing" infrastructure of https://www.jaegertracing.io/ and have code that converts the query metrics from datafusion into distributed traces: https://github.com/influxdata/influxdb_iox/blob/main/iox_query/src/exec/query_tracing.rs#L111
@liukun4515 I think adding trace_id and query_id to the log entries would be very valuable.
I think such information exists in the SessionContext, though to use it, all calls to the info!, warn!, etc. log macros would need to have a ctx parameter threaded through. I think the tracing library has some way to avoid this threading by using a thread-local variable, but I am not sure if it works with async 🤔
Thanks for your reply, @alamb.
Does the tracing lib refer to tokio-tracing?
In IOx we went with a somewhat more sophisticated "distributed tracing" infrastructure of https://www.jaegertracing.io/ and have code that converts the query metrics from datafusion into distributed traces: https://github.com/influxdata/influxdb_iox/blob/main/iox_query/src/exec/query_tracing.rs#L111
From the IOx usage, I can see how to get metrics or traces from the datafusion lib. It is a good way to get traces.
Does the tracing lib refer to tokio-tracing?
Yes
From the IOx usage, I can see how to get metrics or traces from the datafusion lib. It is a good way to get traces.
👍