Eliminate Dynamic Dispatching in Log Pipeline for Performance Optimization

Open lalitb opened this issue 1 year ago • 3 comments

Currently, the log pipeline employs dynamic dispatching within its workflow:

                        (dynamic)                                      (dynamic)
Logger::emit(record)  ------------> LogProcessor::emit(data)   ------------->   LogExporter::export(batch)

While a custom LogProcessor can be implemented to avoid the second dynamic dispatch, the first one is unavoidable in the current design. This overhead should be eliminated.
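
To make the distinction concrete, here is a minimal sketch contrasting the two dispatch styles. All types here (`LogRecord`, `UppercaseProcessor`, `DynLogger`, `StaticLogger`) are hypothetical stand-ins for illustration, and the `LogProcessor` trait is a simplified version, not the actual opentelemetry-sdk definition.

```rust
// Hypothetical stand-ins; these are NOT the real opentelemetry-sdk types.
struct LogRecord {
    body: String,
}

trait LogProcessor {
    fn emit(&self, record: &mut LogRecord);
}

struct UppercaseProcessor;

impl LogProcessor for UppercaseProcessor {
    fn emit(&self, record: &mut LogRecord) {
        record.body = record.body.to_uppercase();
    }
}

// Dynamic dispatch: the concrete processor type is erased behind a trait
// object, so every emit() call is resolved through a vtable at runtime.
struct DynLogger {
    processors: Vec<Box<dyn LogProcessor>>,
}

impl DynLogger {
    fn emit(&self, record: &mut LogRecord) {
        for p in &self.processors {
            p.emit(record);
        }
    }
}

// Static dispatch: P is known at compile time, so the compiler can
// monomorphize this method and potentially inline the processor call.
struct StaticLogger<P: LogProcessor> {
    processor: P,
}

impl<P: LogProcessor> StaticLogger<P> {
    fn emit(&self, record: &mut LogRecord) {
        self.processor.emit(record);
    }
}
```

Both loggers produce the same result for the same processor; the difference is only in how the call to `emit` is resolved.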

This issue has been created to track the necessary improvements.

Reference:

  1. LogEmitter.rs: https://github.com/open-telemetry/opentelemetry-rust/blob/a9b8621cd1004e7f083002071b9f9a57748c6b99/opentelemetry-sdk/src/logs/log_emitter.rs#L140-L144

  2. SimpleLogProcessor: https://github.com/open-telemetry/opentelemetry-rust/blob/a9b8621cd1004e7f083002071b9f9a57748c6b99/opentelemetry-sdk/src/logs/log_processor.rs#L78-L81

  3. BatchLogProcessor: https://github.com/open-telemetry/opentelemetry-rust/blob/a9b8621cd1004e7f083002071b9f9a57748c6b99/opentelemetry-sdk/src/logs/log_processor.rs#L200-L202

lalitb avatar Jul 17 '24 18:07 lalitb

Assuming SimpleConcurrentProcessor becomes part of the spec, a simple solution could be:

  • Add static dispatch for SimpleLogProcessor and SimpleConcurrentProcessor because their types are known at compile time.
  • We require dynamic dispatch for BatchLogProcessor as its runtime type can vary based on configuration, making it difficult to use a static type.
  • Any other custom log processor would be dispatched at runtime.
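
enum LogProcessorEnum {
    // Statically dispatched: concrete types known at compile time.
    SimpleLogProcessor(SimpleLogProcessor),
    SimpleConcurrentProcessor(SimpleConcurrentProcessor),
    // Runtime type depends on configuration, so it stays behind a trait object.
    Batch(BatchLogProcessor<Box<dyn RuntimeChannel>>),
    // Any other custom processor is dispatched at runtime.
    DynLogProcessor(Box<dyn LogProcessor>),
}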

impl LogProcessor for LogProcessorEnum {
    fn emit(&self, data: &mut LogData) {
        match self {
            LogProcessorEnum::SimpleLogProcessor(p) => p.emit(data),
            LogProcessorEnum::SimpleConcurrentProcessor(p) => p.emit(data),
            LogProcessorEnum::Batch(p) => p.emit(data),
            LogProcessorEnum::DynLogProcessor(p) => p.emit(data),
        }
    }
    // The other trait methods, force_flush() and shutdown(), are implemented the same way.
}

#[derive(Debug)]
struct LoggerProviderInner {
    processors: Vec<LogProcessorEnum>,
    resource: Resource,
}

Because of the async-runtime dependency, static dispatch is difficult for the batch processor. However, it could be implemented through a macro that registers this processor with the async runtime at compile time. Something like:

macro_rules! register_batch_log_processor {
    ($runtime:ty) => {
        enum LogProcessorEnum {
            Simple(SimpleLogProcessor),
            Batch(BatchLogProcessor<$runtime>),
            // and others
        }
    };
}

// and called as:
register_batch_log_processor!(TokioRuntime);
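
As a rough illustration of how such a macro could also generate the dispatching `impl`, here is a self-contained sketch. Every type in it (`LogData`, `TokioRuntime`, `SimpleLogProcessor`, `BatchLogProcessor`) is a hypothetical stand-in rather than the real SDK type, and the processors only append a tag so the dispatch path is observable.

```rust
use std::marker::PhantomData;

// Hypothetical stand-ins; these are NOT the real opentelemetry-sdk types.
struct LogData {
    body: String,
}

trait LogProcessor {
    fn emit(&self, data: &mut LogData);
}

// Marker type standing in for an async runtime.
struct TokioRuntime;

struct SimpleLogProcessor;

// Generic over the runtime, as in the proposal above.
struct BatchLogProcessor<R> {
    _runtime: PhantomData<R>,
}

impl LogProcessor for SimpleLogProcessor {
    fn emit(&self, data: &mut LogData) {
        data.body.push_str(" [simple]");
    }
}

impl<R> LogProcessor for BatchLogProcessor<R> {
    fn emit(&self, data: &mut LogData) {
        data.body.push_str(" [batch]");
    }
}

// The macro fixes the runtime type at compile time and generates both the
// enum and a match-based impl, so no trait object is involved.
macro_rules! register_batch_log_processor {
    ($runtime:ty) => {
        enum LogProcessorEnum {
            Simple(SimpleLogProcessor),
            Batch(BatchLogProcessor<$runtime>),
        }

        impl LogProcessor for LogProcessorEnum {
            fn emit(&self, data: &mut LogData) {
                match self {
                    LogProcessorEnum::Simple(p) => p.emit(data),
                    LogProcessorEnum::Batch(p) => p.emit(data),
                }
            }
        }
    };
}

register_batch_log_processor!(TokioRuntime);
```

One limitation of this shape is that the macro can only be invoked once per module, since it defines a fixed enum name.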

lalitb avatar Aug 05 '24 19:08 lalitb

> Assuming https://github.com/open-telemetry/opentelemetry-specification/pull/4163 is part of the specs.

We don't need it to be part of the spec; it is sufficient that it is part of the opentelemetry-sdk crate. The spec does not prohibit additional processors.

cijothomas avatar Aug 05 '24 19:08 cijothomas

Looked at this more closely, and we don't have a public API risk with this. The method used by Logger to get the list of processors from the provider is not public, so we should be able to change this in a backward-compatible way in the future. Removing from the Log SDK Stable milestone due to this.
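
A minimal sketch of why this is safe, under the assumption that processors are registered through a generic public method and read back only through a crate-private accessor. The module, method names, and signatures below are hypothetical and simplified, not the actual SDK API.

```rust
// Hypothetical stand-in module; names loosely mirror the discussion,
// not the real opentelemetry-sdk API.
mod sdk {
    pub trait LogProcessor {
        fn emit(&self, body: &mut String);
    }

    pub struct LoggerProvider {
        // Private storage: this could later become a Vec of an enum type
        // without changing any public signature.
        processors: Vec<Box<dyn LogProcessor>>,
    }

    impl LoggerProvider {
        pub fn new() -> Self {
            LoggerProvider { processors: Vec::new() }
        }

        // Public and generic: callers never name the storage type.
        pub fn with_log_processor<T: LogProcessor + 'static>(mut self, p: T) -> Self {
            self.processors.push(Box::new(p));
            self
        }

        // Crate-private accessor: its return type can change freely because
        // no code outside the crate can call it.
        pub(crate) fn log_processors(&self) -> &[Box<dyn LogProcessor>] {
            &self.processors
        }

        pub fn emit(&self, body: &mut String) {
            for p in self.log_processors() {
                p.emit(body);
            }
        }
    }
}
```

Only `new`, `with_log_processor`, and `emit` are visible to downstream crates in this sketch, so swapping the internal `Vec<Box<dyn LogProcessor>>` for another representation would not be a breaking change.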

cijothomas avatar Mar 11 '25 01:03 cijothomas