
Use weaver for semantic convention codegen

Open lquerel opened this issue 1 year ago • 3 comments

There is an ongoing effort to migrate all code generation to Weaver (see this issue tracking this effort). In this context, I have already started the following PR to convert the existing code generation to Weaver. My goal is to begin with minimal changes (see the description in #2098).

In parallel with this first step, I would like to start a discussion on the future developments I want to propose to the Rust SIG:

  1. First Step: Collect feedback on #2098, fix any issues, and obtain approval.
  2. Second Step: Propose a layer on top of the standard Rust Client SDK that exposes a type-safe semconv API, complementing the current one. There is a proof of concept in the Weaver repo for Rust (see this test, which exercises the type-safe API layer generated by Weaver; the templates used to generate the type-safe Client SDK are available here).
  3. Third Step: Based on community feedback, invest more time in developing a highly optimized type-safe semconv API, fully integrated with the low-level Rust Client SDK. In theory, we could replace hashmaps with structs, remove layers of abstraction, and improve the user experience with better integration with IDEs, code auto-completion, and AI assistance. This would require significantly more work, of course.
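
To make the "structs instead of hashmaps" idea in the third step concrete, here is a minimal std-only sketch. Every name in it (`AttrValue`, `HttpServerRequestDurationAttrs`, `to_key_values`) is hypothetical and is not taken from the Weaver templates or the OpenTelemetry SDK. It only illustrates that required attributes can become plain fields and optional ones `Option` fields, so that omitting a required attribute is a compile error rather than a runtime lookup miss.

```rust
// Hypothetical, std-only sketch: NOT the API generated by Weaver.

/// Stand-in for an SDK attribute value type (assumption for illustration).
#[derive(Debug, Clone, PartialEq)]
pub enum AttrValue {
    Str(String),
    Int(i64),
}

/// Well-known HTTP methods plus an escape hatch for custom values.
#[derive(Debug, Clone, PartialEq)]
pub enum HttpRequestMethod {
    Get,
    Post,
    Custom(String),
}

impl HttpRequestMethod {
    fn as_str(&self) -> &str {
        match self {
            HttpRequestMethod::Get => "GET",
            HttpRequestMethod::Post => "POST",
            HttpRequestMethod::Custom(s) => s.as_str(),
        }
    }
}

/// Required attributes are plain fields; optional ones are `Option`s.
#[derive(Debug, Clone)]
pub struct HttpServerRequestDurationAttrs {
    pub http_request_method: HttpRequestMethod,
    pub url_scheme: String,
    pub http_response_status_code: Option<i64>,
}

impl HttpServerRequestDurationAttrs {
    /// Flattens the struct into key/value pairs, e.g. to hand over to a generic API.
    pub fn to_key_values(&self) -> Vec<(&'static str, AttrValue)> {
        let mut out = vec![
            (
                "http.request.method",
                AttrValue::Str(self.http_request_method.as_str().to_owned()),
            ),
            ("url.scheme", AttrValue::Str(self.url_scheme.clone())),
        ];
        // Optional attributes are emitted only when they are set.
        if let Some(code) = self.http_response_status_code {
            out.push(("http.response.status_code", AttrValue::Int(code)));
        }
        out
    }
}
```

A struct like this can still be flattened into key/value pairs when needed, which is what would let a typed layer sit on top of a generic one before any deeper optimization is attempted.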

Below is an example of what such a type-safe API generated from semconv and Weaver could look like. By the way, Weaver is also a Rust project, so feel free to add a few stars to support us.

use crate::attributes::http::HttpRequestMethod;
use crate::attributes::system::SystemCpuState;
use crate::metrics::http::HttpClientActiveRequests;
use crate::metrics::http::HttpClientActiveRequestsReqAttributes;
use crate::metrics::http::HttpServerRequestDuration;
use crate::metrics::http::HttpServerRequestDurationOptAttributes;
use crate::metrics::http::HttpServerRequestDurationReqAttributes;
use crate::metrics::system::SystemCpuTime;
use crate::metrics::system::SystemCpuTimeOptAttributes;
use crate::metrics::system::SystemCpuUtilization;
use crate::metrics::system::SystemCpuUtilizationOptAttributes;
use opentelemetry::global;

fn main() {
    let meter = global::meter("my_meter");

    // Create a u64 http.server.request.duration metric (as defined in the OpenTelemetry HTTP
    // semantic conventions).
    // The API is type-safe, so the compiler will catch type errors. The required attributes are
    // enforced by the compiler. All the attributes provided are checked for correctness by the
    // compiler in relation to the original semantic convention.
    let http_request_duration = HttpServerRequestDuration::<u64>::new(&meter);

    // Records a new data point and provides the required attributes and some optional ones.
    http_request_duration.record(
        100,
        &HttpServerRequestDurationReqAttributes {
            http_request_method: HttpRequestMethod::Connect,
            url_scheme: "http".to_owned(),
        },
        Some(&HttpServerRequestDurationOptAttributes {
            http_response_status_code: Some(200),
            ..Default::default()
        }),
    );

    // ==== A TYPE-SAFE UP-DOWN-COUNTER API ====
    // Create an f64 http.client.active_requests metric (as defined in the OpenTelemetry HTTP
    // semantic conventions)
    let http_client_active_requests = HttpClientActiveRequests::<f64>::new(&meter);
    // Adds a new data point and provides the required attributes. Optional attributes are not
    // provided in this example.
    http_client_active_requests.add(
        10.0,
        &HttpClientActiveRequestsReqAttributes {
            server_address: "10.0.0.1".to_owned(),
            server_port: 8080,
        },
        None,
    );

    // ==== A TYPE-SAFE COUNTER API ====
    // Create an f64 system.cpu.time metric (as defined in the OpenTelemetry System semantic
    // conventions)
    let system_cpu_time = SystemCpuTime::<f64>::new(&meter);
    // Adds a new data point and provides some optional attributes.
    // Note: the method signature takes no required attributes.
    system_cpu_time.add(
        10.0,
        Some(&SystemCpuTimeOptAttributes {
            system_cpu_logical_number: Some(0),
            system_cpu_state: Some(SystemCpuState::Idle),
        }),
    );
    // Adds a new data point with a custom CPU state.
    system_cpu_time.add(
        20.0,
        Some(&SystemCpuTimeOptAttributes {
            system_cpu_logical_number: Some(0),
            system_cpu_state: Some(SystemCpuState::_Custom("custom".to_owned())),
        }),
    );

    // ==== A TYPE-SAFE GAUGE API ====
    // Create an i64 system.cpu.utilization metric (as defined in the OpenTelemetry System semantic
    // conventions)
    let system_cpu_utilization = SystemCpuUtilization::<i64>::new(&meter);
    // Records a new data point with no optional attributes.
    system_cpu_utilization.record(-5, None);
    // Records a new data point with some optional attributes.
    system_cpu_utilization.record(
        10,
        Some(&SystemCpuUtilizationOptAttributes {
            system_cpu_logical_number: Some(0),
            system_cpu_state: Some(SystemCpuState::Idle),
        }),
    );
}

lquerel • Sep 11 '24 06:09

> invest more time in developing a highly optimized type-safe semconv API, fully integrated with the low-level Rust Client SDK. In theory, we could replace hashmaps with structs, remove layers of abstraction, and improve the user experience with better integration with IDEs, code auto-completion, and AI assistance. This would require significantly more work, of course.

I think this is well aligned with what Rust wants to achieve: high performance and type safety.

reyang • Sep 11 '24 16:09

> In theory, we could replace hashmaps with structs, remove layers of abstraction, and improve the user experience with better integration with IDEs, code auto-completion, and AI assistance.

I think so long as this is done generically, in a way that lets other libraries integrating with the OTel SDK still programmatically construct signal entities from the data they have, which may be in some kind of map, this all sounds good to me 👍

KodrAus • Sep 11 '24 21:09

@KodrAus

> I think so long as this is done generically in a way that other libraries integrating with the OTel SDK can still programmatically construct signal entities from the data it has, which may be in some kind of map, this all sounds good to me 👍

This is the idea: a developer will be able to use either the current generic OTel Client SDK, the type-safe SemConv Client SDK, or both simultaneously. I created a proof of concept (POC) for this approach in Weaver (crates/weaver_codegen_test). In the future, we could refine and optimize the type-safe SemConv Client SDK to bypass certain parts of the generic client SDK for improved performance. However, we don’t need to do this initially, and we can still gain the benefits of a type-safe API.
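
As a rough illustration of that coexistence, the following std-only sketch layers a typed wrapper over a generic instrument that still accepts map-based attributes. `GenericHistogram`, `TypedRequestDuration`, and `RequestAttrs` are hypothetical stand-ins invented for this example; they are not the types in crates/weaver_codegen_test or in the OpenTelemetry SDK.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;

/// Hypothetical stand-in for a generic instrument that accepts arbitrary
/// key/value attributes (for example, built programmatically from a map).
#[derive(Default)]
pub struct GenericHistogram {
    points: RefCell<Vec<(u64, BTreeMap<String, String>)>>,
}

impl GenericHistogram {
    /// Generic path: the caller supplies any attribute map it likes.
    pub fn record(&self, value: u64, attrs: BTreeMap<String, String>) {
        self.points.borrow_mut().push((value, attrs));
    }

    /// Returns a copy of everything recorded so far (for inspection only).
    pub fn points(&self) -> Vec<(u64, BTreeMap<String, String>)> {
        self.points.borrow().clone()
    }
}

/// Hypothetical typed attributes: one required field, one optional field.
pub struct RequestAttrs {
    pub url_scheme: String,
    pub status_code: Option<u16>,
}

/// Typed layer that borrows the generic instrument instead of replacing it.
pub struct TypedRequestDuration<'a> {
    inner: &'a GenericHistogram,
}

impl<'a> TypedRequestDuration<'a> {
    pub fn new(inner: &'a GenericHistogram) -> Self {
        Self { inner }
    }

    /// Typed path: field names and types are checked by the compiler, then
    /// the data is forwarded to the generic instrument.
    pub fn record(&self, value: u64, attrs: &RequestAttrs) {
        let mut map = BTreeMap::new();
        map.insert("url.scheme".to_owned(), attrs.url_scheme.clone());
        if let Some(code) = attrs.status_code {
            map.insert("http.response.status_code".to_owned(), code.to_string());
        }
        self.inner.record(value, map);
    }
}
```

Because the typed layer only forwards to the generic one, a library that holds its data in a map can keep calling the generic `record` directly, while application code uses the typed wrapper on the same instrument.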

lquerel • Sep 11 '24 23:09

@lquerel We are quite interested in moving this forward. What would you say is the biggest blocker right now to adopting and generating type-safe metrics in opentelemetry-semantic-conventions?

Additionally, would a similar approach work for application metrics? What we want to achieve is to allow application developers to define their metrics schema in YAML and have a simple way to generate typed metrics from that schema, ideally with minimal friction, for example by leveraging build.rs.

What would move us closer to this goal, and can we help?
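
To make the question concrete, here is a sketch of the code-generation step such a build script might perform. It is an assumption for discussion, not how Weaver works: YAML parsing is left out, `MetricDef` and `AttrDef` are hypothetical stand-ins for a parsed schema, and `generate_attrs_struct` simply formats Rust source as a string.

```rust
// Hypothetical sketch: `MetricDef`/`AttrDef` stand in for a parsed YAML schema.

/// One attribute of a metric, as it might appear in a schema file.
pub struct AttrDef {
    pub name: &'static str,      // e.g. "url.scheme"
    pub rust_type: &'static str, // e.g. "String"
    pub required: bool,
}

/// One metric definition with its attributes.
pub struct MetricDef {
    pub name: &'static str, // e.g. "http.server.request.duration"
    pub attributes: Vec<AttrDef>,
}

/// Converts a dotted or snake_case name to PascalCase,
/// e.g. "system.cpu.time" becomes "SystemCpuTime".
pub fn to_pascal_case(name: &str) -> String {
    name.split(|c: char| c == '.' || c == '_')
        .filter(|part| !part.is_empty())
        .map(|part| {
            let mut chars = part.chars();
            match chars.next() {
                Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
                None => String::new(),
            }
        })
        .collect()
}

/// Converts a dotted attribute name to a Rust field name,
/// e.g. "url.scheme" becomes "url_scheme".
pub fn to_field_name(name: &str) -> String {
    name.replace('.', "_")
}

/// Emits the source of an attributes struct: required attributes become
/// plain fields, optional ones are wrapped in `Option`.
pub fn generate_attrs_struct(metric: &MetricDef) -> String {
    let mut src = format!("pub struct {}Attributes {{\n", to_pascal_case(metric.name));
    for attr in &metric.attributes {
        let ty = if attr.required {
            attr.rust_type.to_owned()
        } else {
            format!("Option<{}>", attr.rust_type)
        };
        src.push_str(&format!("    pub {}: {},\n", to_field_name(attr.name), ty));
    }
    src.push_str("}\n");
    src
}
```

In a real build.rs, the generated string would typically be written to a file under the directory named by Cargo's `OUT_DIR` environment variable and pulled into the crate with `include!`.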

martintmk • Feb 11 '25 07:02