[TT-13406] Attaching a span to the trace started in the Tyk API Gateway from a grpc plugin when Open Telemetry is enabled.

Open ls-adrian-chong opened this issue 1 year ago • 10 comments

Is your feature request related to a problem? Please describe.
When OpenTelemetry is enabled, I would like a gRPC plugin to be able to add a span to the trace created by the Tyk API Gateway, and to add trace context details from the gRPC plugin.

Currently, no trace context is provided to the gRPC plugin, so a span created from within the plugin is not a child of the parent trace started in the Tyk API Gateway; instead, a whole new trace is created.

Describe the solution you'd like
A way to attach a span to the parent trace created by the Tyk API Gateway from the gRPC plugin. I suppose this could be the trace ID sent as a field in the context.

I think setting details on the trace context would also be possible if the trace context were provided.

Describe alternatives you've considered
  • Enabling detailed tracing in the API definition: nothing is reported from the gRPC plugin.
  • Using ddtrace from within the gRPC plugin: this results in a new trace that is not connected to the one started in the Tyk API Gateway.

Reference: https://community.tyk.io/t/opentelemetry-trace-with-grpc-plugin/7695/2

ls-adrian-chong avatar Sep 09 '24 14:09 ls-adrian-chong

I think something like this is possible only with a Go plugin, which has access to the full native request context 🤔

buger avatar Sep 09 '24 15:09 buger

I think something like this is possible only with a Go plugin, which has access to the full native request context 🤔

I am thinking that if the trace ID is sent to the gRPC plugin, then the plugin could create a span whose context attaches it to the parent trace when it emits the span to Datadog.

ls-adrian-chong avatar Sep 09 '24 15:09 ls-adrian-chong

As far as I understand, the tracing context is not correctly passed within the Golang context as grpc metadata when using the grpc client?

boser87 avatar Dec 17 '24 14:12 boser87

Are there any chances to prioritize this feature request any time soon?

kercsoj avatar Jul 30 '25 10:07 kercsoj

/probe please review this for complexity, or is some workaround maybe already possible?

buger avatar Jul 30 '25 11:07 buger

Analysis of OpenTelemetry Trace Context Propagation to gRPC Plugins

Issue Summary

The issue is about enabling OpenTelemetry trace context propagation between Tyk API Gateway and gRPC plugins. Currently, when OpenTelemetry is enabled in Tyk, traces start in the gateway, but gRPC plugins can't attach spans to these traces. Instead, any tracing from gRPC plugins creates new, disconnected traces.

Current Implementation

The current implementation has these key components:

  1. OpenTelemetry in Tyk Gateway: The gateway creates traces using OpenTelemetry in gateway/mw_opentelemetry_tracing.go and gateway/tracing/opentelemetry.go.

  2. gRPC Plugin Communication: The gateway communicates with gRPC plugins through the GRPCDispatcher in gateway/coprocess_grpc.go, which uses context.Background() for all gRPC calls:

func (d *GRPCDispatcher) Dispatch(object *coprocess.Object) (*coprocess.Object, error) {
    return grpcClient.Dispatch(context.Background(), object)
}
  3. Missing Trace Context: The trace context from the HTTP request is not propagated to the gRPC call, causing the disconnection between traces.

Complexity Assessment

The implementation complexity is moderate:

  1. Code Changes Required:

    • Modify the GRPCDispatcher to accept and use the request context instead of context.Background()
    • Add trace context propagation to gRPC metadata
    • Extend the coprocess.Object to include trace information or use gRPC metadata
  2. Backward Compatibility: Changes need to maintain compatibility with existing plugins

  3. Cross-Language Support: The solution needs to work with different plugin languages (Python, Ruby, etc.)

Proposed Solution

The most straightforward approach would be:

  1. Modify the Dispatcher Interface: Update the coprocess.Dispatcher interface to include the request context:
type Dispatcher interface {
    Dispatch(ctx context.Context, object *Object) (*Object, error)
    // Other methods...
}
  2. Propagate Trace Context via gRPC Metadata: Use the OpenTelemetry propagation API to inject trace context into gRPC metadata:
func (d *GRPCDispatcher) Dispatch(ctx context.Context, object *coprocess.Object) (*coprocess.Object, error) {
    // Inject the trace context into a map carrier
    // (propagation is go.opentelemetry.io/otel/propagation)
    carrier := propagation.MapCarrier{}
    otel.GetTextMapPropagator().Inject(ctx, carrier)
    // Attach the injected entries (e.g. "traceparent") as outgoing gRPC metadata
    ctx = metadata.NewOutgoingContext(ctx, metadata.New(carrier))
    return grpcClient.Dispatch(ctx, object)
}
  3. Extract Trace Context in Plugins: Provide helper functions in plugin SDKs to extract the trace context and create child spans.
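
As a rough illustration of what such a plugin-side helper could look like, here is a minimal sketch using only the Python standard library. It assumes the gateway would send a W3C Trace Context `traceparent` entry in the gRPC metadata, which is what a W3C propagator injects but is not something the gateway does today. The names `ParentContext` and `parse_traceparent` are hypothetical and not part of any Tyk SDK.

```python
import re
from typing import Mapping, NamedTuple, Optional

# W3C Trace Context, version 00: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class ParentContext(NamedTuple):
    """Hypothetical container for the parent span identity."""
    trace_id: str   # 32 hex characters
    span_id: str    # 16 hex characters
    sampled: bool   # lowest bit of the flags byte


def parse_traceparent(metadata: Mapping[str, str]) -> Optional[ParentContext]:
    """Return the parent context found in `metadata`, or None if absent or invalid."""
    value = metadata.get("traceparent", "")
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero trace or span IDs are invalid in the W3C format.
    if int(trace_id, 16) == 0 or int(span_id, 16) == 0:
        return None
    return ParentContext(trace_id, span_id, bool(int(flags, 16) & 0x01))
```

In a real plugin the OpenTelemetry SDK's own propagator would normally do this parsing; the sketch only shows what information the plugin needs to receive in order to create a child span.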

Workaround for Current Versions

Until the feature is implemented, users could:

  1. Pass Trace IDs in Object Metadata: Modify the gateway code to include trace IDs in the coprocess.Object.Metadata map:
// In the middleware before dispatching to gRPC
span := trace.SpanFromContext(r.Context()) // trace is go.opentelemetry.io/otel/trace
if span.SpanContext().HasTraceID() {
    if object.Metadata == nil {
        object.Metadata = make(map[string]string)
    }
    object.Metadata["x-trace-id"] = span.SpanContext().TraceID().String()
    object.Metadata["x-span-id"] = span.SpanContext().SpanID().String()
}
  2. Use the Trace IDs in Plugins: In the plugin, extract these IDs and use them to create spans with the correct parent:
# Python example
def process_request(coprocess_object):
    trace_id = coprocess_object.metadata.get("x-trace-id")
    span_id = coprocess_object.metadata.get("x-span-id")
    
    # Use these IDs to create a span with the correct parent
    # This depends on the OpenTelemetry SDK being used
    
    return coprocess_object

This workaround requires custom code in both the gateway and plugins but can bridge the gap until proper propagation is implemented.
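
On the plugin side, the IDs arrive as hex strings, while the OpenTelemetry Python API represents trace and span IDs as integers. The following is a minimal standard-library sketch of validating and converting them. It assumes the `x-trace-id` and `x-span-id` keys from the workaround above (which a stock gateway does not set), and the function names are hypothetical.

```python
import string
from typing import Mapping, Optional, Tuple

_HEX_DIGITS = set(string.hexdigits)


def _parse_hex_id(value: Optional[str], length: int) -> Optional[int]:
    """Return `value` as an int if it is exactly `length` hex characters and non-zero."""
    if not value or len(value) != length or not set(value) <= _HEX_DIGITS:
        return None
    number = int(value, 16)
    return number or None  # an all-zero ID is invalid


def parent_ids_from_metadata(metadata: Mapping[str, str]) -> Optional[Tuple[int, int]]:
    """Return (trace_id, span_id) as integers, or None if either is missing or malformed."""
    trace_id = _parse_hex_id(metadata.get("x-trace-id"), 32)  # 128-bit trace ID
    span_id = _parse_hex_id(metadata.get("x-span-id"), 16)    # 64-bit span ID
    if trace_id is None or span_id is None:
        return None
    return trace_id, span_id
```

Returning `None` on malformed input lets the plugin fall back to starting its own trace rather than failing the request.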

Conclusion

The issue is solvable with moderate effort by enhancing the gRPC dispatcher to propagate trace context via gRPC metadata. For now, a workaround using object metadata can help users connect their traces between the gateway and plugins.

The proper implementation would leverage the OpenTelemetry propagation APIs to ensure standard-compliant trace context propagation across service boundaries.


Tip: Mention me again using /probe <request>. Powered by Probe AI

github-actions[bot] avatar Jul 30 '25 12:07 github-actions[bot]

/probe so basically, in the workaround solution it is the responsibility of the plugin to communicate with OpenTelemetry, and the first solution is more native, kind of? How will this first solution look from the point of view of a gRPC plugin? Can you give an example, let's say with Python? Will we need to modify the protobuf too?

buger avatar Jul 30 '25 12:07 buger

Implementing OpenTelemetry Trace Context Propagation for gRPC Plugins in Tyk

Based on my analysis of the Tyk codebase, I can provide a detailed explanation of how to implement OpenTelemetry trace context propagation for gRPC plugins, addressing the issue described in TT-13406.

Current Implementation

Currently, when OpenTelemetry is enabled in Tyk API Gateway, trace context is not propagated to gRPC plugins. The issue occurs in the GRPCDispatcher.Dispatch method in gateway/coprocess_grpc.go:

func (d *GRPCDispatcher) Dispatch(object *coprocess.Object) (*coprocess.Object, error) {
    return grpcClient.Dispatch(context.Background(), object)
}

This method creates a new context.Background() for each gRPC call, discarding any trace context from the original request. As a result, any spans created within the gRPC plugin cannot be attached to the parent trace started in the Tyk API Gateway.

Solution: Native Trace Context Propagation

The solution involves two main changes:

  1. Modify the Dispatcher interface to accept a context parameter:

    • Update the Dispatcher interface in coprocess/dispatcher.go
    • Update all implementations of this interface
  2. Propagate trace context through gRPC metadata:

    • Extract trace context from the request
    • Inject it into gRPC metadata
    • Extract it in the gRPC plugin

Implementation Details

1. Update the Dispatcher Interface

// In coprocess/dispatcher.go
type Dispatcher interface {
    // Update to include context
    Dispatch(ctx context.Context, object *Object) (*Object, error)
    // Other methods remain the same
    DispatchEvent([]byte)
    // ...
}

2. Modify the GRPCDispatcher Implementation

// In gateway/coprocess_grpc.go
func (d *GRPCDispatcher) Dispatch(ctx context.Context, object *coprocess.Object) (*coprocess.Object, error) {
    // Inject the trace context into a map carrier
    // (propagation is go.opentelemetry.io/otel/propagation)
    carrier := propagation.MapCarrier{}
    otel.GetTextMapPropagator().Inject(ctx, carrier)

    // Attach the injected entries (e.g. "traceparent") as outgoing gRPC metadata
    // and use that context for the gRPC call
    return grpcClient.Dispatch(metadata.NewOutgoingContext(ctx, metadata.New(carrier)), object)
}

3. Update the CoProcessor to Pass the Request Context

// In gateway/coprocess.go
func (c *CoProcessor) Dispatch(object *coprocess.Object) (*coprocess.Object, error) {
    // Get the request context from middleware
    ctx := c.Middleware.Req.Context()
    
    dispatcher := loadedDrivers[c.Middleware.MiddlewareDriver]
    if dispatcher == nil {
        return nil, errors.New("no plugin driver available")
    }

    // Pass the context to the dispatcher
    return dispatcher.Dispatch(ctx, object)
}

4. Add Trace Context to Object Metadata

For backward compatibility and to make trace context available to plugins that don't directly use the gRPC metadata:

// In gateway/coprocess.go - in BuildObject method
func (c *CoProcessor) BuildObject(req *http.Request, res *http.Response, spec *APISpec) (*coprocess.Object, error) {
    // Existing code...
    
    // Add trace context to metadata
    if span := trace.SpanFromContext(req.Context()); span.SpanContext().IsValid() {
        if object.Metadata == nil {
            object.Metadata = make(map[string]string)
        }
        object.Metadata["x-trace-id"] = span.SpanContext().TraceID().String()
        object.Metadata["x-span-id"] = span.SpanContext().SpanID().String()
        
        // Add trace flags if needed
        if span.SpanContext().IsSampled() {
            object.Metadata["x-trace-sampled"] = "1"
        }
    }
    
    return object, nil
}

Python Plugin Example

Here's how a Python gRPC plugin would use this trace context:

import grpc
from opentelemetry import trace
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator

# Generated from Tyk's coprocess protobuf definitions
import coprocess_object_pb2_grpc

class DispatcherServicer(coprocess_object_pb2_grpc.DispatcherServicer):
    def Dispatch(self, request, context):
        # Collect the incoming gRPC metadata (which would carry "traceparent") into a dict
        metadata = dict(context.invocation_metadata())

        # Extract the parent context with the W3C Trace Context propagator
        ctx = TraceContextTextMapPropagator().extract(carrier=metadata)

        # Get the current tracer
        tracer = trace.get_tracer(__name__)

        # Start a new span as child of the extracted context
        with tracer.start_as_current_span("grpc_plugin_operation", context=ctx) as span:
            # Your plugin logic here
            span.set_attribute("custom.attribute", "value")

            # Process the request
            # ...

            # Return the (possibly modified) coprocess object
            return request

Alternatively, if the plugin doesn't have direct access to the gRPC metadata, it can use the metadata in the Object:

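from opentelemetry import trace
from opentelemetry.trace import NonRecordingSpan, SpanContext, TraceFlags

tracer = trace.get_tracer(__name__)

def process_request(obj):
    # Extract trace context from object metadata
    trace_id = obj.metadata.get("x-trace-id")
    span_id = obj.metadata.get("x-span-id")
    sampled = obj.metadata.get("x-trace-sampled") == "1"

    if trace_id and span_id:
        # Rebuild the remote parent span context from the hex-encoded IDs
        parent = SpanContext(
            trace_id=int(trace_id, 16),
            span_id=int(span_id, 16),
            is_remote=True,
            trace_flags=TraceFlags(TraceFlags.SAMPLED if sampled else TraceFlags.DEFAULT),
        )
        ctx = trace.set_span_in_context(NonRecordingSpan(parent))

        # Start a new span as child of the extracted context
        with tracer.start_as_current_span("plugin_operation", context=ctx) as span:
            # Your plugin logic here
            span.set_attribute("custom.attribute", "value")

    return obj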

Conclusion

This implementation provides a complete solution for propagating OpenTelemetry trace context from the Tyk API Gateway to gRPC plugins. It allows plugins to create spans that are properly attached to the parent trace, enabling end-to-end tracing across the entire request flow.

The solution is backward compatible with existing plugins and provides two methods for accessing trace context:

  1. Through gRPC metadata (for plugins that can access it)
  2. Through the Object's metadata field (for all plugins)

This approach follows OpenTelemetry best practices for context propagation across service boundaries and should work with any OpenTelemetry-compatible tracing backend.
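
To illustrate how a plugin might support both access methods, here is a minimal standard-library sketch that prefers a `traceparent` entry from the gRPC metadata and otherwise rebuilds one from the Object's metadata. The `x-trace-id`, `x-span-id`, and `x-trace-sampled` keys are the ones proposed in this comment, not an existing gateway feature, and `trace_carrier` is a hypothetical helper name.

```python
from typing import Dict, Mapping


def trace_carrier(
    grpc_metadata: Mapping[str, str],
    object_metadata: Mapping[str, str],
) -> Dict[str, str]:
    """Return a carrier dict holding a `traceparent` entry, or an empty dict."""
    # Method 1: the gateway propagated the context through gRPC metadata.
    traceparent = grpc_metadata.get("traceparent")
    if traceparent:
        return {"traceparent": traceparent}

    # Method 2: rebuild a W3C traceparent value from the Object's metadata.
    trace_id = object_metadata.get("x-trace-id")
    span_id = object_metadata.get("x-span-id")
    if trace_id and span_id:
        flags = "01" if object_metadata.get("x-trace-sampled") == "1" else "00"
        return {"traceparent": f"00-{trace_id}-{span_id}-{flags}"}

    # No trace context available: the plugin would start a new trace.
    return {}
```

The returned dict has the shape a W3C Trace Context propagator expects as its carrier, so the rest of the plugin would not need to know which of the two methods supplied the context.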


github-actions[bot] avatar Jul 30 '25 12:07 github-actions[bot]

Based on Probe AI’s analysis, is there any chance this feature request can be prioritized in the Tyk Gateway backlog? Or, if an external contributor implements it in a fork, is there a realistic chance of getting it merged into the main Tyk Gateway repository?

kercsoj avatar Sep 05 '25 12:09 kercsoj

It would be super nice to prioritize this one. Also, I don't think we currently have any workaround to achieve the same behaviour using only gRPC middlewares or even native plugins, right?

boser87 avatar Sep 08 '25 09:09 boser87