envoy
envoy copied to clipboard
Datadog Headers not propagated in GRPC requests from WASM filter
Title: Datadog Tracing Headers not propagated in GRPC requests in WASM filter
Description:
Envoy appears to omit Datadog trace context propagation in GRPC metadata of GRPC requests made to an external server from the WASM filter. This means that the trace context is not propagated to upstream servers called by the filter and any resulting traffic is unable to be stitched to the parent span. However, this seems to be supported in both ext_authz
and lua
filters.
Additionally, these headers are not injected in the http request received by the filter, as I think http tracing headers are not injected until it reaches the http router filter.
Repro steps:
- Enable datadog tracing and a WASM http filter that makes a call to an external server via GRPC.
- Inspect metadata on grpc requests received by external server. Only the following metadata is received in incoming context:
authority
,content-type
,x-envoy-expected-rq-timeout-ms
,x-envoy-internal
,x-forwarded-for
Please let me know if I can provide any further information to assist in resolution of this issue. Thanks!
i think it maybe the problem of implemention of grpcCall
in WASM filter, it always uses the instance of NullSpan (https://github.com/envoyproxy/envoy/blob/c103845992abc4d2bc2710bd23ea305e833c48e5/source/extensions/common/wasm/context.cc#L1019), which will not use Datadog. If this is a certain problem to fix, i can try this.
cc @PiotrSikora
@mburtless thanks for the report! It sounds like there 2 separate issues:
- Missing tracing info in callouts from Wasm (
httpCall
,grpcCall
andgrpcStream
) - this seems like a real issue, and we should definitely match behavior ofext_authz
andlua
. cc @chaoqin-li1123 forext_proc
, which doesn't seem to set it either. - Tracing info not visible in Wasm plugins - this is a bit more tricky, since we need to distinguish between Envoy-initiated tracing spans and HTTP headers received from clients. I think we might need a new Proxy-Wasm ABI calls to fix that, but let me think about it. Out of curiousity, why do you need this information inside Wasm plugins?
@StarryVae thanks for investigating! I think the fix might be setParentSpan()
on Http::AsyncClient::RequestOptions
, but perhaps grpcCall
also needs replacing Tracing::NullSpan::instance()
with activeSpan()
. If you want to send a PR with tests, that would be great!
@PiotrSikora Yeah, i will try to submit a PR to fix this, thanks.
Thanks to you both for addressing this so quickly!
@PiotrSikora - I think adding trace info to grpc callouts from wasm is really all that's required for our use case.
The only reason I was checking for tracing info within the Wasm plugin itself was to investigate a possible workaround where I manually apply the trace info to the grpc metadata, before making the callout. Totally unnecessary if this info is added to the callouts.
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
I don't think this should be closed as I believe work is still occurring on this issue in #22075
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.
@PiotrSikora - I think this issue is still relevant. Can this be marked appropriately to avoid it going stale?
@lizan could you reopen this and add appropriate tags?