apm-agent-nodejs
apm-agent-nodejs copied to clipboard
Transiting traceparent over custom transports
I'm implementing grpc-node support (specifically the grpc
npm package) with the new addPatch agent API.
I had guessed (incorrectly!) that if GRPC wasn't supported for apm-agent-nodejs
that it wasn't supported for any Elastic APM agent. I've found, however, that at least apm-agent-go
has GRPC support, so I want to make sure that my implementation is fully compatible for distributed tracing.
I have three questions:
- What is the officially supported way to obtain the correct APM context in
apm-agent-nodejs
to transit between services for distributed tracing?
I was looking at some of the existing code (link below) and am weary of using the _context
property of Span
found therein. Does the leading underscore mean this is not a public interface? If it is a private interface, is there an alternative way to obtain this information?
https://github.com/elastic/apm-agent-nodejs/blob/853c9ddde1ce2d671e3d478e103edeeac5f29539/lib/instrumentation/http-shared.js#L122-L130
- Why are S3 hosts blacklisted in the
http-shared
module?
https://github.com/elastic/apm-agent-nodejs/blob/853c9ddde1ce2d671e3d478e103edeeac5f29539/lib/instrumentation/http-shared.js#L181-L183
- When I transit this APM context from GRPC client to GRPC server, what is the correct way to do so?
Before I learned that GRPC was supported by some Elastic APM agents, I was just going to stuff the string representation of the context into some of the call's metadata, but looking more closely at apm-agent-go
(disclaimer: I do not know golang) it looks like the GRPC implementation depends on the header name from the HTTP implementation:
https://github.com/elastic/apm-agent-go/blob/master/module/apmgrpc/server.go#L97-L98
Which means that when this value changes in a future version of the agent pending W3C standard freeze, the GRPC metadata key will also be changed.
https://github.com/elastic/apm-agent-go/blob/master/module/apmhttp/traceheaders.go#L31-L36
So, how can I make sure that my implementation for node-grpc
is similarly responsive to this change?
Any other advice or pointers would be greatly appreciated!
There's a PR open right now to officially expose the header. https://github.com/elastic/apm-agent-nodejs/pull/969
As far as I know though, grpc just uses http2 internally, so it should be propagating that header already. I haven't tested the Node.js grpc module though, so it's possible there is context loss before it reaches that point.
@Qard
There's a PR open right now to officially expose the header. #969
Awesome :D
As far as I know though, grpc just uses http2 internally, so it should be propagating that header already. I haven't tested the Node.js grpc module though, so it's possible there is context loss before it reaches that point.
Are HTTP/2 request headers the standard way for transiting traceparent id for GRPC? That may not be possible with the grpc
npm package.
The grpc
npm package provides a binding to a native grpc implementation, and in a year of working with it I haven't seen a leakage that reveals the underlying HTTP/2 guts, and I would be surprised to learn that the native GRPC implementation uses an HTTP/2 implementation that exposes a javascript interface.
Right now I'm using metadata to transit the traceparent ID, but it's worrying to me that this won't be compatible with other agents.
Ah, yeah. I was only aware of the pure-js implementation, which uses the Node.js core http2
module. It looks like they have a native implementation too though. In that case, yes: you will need a way to inject the headers if you intend for this to be compatible with other instrumented services. I'm sad to say: it may be impossible if the library provides no method of manipulating the headers.
@watson Perhaps we should spend some time investigating this? We could maybe make a PR to the grpc module to expose the headers somehow.
@dvisztempacct
Right now I'm using metadata to transit the traceparent ID, but it's worrying to me that this won't be compatible with other agents.
The Go agent is also using metadata. I just did a proof of concept for a Node.js client interceptor, and confirmed that it works with the Go agent's instrumentation: https://gist.github.com/axw/bc5025035fbd2cea6a4516349be8a7ce. (I'm by no means a Node.js developer, sorry if the code is horrible.)
The span name "grpc" is just a placeholder, it should be gRPC method name.
@axw That's great news!
I'm afraid though that in your gist I don't understand why you're ending the span on line 14, effectively at the same time that you started it on line 9.
Am I misunderstanding something?
Thanks both :)
@dvisztempacct no, you're not misunderstanding; I just don't know how to JavaScript properly :wink: Just wanted to show that what you had intended to do will work with the Go agent's instrumentation, so long as you use the metadata key "elastic-apm-traceparent", and the same string representation used in HTTP.
The grpc npm package provides a binding to a native grpc implementation, and in a year of working with it I haven't seen a leakage that reveals the underlying HTTP/2 guts, and I would be surprised to learn that the native GRPC implementation uses an HTTP/2 implementation that exposes a javascript interface.
This is also the case with grpc-go. Metadata is encoded as HTTP headers (https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md), but that's abstracted.
@axw thanks :)
FYI: The API for getting the traceparent
is now public and has just landed in v2.9.0:
3 new getters have been added from which you can get the traceparent
string:
-
agent.currentTraceparent
-
span.traceparent
-
transaction.traceparent
@watson I don't see currentTraceparent
in https://www.elastic.co/guide/en/apm/agent/nodejs/current/agent-api.html is it just undocumented?
Do all three return the same traceparent string?
Thanks again :100:
The new docs are building as we speak... they take about 30 minutes to 1 hour to get online
You can see it in the docs from the master
branch: https://www.elastic.co/guide/en/apm/agent/nodejs/master/agent-api.html#apm-current-traceparent
@watson haha I guess you really meant it when you said it just landed ;)
haha yep 😇
... and yes, they all return the same type of string - the same we automatically add to outgoing http headers
I just noticed that one of the dependencies is probably also waiting on a release.. yarn said [email protected]
couldn't be found and prompted me to choose a different version :joy_cat:
That was released 10 hours ago, so it should be fine 🤔
If you look on npm it's there: https://www.npmjs.com/package/elastic-apm-http-client
I have no experience with yarn, but maybe it has an internal cache that somehow needs to be cleared?
Oh, I see. There's a couple of things that could have interfered, but I just retried and it worked :)
Whoops, I think this needs to remain open. We talked a bit about getting the traceparent
value but not about getting the traceparent
key (specifically, the string "Elastic-Apm-Traceparent", which is slated to later become the W3's choice, "Traceparent")
Related to https://github.com/elastic/apm-agent-nodejs/issues/1015
Hi did you manage to get it to work @dvisztempacct ?
I managed to get it to work with the interceptor here is my code with type script
import apm from 'elastic-apm-node/start'
import {TodoServiceClient} from "../../client/service_grpc_pb";
import {credentials} from "@grpc/grpc-js";
import {InterceptingCall} from "grpc";
const grpcAddr = process.env.BACKEND_ADDR || "backend:5000";
const traceInterceptor = function(options:any, nextCall:any) {
return new InterceptingCall(nextCall(options), {
start: function(metadata, listener, next) {
const span = apm.startSpan('grpc');
const traceParent = apm.currentTraceparent;
metadata.add('elastic-apm-traceparent', traceParent);
console.log("metadata:", metadata);
next(metadata, listener, next);
span.end();
}
});
}
export const client = new TodoServiceClient(
grpcAddr,
credentials.createInsecure(),
{
interceptors: [traceInterceptor]
}
);
But I wonder why the service span is outside of the parent tracing
But I wonder why the service span is outside of the parent tracing
@yulrizka Maybe the span is outside the parent tracing because you ended the span just after calling next, and grpc messaging is not a sync operation, so, the span is closed before the call is really started. Maybe if you close the span in the halfClose event it can work as intended.
I managed to implement the client side adding the following interceptor to the client:
const spanSymbol = Symbol('span');
export function traceInterceptor(options: any, nextCall: any) {
return new grpc.InterceptingCall(nextCall(options), {
start(metadata, listener, next) {
(this as any)[spanSymbol] = apm.startSpan('grpc');
const traceParent = apm.currentTraceparent;
if (traceParent) {
metadata.add('elastic-apm-traceparent', traceParent);
}
next(metadata, listener);
},
halfClose(next) {
(this as any)[spanSymbol]?.end();
next();
},
});
}
On the server side, I used the following code to start the transaction on each grpc method:
export function startApmTransaction(
call: ServerWriteableStream<unknown, unknown> | ServerUnaryCall<unknown>,
name: string,
) {
console.log(call.metadata.getMap());
const traceParent = (call.metadata.get('elastic-apm-traceparent')[0] ??
call.metadata.get('traceparent')[0]) as string | undefined;
const transaction = apm.startTransaction(name, {
childOf: traceParent,
});
return transaction;
}
At the end of the methods excution, of course, I needed to end the transaction. I'm using only unary and server side stream calls