opentelemetry-js icon indicating copy to clipboard operation
opentelemetry-js copied to clipboard

Add responseBodyHook for http request and response

Open haddasbronfman opened this issue 2 years ago • 7 comments

Is your feature request related to a problem? Please describe.

I work on otel distribution, and as part of our HTTP instrumentation, we want to have the HTTP request body and the HTTP response body. These apparently are not easy to get. This is because otel is closing the span before the 'end' callback on my request/response has been triggered. This is happening here and here

Describe the solution you'd like

I want to add a responseBodyHook of this type:

export interface HttpResponseBodyFunction {
    (span: Span, response: IncomingMessage | ServerResponse, body: String): void;
}

(I think the body should be of type to be String | Buffer, but for POC I tried it with 'String')

Describe alternatives you've considered

In order to get the body, I tried to override the 'end' function in the same way otel overrides it. something like this:

const originalEnd = clientRequest.end;
clientRequest.end = function (..._args: ResponseEndArgs) {
        clientRequest.end = originalEnd;
        span.setAttribute('http.request.body', _args[0] as AttributeValue);
        return clientRequest.end.apply(this, arguments as never);
};

This works but not for all cases, and I think the solution should come from otel, because it shouldn't close the span before the user got the body. The user should have the opportunity to have the body and add attributes according to it to his span.

Additional context

I would like to solve this issue, if this is acceptable can you assign it to me?

haddasbronfman avatar Jul 21 '22 15:07 haddasbronfman

Are requestHook and responseHook sufficient for the problem? You can add body listeners on the given request and response.

legendecas avatar Jul 25 '22 02:07 legendecas

I think they are not sufficient. When I tried to use them I added this code to my responseHook:

      const listener = (chunk: any) => {
        bodyHandler.addChunk(chunk); //bodyHandler aggregates the chunks
      };
      response.on('data', listener);

      response.on('end', () => {
        span.setAttribute(SemanticAttributes.HTTP_RESPONSE_BODY, bodyHandler.getChunks());
        response.removeListener('data', listener);
      });

but when the end callback was called, the span was already closed and setAttribute failed. This happens because the end function (overridden in line 448) is closing the span, and only after that, my end callback is called.

haddasbronfman avatar Jul 25 '22 06:07 haddasbronfman

I believe the problem is that response.end is asynchronous -- however, in the instrumentation, it is been treated as a synchronous method.

legendecas avatar Jul 25 '22 08:07 legendecas

What do you mean by "treated as a synchronous method"? I see that my end callback is called after the span is closed. what can I do differently?

haddasbronfman avatar Jul 25 '22 09:07 haddasbronfman

response.end accepts a callback as the last parameter. Only when the callback is invoked, we can safely assert that the data passed to response.end is sent to the other end. So, I think the span should be marked as ended when the callback of response.end is invoked, which should solve the problem in the OP. I'd believe this is a bug of the http-instrumentation.

legendecas avatar Jul 25 '22 14:07 legendecas

But this callback argument is optional

haddasbronfman avatar Jul 27 '22 06:07 haddasbronfman

It is asynchronous regardless of the presence of the optional callback.

legendecas avatar Jul 29 '22 06:07 legendecas

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.

github-actions[bot] avatar Oct 03 '22 06:10 github-actions[bot]

@legendecas is there some reason you think this is a bug? If so can you add a priority label for it?

dyladan avatar Oct 12 '22 16:10 dyladan

Yeah, the http-instrumentation didn't end the span in the end callback of the response, which may end up finishing the span earlier in conditions mentioned in https://github.com/open-telemetry/opentelemetry-js/issues/3104#issuecomment-1193641624.

I'm tagging this as https://github.com/open-telemetry/opentelemetry-js/labels/priority%3Ap2 as it causes correctness issues.

legendecas avatar Oct 13 '22 02:10 legendecas

@haddasbronfman I've submitted a fix at #3407. Would you mind verifying if it meets your use case? Thank you.

legendecas avatar Nov 14 '22 08:11 legendecas

Hi, I'm no longer working on this so I can't answer for sure. thanks for your solution!

haddasbronfman avatar Nov 22 '22 06:11 haddasbronfman

@legendecas @haddasbronfman @dyladan Hey folks, I've been trying to capture the body by using responseHook. However not able to make this work, any inputs/directions please ?

 responseHook: (span, response) => {
          response.on('data', (chunk) => {
            if(!response['myBody']){
              response['myBody'] = [];
              response['myBody'].push(chunk);
            }
            else{
              response['myBody'].push(chunk);
            }
          });
          response.on('end', () => {
            const responseBody = Buffer.concat(response['myBody']).toString();
            console.log('respnse body is', responseBody);
            console.log('response headers are', response.headers);
            span.setAttribute('http.response.body',responseBody);
            span.setAttributes('http.response.headers', response.headers);

          });

gmann42 avatar Jan 27 '23 15:01 gmann42

@gmann42 response is a writable and writable emit finish when the write has completed.

legendecas avatar Jan 31 '23 03:01 legendecas

@gmann42 response is a writable and writable emit finish when the write has completed.

@legendecas Thanks, I don't quite follow will look at docs.

But I've managed to do this one level above by using opentelemetry-express. Patched the send method of response object from express to add the body as a span attribute.

gmann42 avatar Jan 31 '23 12:01 gmann42

@haddasbronfman I've submitted a fix at #3407. Would you mind verifying if it meets your use case? Thank you.

I confirm the following works fine:

const responseHook = async (
  span: api.Span,
  response: IncomingMessage | ServerResponse,
) => {
  let body = '';
  response.on('data', (chunk: Buffer) => {
    body += chunk.toString();
  });
  response.on('end', () => {
    span.setAttribute('http.response.body', body);
    response.removeAllListeners();
  });
};

u11d-michal-miler avatar Apr 14 '23 07:04 u11d-michal-miler

@u11d-michal-miler I no longer use this use case.

haddasbronfman avatar Apr 16 '23 07:04 haddasbronfman

@legendecas thank you so much for the fix! I'm sorry to bump an old thread here—

@u11d-michal-miler's code example works, but only for non compressed responses. When we have a gzip/deflate/br compressed response, we see gibberish in the span. Here's an example:

const { HttpInstrumentation } = require("@opentelemetry/instrumentation-http");
const { NodeTracerProvider } = require("@opentelemetry/sdk-trace-node");
const { registerInstrumentations } = require("@opentelemetry/instrumentation");
const {
  ConsoleSpanExporter,
  BatchSpanProcessor,
} = require("@opentelemetry/sdk-trace-base");

registerInstrumentations({
  instrumentations: [new HttpInstrumentation({
    responseHook: async (
      span,
      response
    ) => {
      if (!response.headers) return;

      const contentEncoding = response.headers['content-encoding'];
      span.setAttribute('http.response.content_encoding', contentEncoding);

      let body = '';
      response.on('data', (chunk) => {
        body += chunk.toString();
      });
      response.on('end', () => {
        span.setAttribute('http.response.body', body);
        response.removeAllListeners();
      });
    },
  })],
});

const provider = new NodeTracerProvider();
const exporter = new ConsoleSpanExporter();
const processor = new BatchSpanProcessor(exporter);
provider.addSpanProcessor(processor);
provider.register();

const http = require("http");
const axios = require('axios');

const server = http.createServer(async (req, res) => {
  console.log("Request started")
  // No compression
  await axios.get(
    `https://api.github.com/users/defunkt`,
    {
      headers: {
        'Accept-Encoding': 'identity',
        Accept: 'application/vnd.github+json',
      },
    }
  );

  // Gzip compressed response
  await axios.get(
    `https://api.github.com/users/defunkt`,
  );

  res.end("Success");
});

server.listen(3000, () => {
  console.log("Server running at http://localhost:3000");
});

Here we have a server with two requests, one that has a gzip'd response and one that doesn't. Here are the resulting spans:

{
  traceId: 'af939f97f5cc6ed91562669ecf1fa45f',
  parentId: '528bf06e369592df',
  traceState: undefined,
  name: 'GET',
  id: 'aa2ecd5ba2654566',
  kind: 2,
  timestamp: 1688264149851000,
  duration: 95547,
  attributes: {
    'http.url': 'https://api.github.com/users/defunkt',
    'http.method': 'GET',
    'http.target': '/users/defunkt',
    'net.peer.name': 'api.github.com',
    'http.host': 'api.github.com:443',
    'net.peer.ip': '140.82.113.5',
    'net.peer.port': 443,
    'http.status_code': 200,
    'http.status_text': 'OK',
    'http.flavor': '1.1',
    'net.transport': 'ip_tcp',
    'http.response.content_encoding': 'gzip',
    'http.response.body': '\x1F�\b\x00\x00\x00\x00\x00\x00\x03���n�@\x14E\x7F����\x01��1R�E�颕"%m��5�\x18�\x1Df��\x03�E����/�\x13z\x07p�zQ�����}�ǻ�L�R\x19��B\x1EZ�؂���ɂ\x19[�}�f\x1F��m>\x7F���o\x1Fnq_t��۷N�^E������_���6k�t�5$\r]��-O�v�k����A\x19\x07\x172��$F\x0E:��X���\x17�džC�K��jm�`/]�G�?Cp5^+S�\x17\x00�sK�Ĉ`�)<��4��\x00�<��\x15\x04\t��;Y̱3!0s4��s\'\x1B;h��ϝjHY3��? ��+�Q?�l!�\x1E|04��\x00\x00�\x1D6k\x169\x12=o��D~\n' +
      "cp2���L�]�\x10�S#��\x0Fx�a\x8A�^\x14uH�Ah/\x11'Q��w�S��\x17a<9A\x15j���0'��V�\x05�\x10�)W�U\x1Eʏ��aˁh�\x0F3?3�\x16\n" +
      'a\x1C\x05*��4�M��B���ͣ"�6st4\x165m�U�\x1F���v���a\x15�AخΩ@��?��ܼ�I8Y�aЛ0RA�D�v\x19G�$����d�ƻGxh��uM�ZF�e��O�t�M�7���/͗u�\x04\x00\x00'
  },
  status: { code: 0 },
  events: [],
  links: []
}
{
  traceId: '301574b2815191e189fc0e714511029b',
  parentId: '28c2684141a1cf96',
  traceState: undefined,
  name: 'GET',
  id: 'bd6658905463b6de',
  kind: 2,
  timestamp: 1688264149932000,
  duration: 92838,
  attributes: {
    'http.url': 'https://api.github.com/users/defunkt',
    'http.method': 'GET',
    'http.target': '/users/defunkt',
    'net.peer.name': 'api.github.com',
    'http.host': 'api.github.com:443',
    'net.peer.ip': '140.82.112.6',
    'net.peer.port': 443,
    'http.response_content_length': 485,
    'http.status_code': 200,
    'http.status_text': 'OK',
    'http.flavor': '1.1',
    'net.transport': 'ip_tcp',
    'http.response.content_encoding': 'gzip',
    'http.response.body': '\x1F�\b\x00\x00\x00\x00\x00\x00\x03���n�@\x14E\x7F����\x01��1R�E�颕"%m��5�\x18�\x1Df��\x03�E����/�\x13z\x07p�zQ�����}�ǻ�L�R\x19��B\x1EZ�؂���ɂ\x19[�}�f\x1F��m>\x7F���o\x1Fnq_t��۷N�^E������_���6k�t�5$\r]��-O�v�k����A\x19\x07\x172��$F\x0E:��X���\x17�džC�K��jm�`/]�G�?Cp5^+S�\x17\x00�sK�Ĉ`�)<��4��\x00�<��\x15\x04\t��;Y̱3!0s4��s\'\x1B;h��ϝjHY3��? ��+�Q?�l!�\x1E|04��\x00\x00�\x1D6k\x169\x12=o��D~\n' +
      "cp2���L�]�\x10�S#��\x0Fx�a\x8A�^\x14uH�Ah/\x11'Q��w�S��\x17a<9A\x15j���0'��V�\x05�\x10�)W�U\x1Eʏ��aˁh�\x0F3?3�\x16\n" +
      'a\x1C\x05*��4�M��B���ͣ"�6st4\x165m�U�\x1F���v���a\x15�AخΩ@��?��ܼ�I8Y�aЛ0RA�D�v\x19G�$����d�ƻGxh��uM�ZF�e��O�t�M�7���/͗u�\x04\x00\x00'
  },
  status: { code: 0 },
  events: [],
  links: []
}

I tried updating the response hook like so, but it still doesn't work. My guess is that there is a race condition where we are not seeing all the chunks by the time the response closes.

responseHook: async (
      span,
      response
    ) => {
      const encoding = response.headers['content-encoding'];
      span.setAttribute('http.response.content_encoding', encoding);

      let stream = response;

      // Handle different types of encoding.
      if (['gzip', 'deflate', 'br'].includes(encoding)) {
        const decompress = encoding === 'gzip' ? zlib.createGunzip() :
                          encoding === 'deflate' ? zlib.createInflate() :
                          zlib.createBrotliDecompress();
        
        await pipelineAsync(response, decompress);
        stream = decompress;
      }
      const chunks = [];
      for await (const chunk of stream) {
        chunks.push(chunk.toString());
      }

      const body = chunks.join('');
      span.setAttribute('http.response.body', body);
    },
  })],

Here's a reproduction: HTTP response body reproduction

helloworld avatar Jul 02 '23 02:07 helloworld

@helloworld I don't think it is helpful to decompress the server response to collect span attributes. It is computational costing and should be avoided. I'd suggest hooking a server framework instrumentation instead.

legendecas avatar Jul 07 '23 09:07 legendecas

@legendecas do you have any pointers to server framework instrumentation I should be looking at? Do you mean express instrumentation, and such?

helloworld avatar Jul 07 '23 17:07 helloworld

I believe I'm encountering the same issue with try to get the response added as a custom attribute.

The requestHook is working fine for me:

new HttpInstrumentation({
      requestHook: (span, req) => {
        let body = "";
        req.on("data", (chunk) => {
          body += chunk;
        });
        req.on("end", (chunk) => {
          span.setAttribute("http.request.body", body);
        });
      }
      // responseHook: async (span, response) => {
      // console.log(response.req);
      // }
    }),

But trying the same thing for the responseHook, the response callback functions don't execute:

responseHook: (span, res) => {
        console.log("This is run");
        res.on("data", (chunk) => {
          console.log("not run");
        });
        res.on("end", (chunk) => {
          console.log("not run");
        });
      }

emilys314 avatar Nov 20 '23 21:11 emilys314