opentelemetry-sdk-workers

`Error [OTLPExporterError]: Unknown error TypeError: fetch failed`

Open • jfsiii opened this issue 2 years ago • 7 comments

Steps to reproduce

See https://stackblitz.com/edit/cloudflare-templates-gdw8ku?file=src%2Findex.ts

Given

    const otelSdk = new WorkersSDK(request, ctx, {
      service: 'some-service-name',
      endpoint: 'https://httpbin.org/status/200',
    });

    try {
      console.log('fetching...');
      const upstreamResponse = await otelSdk.fetch('https://httpbin.org/uuid');
      console.log('fetched. now sending');
      return otelSdk.sendResponse(upstreamResponse);
    } catch (ex) {
      console.log('ERROR fetching', ex);
      otelSdk.captureException(ex);
    }

logs

    fetching...
    fetched. now sending
    Failed to flush spans: [
      Error [OTLPExporterError]: Unknown error TypeError: fetch failed
          at new OTLPExporterError2 (/tmp/tmp-7-WtdYhEgYO60S/index.js:3735:24)
          at eval (/tmp/tmp-7-WtdYhEgYO60S/index.js:4050:13) {
        data: undefined,
        code: undefined
      }
    ]

but with this change there are no errors logged.

- endpoint: 'https://httpbin.org/status/200',
+ endpoint: 'https://httpbin.org/anything',

I'm not sure what the difference is between those two URLs. In both cases, response.ok is true.

I initially ran into this with an endpoint that was returning a 400, but then found this case which seems to trigger it even on a success.

It seems like there may be an error in the error handling/reporting itself. Perhaps an import isn't working as expected and a function's not available?

jfsiii • Nov 21 '22 20:11

I'm not sure why either to be honest. We're doing a pretty bog standard fetch: https://github.com/RichiCoder1/opentelemetry-sdk-workers/blob/main/packages/opentelemetry-sdk-workers/src/exporters/OTLPCloudflareExporterBase.ts#L103.

I could try logging out more details about the specific error to maybe surface the specific issue.

RichiCoder1 • Nov 21 '22 20:11

Yeah, it seemed straightforward. I think I'm misinterpreting the error message, so this is more feedback than a bug report.

I added some logging to the bundled worker in the Quick Edit panel. Here's the success case

[Screenshot: Screen Shot 2022-11-22 at 4 37 53 PM]

and the error case

[Screenshot: Screen Shot 2022-11-22 at 4 39 12 PM]

The error is from the POST to https://httpbin.org/status/200/v1/traces which 404s. https://httpbin.org/anything/v1/traces is valid.

The "unknown error" and TypeError threw me off. Perhaps there's a clearer way to state that the request to the OTLP endpoint failed. Perhaps I (also) was being a bit dense.

jfsiii • Nov 22 '22 21:11
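
The 404 described above follows from how the export URL is formed: according to the logging in this thread, the traces path `/v1/traces` is appended to the configured `endpoint`. Below is a minimal sketch of that join. `buildTracesUrl` and `TRACES_PATH` are hypothetical names used for illustration, not part of opentelemetry-sdk-workers.

```typescript
// Hypothetical helper, not the library's actual code: joins a configured
// endpoint with the traces path observed in this thread.
const TRACES_PATH = "v1/traces";

function buildTracesUrl(endpoint: string): string {
  // Drop trailing slashes so the result never contains "//v1/traces".
  const base = endpoint.replace(/\/+$/, "");
  return `${base}/${TRACES_PATH}`;
}

// Reported in the thread as returning 404:
console.log(buildTracesUrl("https://httpbin.org/status/200"));
// Reported in the thread as valid:
console.log(buildTracesUrl("https://httpbin.org/anything"));
```

Printing the derived URL before exporting makes it easier to see that the request goes to a different path than the one written in the config.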

It's not a full fix since fetch is kind of opaque, but I'm updating the error message to at least hint it might be an export url issue.

RichiCoder1 • Feb 06 '23 22:02
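
One way such a hint could look is sketched below, assuming the exporter can tell an HTTP error status apart from a rejected `fetch`. The `ExportFailure` type, the `describeExportFailure` function, and the message wording are all hypothetical; they are not the change that was made to the library.

```typescript
// Hypothetical sketch: names and wording are illustrative, not the library's.
type ExportFailure =
  | { kind: "http"; status: number } // collector answered with a non-2xx status
  | { kind: "network"; cause: unknown }; // fetch itself rejected

function describeExportFailure(url: string, failure: ExportFailure): string {
  if (failure.kind === "http") {
    return `Failed to export spans to ${url}: HTTP ${failure.status}. Check that the export URL is correct.`;
  }
  // fetch rejections carry little detail, so surface whatever message exists.
  const reason =
    failure.cause instanceof Error ? failure.cause.message : String(failure.cause);
  return `Failed to export spans to ${url}: ${reason}. Check that the export URL is reachable.`;
}

console.log(
  describeExportFailure("https://httpbin.org/status/200/v1/traces", {
    kind: "http",
    status: 404,
  }),
);
```

Including the full export URL in the message is the main point of the design: it exposes the appended path, which is what was hard to spot in this issue.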

Not directly related, but for those encountering this error, a couple of things to note while debugging:

  • If you host your own collector endpoint, you can’t target the IP address directly; you need to add a DNS A record.
  • If you host your own collector endpoint and access it via non-standard ports outside your zone, the request will not go through. I had to remap the port to expose it on 80/443.

Schachte • Mar 27 '23 06:03
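
The two points above can be turned into a quick pre-flight check on the configured endpoint. This is only a sketch under the assumptions stated in that comment (no bare IP addresses, only ports 80/443); `checkCollectorEndpoint` is a hypothetical helper, and the rules reflect what was reported in this thread rather than an authoritative statement of Cloudflare's behavior.

```typescript
// Hypothetical pre-flight check; both rules come from the comment above.
function checkCollectorEndpoint(endpoint: string): string[] {
  const warnings: string[] = [];
  const url = new URL(endpoint);

  // The URL parser keeps brackets around IPv6 literals, e.g. "[::1]".
  const isIpv4 = /^\d{1,3}(\.\d{1,3}){3}$/.test(url.hostname);
  const isIpv6 = url.hostname.startsWith("[");
  if (isIpv4 || isIpv6) {
    warnings.push("Endpoint targets an IP address; use a hostname with a DNS A record.");
  }

  // url.port is "" when the port is the scheme default (80 for http, 443 for https).
  if (url.port !== "" && url.port !== "80" && url.port !== "443") {
    warnings.push(`Endpoint uses non-standard port ${url.port}; expose the collector on 80 or 443.`);
  }
  return warnings;
}

for (const warning of checkCollectorEndpoint("https://10.0.0.1:4318")) {
  console.warn(warning);
}
```

Returning a list of warnings instead of throwing keeps the check advisory, which fits a case where the platform rules may change over time.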

@Schachte I struggled to debug the port issue on a different project as well, but it is documented at https://developers.cloudflare.com/workers/platform/known-issues/#custom-ports

jfsiii • Mar 27 '23 11:03

> @Schachte I struggled to debug the port issue on a different project as well, but it is documented at developers.cloudflare.com/workers/platform/known-issues/#custom-ports

Wow, I actually did not know that! I'll see about adding that to the readme, that seems easy to trip over.

RichiCoder1 • Mar 27 '23 13:03

Yeah, here's where I joined the group who have been surprised by this https://github.com/cloudflare/cloudflare-docs/issues/4299#issuecomment-1118585554.

At least it's documented now 🙃. I believe wrangler also logs an error now.

jfsiii • Mar 27 '23 14:03