opentelemetry-sdk-workers
`Error [OTLPExporterError]: Unknown error TypeError: fetch failed`
Steps to reproduce
See https://stackblitz.com/edit/cloudflare-templates-gdw8ku?file=src%2Findex.ts
Given
const otelSdk = new WorkersSDK(request, ctx, {
  service: 'some-service-name',
  endpoint: 'https://httpbin.org/status/200',
});
try {
  console.log('fetching...');
  const upstreamResponse = await otelSdk.fetch('https://httpbin.org/uuid');
  console.log('fetched. now sending');
  return otelSdk.sendResponse(upstreamResponse);
} catch (ex) {
  console.log('ERROR fetching', ex);
  otelSdk.captureException(ex);
}
logs
fetching...
fetched. now sending
Failed to flush spans: [
Error [OTLPExporterError]: Unknown error TypeError: fetch failed
at new OTLPExporterError2 (/tmp/tmp-7-WtdYhEgYO60S/index.js:3735:24)
at eval (/tmp/tmp-7-WtdYhEgYO60S/index.js:4050:13) {
data: undefined,
code: undefined
}
]
but with this change there are no errors logged.
- endpoint: 'https://httpbin.org/status/200',
+ endpoint: 'https://httpbin.org/anything',
I'm not sure what the difference is between those two URLs. In both cases, `response.ok` is `true`.
I initially ran into this with an endpoint that was returning a 400, but then found this case which seems to trigger it even on a success.
It seems like there may be an error in the error handling/reporting itself. Perhaps an import isn't working as expected and a function's not available?
I'm not sure why either to be honest. We're doing a pretty bog standard fetch: https://github.com/RichiCoder1/opentelemetry-sdk-workers/blob/main/packages/opentelemetry-sdk-workers/src/exporters/OTLPCloudflareExporterBase.ts#L103.
I could try logging out more details about the specific error to maybe surface the specific issue.
Yeah, it seemed straightforward. I think I'm misinterpreting the error message so I think this is more feedback than bug report.
I added some logging to the bundled worker in the Quick Edit panel. Here's the success case
[Screenshot: Screen Shot 2022-11-22 at 4 37 53 PM]
and the error case
[Screenshot: Screen Shot 2022-11-22 at 4 39 12 PM]
The error is from the POST to https://httpbin.org/status/200/v1/traces, which 404s. https://httpbin.org/anything/v1/traces is valid.
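To illustrate why the two endpoints behave differently, here is a minimal sketch of how a base endpoint and the traces path appear to be combined, judging from the URLs in the logs above. `buildTracesUrl` is a hypothetical helper written for this comment, not the SDK's actual code.

```typescript
// Hypothetical helper (not part of opentelemetry-sdk-workers): joins a
// configured base endpoint with the traces path seen in the logs above.
function buildTracesUrl(endpoint: string): string {
  // Drop trailing slashes so the result never contains "//v1/traces".
  const base = endpoint.replace(/\/+$/, "");
  return `${base}/v1/traces`;
}

// httpbin serves /status/200 but not /status/200/v1/traces, hence the 404.
console.log(buildTracesUrl("https://httpbin.org/status/200"));
// httpbin's /anything route accepts any sub-path, so this one succeeds.
console.log(buildTracesUrl("https://httpbin.org/anything"));
```

So the configured endpoint needs to be a base URL under which `/v1/traces` is served, not a URL that only responds successfully on its own.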
The "unknown error" and `TypeError` threw me off. Perhaps there's a clearer way to state that the request to the OTLP endpoint failed. Perhaps I (also) was being a bit dense.
It's not a full fix since fetch is kind of opaque, but I'm updating the error message to at least hint it might be an export url issue.
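As a rough sketch of what a more descriptive export error could look like (this is not the change that was actually made; `ExportError`, `exportSpans`, and `FetchLike` are names assumed here for illustration), the fetch call can be wrapped so the failing URL and, where available, the HTTP status end up in the message:

```typescript
// Minimal shapes so the sketch does not depend on a particular runtime's types.
type MinimalResponse = { ok: boolean; status: number };
type FetchLike = (
  url: string,
  init: { method: string; body: string }
) => Promise<MinimalResponse>;

// Hypothetical error type carrying the export URL and optional HTTP status.
class ExportError extends Error {
  url: string;
  status?: number;
  constructor(message: string, url: string, status?: number) {
    super(message);
    this.name = "ExportError";
    this.url = url;
    this.status = status;
  }
}

// Posts the payload and turns both failure modes into a message that names the URL.
async function exportSpans(fetchImpl: FetchLike, url: string, body: string): Promise<void> {
  let response: MinimalResponse;
  try {
    response = await fetchImpl(url, { method: "POST", body });
  } catch (cause) {
    // fetch itself rejected (DNS, TLS, blocked port, ...); details are often opaque.
    throw new ExportError(
      `Failed to reach OTLP endpoint ${url} (${String(cause)}). Check the export URL.`,
      url
    );
  }
  if (!response.ok) {
    // The request went through but the server rejected it, e.g. a 404 on /v1/traces.
    throw new ExportError(
      `OTLP endpoint ${url} responded with HTTP ${response.status}. Check the export URL.`,
      url,
      response.status
    );
  }
}
```

In a Worker, the global `fetch` would be passed as `fetchImpl`; taking it as a parameter just makes the sketch easy to exercise with a stub.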
Not directly related, but for those encountering this error, a couple of things to note while debugging:
- If you host your own collector endpoint, you can't target the IP address directly; you need to add a DNS A record.
- If you host your own collector endpoint and access it via a non-standard port outside your zone, the request will not go through. I had to remap the port to expose it on 80/443.
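The two points above can be turned into a quick sanity check on the configured endpoint. This is only a sketch based on the experience described in this comment, not an official list of Workers restrictions, and `checkCollectorEndpoint` is a hypothetical helper.

```typescript
// Hypothetical pre-flight check for a collector endpoint URL. It only encodes
// the two pitfalls mentioned above and makes no network requests.
function checkCollectorEndpoint(endpoint: string): string[] {
  const url = new URL(endpoint);
  const warnings: string[] = [];

  // IPv4 literals look like 203.0.113.10; IPv6 literals are wrapped in brackets.
  const isIpv4 = /^\d{1,3}(\.\d{1,3}){3}$/.test(url.hostname);
  const isIpv6 = url.hostname.startsWith("[");
  if (isIpv4 || isIpv6) {
    warnings.push("Endpoint uses an IP address; point a DNS A record at it instead.");
  }

  // URL.port is "" when the port is the default for the scheme.
  if (url.port !== "" && url.port !== "80" && url.port !== "443") {
    warnings.push(`Endpoint uses port ${url.port}; consider exposing it on 80 or 443.`);
  }
  return warnings;
}

console.log(checkCollectorEndpoint("https://203.0.113.10:4318"));
console.log(checkCollectorEndpoint("https://collector.example.com"));
```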
@Schachte I struggled to debug the port issue on a different project as well, but it is documented at https://developers.cloudflare.com/workers/platform/known-issues/#custom-ports
Wow, I actually did not know that! I'll see about adding that to the readme, that seems easy to trip over.
Yeah, here's where I joined the group who have been surprised by this https://github.com/cloudflare/cloudflare-docs/issues/4299#issuecomment-1118585554.
At least it's documented now 🙃. I believe `wrangler` also logs an error now.