Fetch failed with UND_ERR_CONNECT_TIMEOUT error on Next.js serverless function on Vercel Production env
TL;DR: When executing a fetch request from a serverless function, it sometimes fails with UND_ERR_CONNECT_TIMEOUT in a Next.js production environment hosted on Vercel.
I currently have a Next.js site (v14.1.4 - Pages router) running on the Vercel platform (Node v20x) that performs fetch requests to the Slack API from an API route through a serverless function. This is a breakdown of what happens before the error:
- From a client-side component, start a fetch POST request to an API endpoint (route handler) on form submission;
- In the API serverless function, make another fetch POST request to an external API (in my case the Slack message API, "https://slack.com/api/chat.postMessage");
- In a production environment hosted on Vercel, around 70% of the requests to Slack succeed while the other 30% fail with a 500 server error and the code "UND_ERR_CONNECT_TIMEOUT".
Full error message:
Unhandled Rejection: TypeError: fetch failed
at node:internal/deps/undici/undici:12345:11
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
cause: ConnectTimeoutError: Connect Timeout Error
at onConnectTimeout (node:internal/deps/undici/undici:7492:28)
at node:internal/deps/undici/undici:7448:50
at Immediate._onImmediate (node:internal/deps/undici/undici:7480:13)
at process.processImmediate (node:internal/timers:478:21)
at process.callbackTrampoline (node:internal/async_hooks:130:17) {
code: 'UND_ERR_CONNECT_TIMEOUT'
}
}
Node.js process exited with exit status: 128. The logs above can help with debugging the issue.
IMPORTANT NOTE: I've seen this issue happening in both pages and app routers but more frequently on the pages router. Also, it only happens on production environments hosted on Vercel. I could not (personally) reproduce from my local server nor dev environments.
Reproducible By
This CodeSandbox contains an example of the code used to trigger the fetch requests, from the input form submission to the serverless function API call. https://codesandbox.io/p/devbox/gifted-shirley-mzlvgy?file=%2Fapp%2Fapi%2Froute.js%3A1%2C1-39%2C1
Expected Behaviour
Currently some external API calls from a serverless function are returning undici unhandled fetch errors. Expected behaviour is no errors being returned and API call succeeding every time.
Environment
Operating System: Vercel Servers
Binaries:
- Node: v20x (default vercel v20 setting)
- npm: 10.2.3
- yarn: 1.22.19
- build command: yarn build
Relevant Packages:
- next: 14.1.4
- react: 18.2.0
- react-dom: 18.2.0
Additional Context
Additional information about the issue and more cases can be found at: https://github.com/vercel/next.js/discussions/57384 https://github.com/vercel/next.js/issues/66373
I am also getting this error, but in my project it happens at build time. It only happens on Vercel; there are no issues locally. The API response times are also not unreasonable, i.e. 2 seconds per API call at most (measured locally). I'm not sure how I can measure them on Vercel at build time.
Nextjs version is 13.1.6 and I've tried Node 18x and 20x + node_options
Seems I'm encountering the same thing - no issues locally - timeouts on Vercel.
Running next 14.2.3
Which external APIs are you trying to fetch specifically and which method are you using? @zackproser @kpratik2015
@andremendonca03 using internal API, no third party API. Our API is GraphQL but it used to work without any issue in past on vercel.
@andremendonca03 I was able to resolve my problem by setting --no-experimental-fetch in the NODE_OPTIONS environment variable on Vercel. It also gave me a better error log, which led me to increase the timeout in next.config.js with staticPageGenerationTimeout: 1000.
I'm having the same issue, only on prod, and only sporadically. Is there a fix?
Similar sporadic issue; only on vercel, and never able to reproduce locally:
Node.js process exited with exit status: 128. The logs above can help with debugging the issue.
Previously happened when my db (supabase) calls a Vercel endpoint, which could take longer than 5s (max timeout on the webhook). My guess was that supabase closed the connection before the serverless function could fully execute.
Now it's happening again, and I'm 100% sure the call is taking less than 5 seconds. We should consider reaching out to support at this point
@kpratik2015 if you set --no-experimental-fetch, do you also need to import node-fetch?
We're having the same problem (calling external API from route handler causes undici timeout) and reached out to support. This just started happening by itself and IMO has something to do with Vercel networking or NextJS node version.
No, I've made no code changes (apart from the config change I mentioned), and we continue to simply use fetch with Node v20 set in Vercel.
@PratikKataria-plivo how do you set the env variable on vercel? Is it just a new variable name NODE_OPTIONS and value --no-experimental-fetch?
Yup, in project settings -> Environment Variables
I am also getting a bunch of undici errors as well recently. These are the 3 main ones
"next": "^14.2.3", node v20.9.0
I also tried setting vercel env variable: NODE_OPTIONS=--dns-result-order=ipv4first but it has not solved the issue
`TypeError: fetch failed
at node:internal/deps/undici/undici:12618:11
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
cause: Error: connect ETIMEDOUT 76.76.21.241:443
at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1555:16)
at TCPConnectWrap.callbackTrampoline (node:internal/async_hooks:128:17) {
errno: -110,
code: 'ETIMEDOUT',
syscall: 'connect',
address: '76.76.21.241',
port: 443
}`
`TypeError: fetch failed
at node:internal/deps/undici/undici:12618:11
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
cause: ConnectTimeoutError: Connect Timeout Error
at onConnectTimeout (node:internal/deps/undici/undici:7760:28)
at node:internal/deps/undici/undici:7716:50
at Immediate._onImmediate (node:internal/deps/undici/undici:7748:13)
at process.processImmediate (node:internal/timers:476:21)
at process.callbackTrampoline (node:internal/async_hooks:128:17) {
code: 'UND_ERR_CONNECT_TIMEOUT'
}
}`
`TypeError: fetch failed
at node:internal/deps/undici/undici:12618:11
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
cause: [Error: C0AFB780CE7F0000:error:0A00010B:SSL routines:ssl3_get_record:wrong version number:ssl/record/ssl3_record.c:355:
] {
library: 'SSL routines',
reason: 'wrong version number',
code: 'ERR_SSL_WRONG_VERSION_NUMBER'
}
}`
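For anyone trying to tell these failures apart in their own logs, here is a minimal sketch that reads the `cause.code` field visible in the traces above. The helper names (`getFetchCauseCode`, `isTransientConnectError`) are made up for illustration, and treating only the two timeout codes as "transient" is an assumption on my part, not something confirmed in this thread.

```typescript
// Codes copied from the traces above. Which of them are safe to treat as
// transient is an assumption; ERR_SSL_WRONG_VERSION_NUMBER is left out.
const TRANSIENT_CODES = new Set(["ETIMEDOUT", "UND_ERR_CONNECT_TIMEOUT"]);

// Hypothetical helper: pull `cause.code` out of a "fetch failed" TypeError.
function getFetchCauseCode(error: unknown): string | undefined {
  if (!(error instanceof Error)) return undefined;
  const cause: unknown = (error as Error & { cause?: unknown }).cause;
  if (typeof cause === "object" && cause !== null && "code" in cause) {
    const code = (cause as { code?: unknown }).code;
    return typeof code === "string" ? code : undefined;
  }
  return undefined;
}

// Hypothetical helper: true for the connect-level timeouts seen in this thread.
function isTransientConnectError(error: unknown): boolean {
  const code = getFetchCauseCode(error);
  return code !== undefined && TRANSIENT_CODES.has(code);
}
```

Wrapping the `fetch()` call in try/catch and logging `getFetchCauseCode(error)` should also keep the failure from surfacing as an unhandled rejection.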
I got a response from Vercel support, here's an excerpt:
Looking at your Runtime logs, I see both Edge Functions and Serverless Functions experience 504 issues.
Without going into details, I can confirm the two runtimes use different providers, so it's unlikely to be a Vercel platform issue. I also couldn't find similar reports from other customers, which would indicate that the issue may be with your backend.
Looking at your (redacted) Serverless Function, over the past 24 hours, there were over 1000 successful invocations and 94 timeouts.
Since the issue is intermittent and this function typically resolves within 1 second, you may be able to work around the issue by implementing a retry strategy where you abort and retry the POST request if a response from your backend hasn't been received in 2 seconds.
You can also try to implement the different workarounds suggested in the Github issues below:
https://github.com/vercel/vercel/issues/11692#issuecomment-2152859828 https://github.com/vercel/next.js/issues/66373#issuecomment-2148546390
They are basically looping back into this thread :) I tried both flags: --no-experimental-fetch and --dns-result-order=ipv4first but with the first the build fails and second doesn't seem to do anything.
I have had the same result with both of those flags
What build error are you guys getting with --no-experimental-fetch ? Because only after this flag I was able to get sensible error which pointed me to https://nextjs.org/docs/messages/page-data-collection-timeout which mentions: Increase the timeout by changing the config.staticPageGenerationTimeout configuration option (default 60 in seconds).
I'm getting a build error: headers not found
Same here:
I wonder if this may be because we're on older Next/Node version? Running [email protected] and node 18.x
This is what happens when I compile with --no-experimental-fetch
[08:40:08.082] unhandledRejection ReferenceError: Headers is not defined
[08:40:08.083] at Object.<anonymous> (/vercel/path0/node_modules/next/dist/server/web/spec-extension/adapters/headers.js:32:30)
[08:40:08.083] at Module._compile (node:internal/modules/cjs/loader:1369:14)
[08:40:08.083] at Module._extensions..js (node:internal/modules/cjs/loader:1427:10)
[08:40:08.083] at Module.load (node:internal/modules/cjs/loader:1206:32)
[08:40:08.083] at Module._load (node:internal/modules/cjs/loader:1022:12)
[08:40:08.083] at Module.require (node:internal/modules/cjs/loader:1231:19)
[08:40:08.083] at mod.require (/vercel/path0/node_modules/next/dist/server/require-hook.js:65:28)
[08:40:08.083] at require (node:internal/modules/helpers:179:18)
[08:40:08.083] at Object.<anonymous> (/vercel/path0/node_modules/next/dist/server/api-utils/index.js:67:18)
[08:40:08.083] at Module._compile (node:internal/modules/cjs/loader:1369:14)
[08:40:08.103] Error: Command "npm run build" exited with 1
[08:40:08.791]
I am using Next.js version: 14.2.3 and Node 20.x is enabled in vercel
Mine was Nextjs 13.x and I set Nodejs 20 in Vercel settings. So try Node v20. Similar I found here https://stackoverflow.com/questions/77594693/unhandledrejection-referenceerror-headers-is-not-defined-building-next-js-14 Or check node-fetch doc for clues https://stackoverflow.com/a/65766305
@rafalzawadzki Do you have a reference snippet to abort and retry the fetch POST after 2 seconds? This is not a solution, since the request can still fail twice, but it could reduce the number of errors.
the link provided by Vercel: https://developer.mozilla.org/en-US/docs/Web/API/AbortController/abort (it's a standard Web API feature)
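Since a snippet was asked for, here is a rough sketch of the abort-and-retry idea using AbortController. The function name `fetchWithTimeoutRetry`, its option names, and the default of 2 retries are all assumptions of mine; only the 2-second timeout comes from the support reply. The `fetchImpl` option exists just so the helper can be exercised without a network.

```typescript
interface RetryOptions {
  timeoutMs?: number; // abort an attempt after this long (2 s per the support reply)
  retries?: number; // extra attempts after the first one (assumed default)
  fetchImpl?: typeof fetch; // injectable for testing; defaults to global fetch
}

// Hypothetical helper: abort each attempt after `timeoutMs`, then try again.
// Note: any `signal` passed in `init` is replaced by the per-attempt signal.
async function fetchWithTimeoutRetry(
  url: string,
  init: RequestInit = {},
  { timeoutMs = 2000, retries = 2, fetchImpl = fetch }: RetryOptions = {},
): Promise<Response> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      // Resolves once response headers arrive; reading the body is not timed.
      return await fetchImpl(url, { ...init, signal: controller.signal });
    } catch (error) {
      lastError = error; // aborted or failed to connect: fall through and retry
    } finally {
      clearTimeout(timer);
    }
  }
  throw lastError;
}
```

One caveat: retrying a POST can duplicate side effects (for example, a message posted twice) if the first attempt actually reached the server before being aborted, so this is only reasonable when the endpoint tolerates that.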
@kpratik2015 however hard I try I can't seem to be able to use --no-experimental-fetch flag - it just results in all sorts of build errors. I tried with different node and next versions, in prod and locally - no go.
sorry to doubt you, but have you made sure to re-deploy your project after adding this env variable to make sure it's used?
also not sure if it's a lead at all, but apparently AWS Lambda recently rolled out an upgrade to Node version with a breaking undici change that causes bugs in Netlify, Vercel, Sentry etc: https://github.com/nodejs/node/issues/53186. I wonder if that may have something to do with our issue
Hi, we're looking at the issue but don't have any updates just yet. Any minimal reproduction that doesn't depend on hitting an API behind authentication could help a lot 🙏
https://github.com/nodejs/node/issues/53186 is very likely unrelated to this issue, since it's only about Node.js 20 while this current discussion is affecting both Node.js 18 and Node.js 20.
@rafalzawadzki Ya I did. It's a pretty old project and we only use a GraphQL API, so maybe I am not running into problems with other variations of fetch usage. Also, each API response in successful builds shows a max. response time of under 1 second. Initially I tried all possible values of --dns-result-order but that didn't help. And locally everything works without any change. Only in the Vercel environment was it failing every time.
If it helps, this is the only way we are using fetch in getStaticProps:
type StaticFetchParams<V> = {
  query: string;
  variables?: V;
};
export const staticFetch = async <V, TData>(
  params: StaticFetchParams<V>,
  headers?: HeadersInit
) => {
  const response = await fetch(process.env.NEXT_PUBLIC_REACT_APP_API, {
    method: "POST",
    headers: headers || {
      "Content-Type": "application/json",
    },
    body: JSON.stringify(params),
  });
  return response.json();
};
So this issue comes and goes for us.... It started up again last week, which seems to match this thread. Here is a comparison of the number of UND_ERR_CONNECT_TIMEOUT's we've gotten vs the general activity of our platform.
This is in production, on Vercel. These seem to mostly be internal calls, like GET /api/auth/session as well as webhooks.
I'm experiencing this same issue on a brand new app that I imported into Vercel this morning. It fails to make a fetch to the QStash API (which is authenticated) in order to push a task. There was about 30 minutes where it started working. But outside of that, has not worked. The fetch is being made from a server component.
Next 14.2.3 Node 20.x
Experienced same issue. Used webhook, which times out in 1s. In my case when the wh closes the connection, I get the error.
Some updates:
- We landed a network improvement for builds to mitigate timeout errors at build time
- We believe that the issue at runtime is caused by a race condition regarding keep-alive settings in Undici, the library used by Node.js to provide the fetch() method. You can add a connection: 'close' header to your fetch() calls to disable keep-alive, which should help mitigate this issue
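As a sketch of the header suggestion above, something like the following could be used to add `connection: close` to existing calls without touching their other options. The helper name `withConnectionClose` is hypothetical, and the usage comment reuses the Slack URL from the original report with the payload left out.

```typescript
// Hypothetical helper: copy the caller's options and add `connection: close`
// so the connection is not kept alive after the request.
function withConnectionClose(init: RequestInit = {}): RequestInit {
  const headers = new Headers(init.headers); // preserves any existing headers
  headers.set("connection", "close");
  return { ...init, headers };
}

// Usage sketch (payload omitted):
// await fetch(
//   "https://slack.com/api/chat.postMessage",
//   withConnectionClose({ method: "POST", body }),
// );
```

This only changes how each request is sent; whether it removes the timeouts entirely is not confirmed in this thread.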
I successfully resolved the issue by configuring the undici global dispatcher in the root layout.
My application is using axios to send requests. To my knowledge, axios does not send a keepAlive so I'm not sure the recommendation helps. In our vercel logs we are seeing:
We are on Node 18.
Unhandled Rejection: TypeError: fetch failed
at node:internal/deps/undici/undici:12618:11
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async h.send (/var/task/apps/frontend/.next/server/chunks/1542.js:1:73006) {
cause: ConnectTimeoutError: Connect Timeout Error
at onConnectTimeout (node:internal/deps/undici/undici:7760:28)
at node:internal/deps/undici/undici:7716:50
at Immediate._onImmediate (node:internal/deps/undici/undici:7748:13)
at process.processImmediate (node:internal/timers:476:21)
at process.topLevelDomainCallback (node:domain:161:15)
at process.callbackTrampoline (node:internal/async_hooks:126:24) {
code: 'UND_ERR_CONNECT_TIMEOUT'
}
}
Node.js process exited with exit status: 128. The logs above can help with debugging the issue.
I'm getting more and more of these errors. Also using axios for requests. Can someone at vercel please prioritize this?