
Improve "The script will never generate a response." debuggability

Open · hansottowirtz opened this issue 1 year ago · 20 comments

The "The script will never generate a response." error is generally very hard to debug; the only resource we could find on the internet is this blog post.

We recently had this issue thrown on rare occasions with the following code:

import { createClient } from "@urql/core"

// The client (and any promises it caches internally) lives at module
// scope, shared across all requests.
const client = createClient({ url: 'https://api.example.com/graphql' })

export default {
  async fetch(request) {
    const response = await client.query(someQuery, {}).toPromise();
    return new Response(JSON.stringify(response), {
      headers: { "content-type": "application/json" },
    });
  }
}

Under heavy load, the worker sometimes aborted with "The script will never generate a response.". Luckily, this was the only piece of code that could possibly store promises in a global context (to solve it we called createClient on a per-request basis, as sketched below), but I can imagine this would be very hard to debug in a more complicated worker that isn't so careful about keeping promises out of the global context.
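For reference, a minimal sketch of the per-request fix (same someQuery as above):

import { createClient } from "@urql/core"

export default {
  async fetch(request) {
    // Creating the client inside the handler ties any promises it
    // caches internally to this request's event, not the global scope.
    const client = createClient({ url: 'https://api.example.com/graphql' })
    const response = await client.query(someQuery, {}).toPromise();
    return new Response(JSON.stringify(response), {
      headers: { "content-type": "application/json" },
    });
  }
}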

Is there any way to make this error more debuggable?

hansottowirtz avatar Dec 06 '22 20:12 hansottowirtz

In https://github.com/Urigo/graphql-mesh/issues/4201, the issue is even harder to find because the library internally uses a global variable.

hansottowirtz avatar Dec 07 '22 11:12 hansottowirtz

Eventually we want to solve this kind of problem by having every event run in a separate global scope, but that will require some deep changes to V8 to do in a way that performs well, and we don't know exactly how soon we'll be able to build it. Before we do that, we'll introduce an alternative API that people can use for explicit cross-event caching when desired.
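(For reference, the existing Cache API already covers one narrow form of explicit cross-event caching, for Response objects; a minimal sketch, with a made-up origin URL:)

export default {
  async fetch(request, env, ctx) {
    const cache = caches.default;
    // Serve a response cached by an earlier event, if present.
    let response = await cache.match(request);
    if (!response) {
      response = await fetch("https://api.example.com/data"); // hypothetical origin
      // Store a copy for later events without blocking this response.
      ctx.waitUntil(cache.put(request, response.clone()));
    }
    return response;
  }
}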

In the shorter term, this is pretty hard for us to solve, because we don't really have enough visibility into the V8 microtask loop to know that the request has decided to wait on a promise originating from another concurrent request. All we know is that the request has no further I/O or timers scheduled, yet hasn't returned a result, so it appears permanently hung.

Maybe when we implement AsyncLocalContext-like support, we could leverage that? Not sure.

kentonv avatar Dec 09 '22 19:12 kentonv

Is there any progress on this? I'm getting this error consistently, starting with this commit.

iamwacko avatar May 04 '23 03:05 iamwacko

For reference, I think the following errors fall under the same category:

Cannot perform I/O on behalf of a different request. I/O objects (such as streams, request/response bodies, and others) created in the context of one request handler cannot be accessed from a different request's handler. This is a limitation of Cloudflare Workers which allows us to improve overall performance.

(at least there's a stack trace, making it possible to debug)

and

Promise will never complete.

hansottowirtz avatar May 08 '23 20:05 hansottowirtz

I'm here just to add that I'm struggling with this error as well. I do try to cache async data between events, as I expected that to be entirely possible. From what I understand, the runtime tracks, for every promise, which event the promise belongs to; resolving a promise from one request inside another request is not supported. This should be mentioned in the documentation imho.

ssured avatar Jun 09 '23 13:06 ssured

We're having this problem too. I have an endpoint which outputs data from my DB (via Hyperdrive). But if I call the same endpoint very soon after the first call, I get this error. The error is plainly false since it says the script "will never generate a response", even though it did the first time, just not the second.

PANstudio avatar Dec 09 '23 15:12 PANstudio

@PANstudio in the main fetch handler, have a look at FetchEvent.waitUntil. Make sure all side effects that need to run to completion are registered via waitUntil; otherwise any pending promises will be (silently) aborted.
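In module syntax, waitUntil lives on the ctx parameter; a minimal sketch (logToAnalytics is a made-up async side effect):

export default {
  async fetch(request, env, ctx) {
    // Registering the promise keeps the event alive until it settles;
    // without this, it is silently cancelled once the response returns.
    ctx.waitUntil(logToAnalytics(request)); // hypothetical helper
    return new Response("ok");
  }
}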

ssured avatar Dec 14 '23 13:12 ssured

Anyone know how to solve this issue when using WebSockets? Here is an example: https://github.com/grgbkr/cf-websocket-msg-limit

PizzaConsole avatar Dec 27 '23 17:12 PizzaConsole

Per @kentonv's earlier response, changing this behavior is difficult: we do not currently have the visibility into the microtask queue to know which promise is actually being waited on to generate the response, let alone whether anything still exists that could resolve that promise. We do have a mechanism in place now that allows waiting on promises across requests, but it is still entirely possible to store a promise at global scope that will never get resolved. Note also that this isn't strictly tied to the cross-request waiting issue. Consider the following example:

export default {
  fetch(req) {
    // `resolve` is never called, so this promise never settles.
    const { promise, resolve } = Promise.withResolvers();
    return promise;
  }
}

In this case we should also get the "The script will never generate a response" error, because the promise we are returning as the response will never be resolved.

When the call to fetch returns back to C++, we make an attempt to drain the promise microtask queue. That queue only contains tasks for promises that have been resolved or rejected. If, by the time we are done draining the queue, there are still outstanding promises that have not been resolved, those end up being canceled and dropped on the floor, because there is simply no way for us to drive them forward.

I guess my key question back to those of you who are dealing with debugging this issue is: what information could we provide that would make this easier to debug?

Is it knowing which promises are being waited on to actually generate the response? If so, that's not trivial: a single request can potentially generate thousands of promises, any one of which could end up being awaited... and by the time one is awaited, we've lost the context of where the promise was created or what created it. V8 provides a "promise hooks" API that would allow us to track every promise created, but it comes with a fairly steep performance penalty, which means we couldn't keep it always on.

jasnell avatar Jan 03 '24 00:01 jasnell

@jasnell what if we had a special mode where we collect a stack trace for every instantiated promise and store it in a property on the promise? We could theoretically follow the promise chain back from C++ to the promise object and dump the stack. Of course performance would be horrible, but it's better than nothing?
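Roughly this idea, sketched with Node's async_hooks purely for illustration (workerd doesn't expose promise hooks to script, so this only shows the shape of the bookkeeping):

import { createHook } from "node:async_hooks";

// Pending promise id -> stack captured at creation time.
const pendingStacks = new Map();

createHook({
  init(asyncId, type) {
    if (type === "PROMISE") {
      // Capturing a stack per promise is exactly the expensive part.
      pendingStacks.set(asyncId, new Error("created here").stack);
    }
  },
  promiseResolve(asyncId) {
    pendingStacks.delete(asyncId); // settled promises drop out
  },
}).enable();

// When the runtime decides "the script will never generate a response",
// whatever remains in pendingStacks is the dump we'd want to see.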

wdyt?

mikea avatar Jan 03 '24 15:01 mikea

@PoisnFang Your specific case is a bug which I believe will be improved by #804. However, note that the bug is only that the error message is wrong -- the error message should be saying that the message was too large. We do in fact have a 1MB limit on WebSocket messages received by a Worker.

kentonv avatar Jan 03 '24 17:01 kentonv

> @PoisnFang Your specific case is a bug which I believe will be improved by #804. However, note that the bug is only that the error message is wrong -- the error message should be saying that the message was too large. We do in fact have a 1MB limit on WebSocket messages received by a Worker.

@kentonv Sorry, no, that's not my use case. (Mine has nothing to do with the size of the message.) The case is very simple: if you use a WebSocket at all within the worker, it will always throw "The script will never generate a response."
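For context, the standard minimal WebSocket handler (a sketch of the documented WebSocketPair usage) looks like this:

export default {
  async fetch(request) {
    if (request.headers.get("Upgrade") !== "websocket") {
      return new Response("expected websocket", { status: 426 });
    }
    const [client, server] = Object.values(new WebSocketPair());
    server.accept();
    // Echo messages straight back to the client.
    server.addEventListener("message", (event) => server.send(event.data));
    return new Response(null, { status: 101, webSocket: client });
  }
}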

PizzaConsole avatar Jan 03 '24 17:01 PizzaConsole

@mikea ...

> what if we had a special mode where we collect a stack trace for every instantiated promise...

Definitely possible using the V8 promise hooks API, but only with a significant perf cost. If we went with that mode, I think we'd need to strictly limit it to explicit opt-in in workerd and to the process sandbox in production. I'd prefer to avoid this approach, however; I've done this in Node.js using the promise hooks API there, and it's just gnarly.

jasnell avatar Jan 03 '24 17:01 jasnell

@PoisnFang The example you linked to fails with this error specifically when it receives a message >1MB. If you are always seeing the error regardless of message size, please post your code that demonstrates this.

kentonv avatar Jan 03 '24 17:01 kentonv

> We do have a mechanism in place now to allow waiting on promises across requests

@jasnell Is there anything we need to do to "opt in" to this functionality? As mentioned above by several others, we try to do cross-request caching of expensive async calls, but we have been seeing odd spikes in "The script will never generate a response" errors that I suspect are related to our use of this.

jmaroeder avatar Feb 29 '24 20:02 jmaroeder

There's nothing you need to opt into but it still won't completely solve every issue with this.

For example, take the following contrived example:

export default {
  async fetch() {
    if (globalThis.promise === undefined) {
      // First request: stash a promise that nothing will ever resolve.
      const { promise, resolve } = Promise.withResolvers();
      globalThis.promise = promise;
    } else {
      // Later requests: wait forever on the first request's promise.
      await globalThis.promise;
    }
    return new Response("hello");
  }
}

You will end up with an error in the second request because there's nothing remaining that is going to actually resolve the promise that is being waited on.

Let me be clear: we do not recommend that you wait on promises across requests, and in the future we might make it impossible to share promises through global state like this. But if you do, you need to make sure that something still exists to drive those promises to completion. Without seeing your code I cannot say for certain what the issue may be, but you'll want to double-check that all of your promises are being resolved.
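If you do cache across requests, one pattern that avoids the trap is to cache the settled value rather than a pending promise; a rough sketch (expensiveLookup is a hypothetical async function):

let cached; // holds the resolved value only, never a pending promise

export default {
  async fetch() {
    if (cached === undefined) {
      // All the I/O driving this promise belongs to the current event,
      // so it is guaranteed to settle (or fail) within this request.
      cached = await expensiveLookup(); // hypothetical helper
    }
    return new Response(JSON.stringify(cached));
  }
}

Concurrent first requests will each do the lookup, but no request ever waits on another request's promise.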

jasnell avatar Feb 29 '24 20:02 jasnell

Ok, just a status update... did some digging and improving this is definitely going to be a challenge...

Take the following example:

export default {
  async fetch() {
    const { promise } = Promise.withResolvers();
    await promise; // nothing holds `resolve`, so this never settles
    return new Response("ok");
  }
};

In this example, the unresolved promise that leads to the "The script will never generate a response" error/warning is obviously the promise we create and await but never resolve. However, the actual warning here is triggered by the outer promise representing the call to the async fetch(...) handler. THAT is the promise the internal I/O handler is waiting on. Unfortunately, by the time we cancel things and emit this warning, we have no visibility into which other promises are stopping that outer promise from being resolved. JavaScript and V8 just don't give us the tools to easily know what we're waiting on: it could be I/O, it could be a manually resolved promise like in the example, it could be a promise from another request, it could be a timer that got canceled... lots of things that were supposed to resolve that awaited promise, and we simply have no visibility into them.

We could implement a mechanism, with the appropriate bookkeeping, that might provide the information we need, but it would depend on the Oh So Slow V8 promise hooks API and wouldn't be guaranteed to work in every case... simply because what we really need to know, to make this more debuggable, is where the promise was supposed to be resolved or rejected, not where it was created or awaited. The closest the promise hooks API gets us is the stack where a promise was created and the stack where it was awaited... and even then we can't selectively capture that information for specific promises, so we'd have to capture it for all of them... which is sllloooowwww.

Anyway, I haven't given up. I still want to try to find some way of improving this, but so far it's proving to be quite difficult.

jasnell avatar Mar 20 '24 19:03 jasnell

@jasnell FWIW in any other runtime, your example would simply hang. Which is also difficult to debug. But I guess V8 hasn't seen fit to develop a way to figure out which promise is causing the hang.

The Workers Runtime is actually being a little bit nicer than other runtimes here, by at least failing early, rather than making the user sit and wait for a while before concluding the request is hung...

kentonv avatar Mar 22 '24 19:03 kentonv