kit icon indicating copy to clipboard operation
kit copied to clipboard

fix: call worker `unref` instead of `terminate`

Open ignatiusmb opened this issue 1 year ago • 3 comments

According to the docs, terminate seems to effectively stop all execution in the thread, as soon as possible. This seems to break some long running process — which was working with child_process before — when prerendering, either because the ample amounts of prerendered pages/routes, the huge amount of data that needs to be (pre)processed for each routes, or a combination of both.

Substituting terminate with unref would allow the worker to exit when it finds that it's the only active one in the event system, this would allow the long running process that seems to happen after terminate is called to finish without being interrupted.

The error thrown by the terminate method also seems to be very vague and random, trying to await it seems to give a more helpful message along with a stacktrace, but still not enough to figure out what's going on and why it's prematurely exiting the thread. This is mostly an excuse because I couldn't seem to find a way to recreate the issue in a smaller scale and push in a failing test.

Please don't delete this checklist! Before submitting the PR, please make sure you do the following:

  • [ ] It's really useful if your PR references an issue where it is discussed ahead of time. In many cases, features are absent for a reason. For large changes, please create an RFC: https://github.com/sveltejs/rfcs
  • [x] This message body should clearly illustrate what problems it solves.
  • [ ] Ideally, include a test that fails without this PR but passes with it.

Tests

  • [x] Run the tests with pnpm test and lint the project with pnpm lint and pnpm check

Changesets

  • [x] If your PR makes a change that should be noted in one or more packages' changelogs, generate a changeset by running pnpm changeset and following the prompts. Changesets that add features should be minor and those that fix bugs should be patch. Please prefix changeset messages with feat:, fix:, or chore:.

ignatiusmb avatar Jun 08 '23 14:06 ignatiusmb

🦋 Changeset detected

Latest commit: 574c4cd9c7b7341a5f40818aaa8bb42b8483ad37

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@sveltejs/kit Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

changeset-bot[bot] avatar Jun 08 '23 14:06 changeset-bot[bot]

Hmm. The point of doing it in a worker/subprocess was to kill any straggling processes: https://github.com/sveltejs/kit/pull/10120/files#diff-348357ad608e5d50118974da4744182066b02236abef646543dbd7b40b6b8b27R5

From your description, I'm not sure it sounds like that will still happen?

benmccann avatar Jun 08 '23 22:06 benmccann

Can you point out where in the description that made you unsure so that I can help out clarify them? We do have this code in the test that make sure that everything is shut down correctly, and it does fail if we only remove .terminate() but didn't call .unref().

ignatiusmb avatar Jun 12 '23 08:06 ignatiusmb

Mhm, this is kind of a weird situation, as it seems we don't have equivalent measures to stop the process. From my limited understanding subprocess.kill() does allow it to run some cleanup before terminating, while worker.terminate() is like subprocess.kill('SIGTERM') aka forcefully stopping the worker immediately. There does not seem to be a similar method for workers to tell it "stop as soon as possible, but run cleanup before that".

So yeah, maybe unref is the right approach as it should terminate when the parent process terminates at the latest, but it's not quite the same since the "stop now" signal may come later than with the subprocess. I tend to approve this PR. @Conduitry what's your stance on this?

dummdidumm avatar Jul 04 '23 10:07 dummdidumm

Can you point out where in the description that made you unsure so that I can help out clarify them?

@ignatiusmb sorry for the two month hiatus in responding. The part I was referring to was "this would allow the long running process that seems to happen after terminate is called to finish without being interrupted". The problem I believe we were initially trying to solve was to kill an infinitely running process. If we're not interrupting then I wonder if we're really having the desired effect

benmccann avatar Aug 10 '23 21:08 benmccann

We have a test that ensures child processes don't keep the parent process alive after prerendering is complete: https://github.com/sveltejs/kit/blob/108a6a8f02e41e4346e0b83749f8109217327157/packages/kit/test/prerendering/basics/src/hooks.server.js#L20-L23

As such, this PR seems good to merge to me (though I confess I don't quite understand what difference it would make in practice)

Rich-Harris avatar Dec 13 '23 03:12 Rich-Harris