node icon indicating copy to clipboard operation
node copied to clipboard

implement module.registerHooks() to run synchronous module customization hooks in thread

Open joyeecheung opened this issue 1 year ago β€’ 14 comments

This implements part of the proposal in https://github.com/nodejs/loaders/pull/198 - for the motivations, see https://github.com/nodejs/node/issues/52219 - this fills in the gap where being able to run in-thread and support require() is more important for a hook than being able to run asynchronous code (even then, packages like https://www.npmjs.com/package/everysync allows user to sync-ify async code off-thread themselves), especially for a large amount of CJS loader moneky-patchers out there to migrate to an officially supported API easily.

Screenshot 2024-11-19 at 16 01 47

The synchronous, in-thread hooks are also a lot easier to support universally in the loaders and reduce the number of dark corners where the hooks cannot run or behave in a surprising manner - I guess the amount of "this caveat of asynchronous hooks does not apply to the synchronous hooks" added in the documentation kind of also already proves the point...

For now only resolve and load are implemented to reach parity with module.register(). I left the exports hook to https://github.com/joyeecheung/node/tree/export-hooks which will be a follow-up when we decide the timing at which the post-load hook for CJS should run.

The commit module: use synchronous hooks for preparsing in import(cjs) isn't strictly necessary and could split into a follow-up instead, but I think it's nice to have it here instead of leaving out that corner uncovered.

joyeecheung avatar Nov 02 '24 22:11 joyeecheung

Review requested:

  • [ ] @nodejs/loaders
  • [ ] @nodejs/startup

nodejs-github-bot avatar Nov 02 '24 22:11 nodejs-github-bot

Added a lot of tests. This should now be ready for review.

joyeecheung avatar Nov 19 '24 15:11 joyeecheung

CI: https://ci.nodejs.org/job/node-test-pull-request/63632/

nodejs-github-bot avatar Nov 19 '24 15:11 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/63633/

nodejs-github-bot avatar Nov 19 '24 15:11 nodejs-github-bot

Codecov Report

Attention: Patch coverage is 98.37278% with 11 lines in your changes missing coverage. Please review.

Project coverage is 88.54%. Comparing base (dbfcbe3) to head (9529784). Report is 94 commits behind head on main.

Files with missing lines Patch % Lines
lib/internal/modules/cjs/loader.js 97.24% 6 Missing :warning:
lib/internal/modules/esm/loader.js 94.44% 3 Missing :warning:
lib/internal/modules/esm/translators.js 94.11% 2 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #55698      +/-   ##
==========================================
+ Coverage   88.50%   88.54%   +0.03%     
==========================================
  Files         656      657       +1     
  Lines      189261   189858     +597     
  Branches    36346    36451     +105     
==========================================
+ Hits       167508   168101     +593     
- Misses      14969    14976       +7     
+ Partials     6784     6781       -3     
Files with missing lines Coverage Ξ”
lib/internal/modules/customization_hooks.js 100.00% <100.00%> (ΓΈ)
lib/internal/modules/esm/module_job.js 99.25% <100.00%> (+<0.01%) :arrow_up:
lib/internal/modules/helpers.js 98.84% <100.00%> (-0.01%) :arrow_down:
lib/internal/modules/esm/translators.js 92.93% <94.11%> (-0.17%) :arrow_down:
lib/internal/modules/esm/loader.js 97.80% <94.44%> (-0.23%) :arrow_down:
lib/internal/modules/cjs/loader.js 98.10% <97.24%> (-0.19%) :arrow_down:

... and 32 files with indirect coverage changes

codecov[bot] avatar Nov 19 '24 16:11 codecov[bot]

I do have one feature request I'd love to see here though - a one-way function to disable hooks for security module.lockHooks(), with no arguments, that disallows further register or registerHooks calls.

That seems out of scope of this PR? If it was fine not having it for module.register, why does it need to happen in this PR?

joyeecheung avatar Nov 19 '24 19:11 joyeecheung

Having a hook lockdown phase is not a requirement for this PR by any means, but would be a nice to have to clarify the early hook registration model sooner rather than later.

guybedford avatar Nov 19 '24 19:11 guybedford

Having a hook lockdown phase is not a requirement for this PR by any means, but would be a nice to have to clarify the early hook registration model sooner rather than later.

I feel that should be something that's up to the discretion of the hook author, there can surely be use cases where one wants to avoid inheritance in third-party child processes/workers, or just to avoid infinite recursion in case the code run in the loader forks itself, though we could mention it in the documentation for sure.

joyeecheung avatar Nov 19 '24 19:11 joyeecheung

Right, so it's our job to educate hook authors how to write hooks to support multi threading (or not) and how to avoid recursive application etc. etc.

With regards to hook "lockdown" I think this is something that the application owner would care about - how do I know that an arbitrary npm package import isn't going to be doing hooks without me knowing about it? How do I know I'm not depending on a package that's automatically adding a "jsx" hook and changing the way my local code runs? A lockdown feature is for the application code owner's benefit. Ideally lockdown by default would be great too to ensure we don't enter a Node.js future where hooking the loader is common for library dependencies.

guybedford avatar Nov 19 '24 19:11 guybedford

Right, so it's our job to educate hook authors how to write hooks to support multi threading (or not) and how to avoid recursive application etc. etc.

While I agree better documentation should exist, I think we shouldn't really dive into too much details in the API documentation. It should either be in the learn section in the website, or just have dedicated repository for tutorials + examples like https://github.com/nodejs/node-addon-examples and the new https://github.com/nodejs/package-examples that we are spinning (I also left comment in this PR that I think the example loaders should be moved in its own repo, and refrained from adding too many complementary examples for the new API, I also noticed that some existing examples in the documentation didn't even work, but it felt out of scope to fix them here and IMO they should really just be tested properly in a dedicated repo).

joyeecheung avatar Nov 19 '24 19:11 joyeecheung

@joyeecheung to give an example, I was not aware --import applied for all threads and this came as a surprise to me personally. This information is valuable.

guybedford avatar Nov 19 '24 21:11 guybedford

Yes, though not all valuable information belongs in the API documentation ;) One can argue that https://nodejs.github.io/node-addon-examples/ is full of valuable information, but it's better to have a dedicated tutorial on a complex topic like this, instead of trying to elaborate on it in the API documentation. I think we can make do with one or two sentence about --import for this PR, and leave detailed explanation as a TODO, and don't pile even more information that should be better hosted elsewhere into the API documentation. If you take a look at https://nodejs.org/api/module.html you probably will also notice how tiring and unapproachable it is to have all that information piled into one page, people might never notice all the scattered details and recommendations simply because it's not a great read and kinda odd to have a tutorial taking 1/2 of the page, with the rest just being straightforward API documentation or just full of pointers to other docs instead of examples.

joyeecheung avatar Nov 20 '24 01:11 joyeecheung

I tried moving the docs of sync hooks first, but it turned out to be huge amount of work, because most of the docs shared for both starts with an example of the async one and the text & example now all have to be rewritten :( The doc overhaul is way too much work for a initial semver-minor PR that's just trying to add a variant that's less broken...(that, and also I don't think most of the original docs belong in the API documentation in the first place, so I feel like I am just wasting more effort on texts in a document where they don't belong).

joyeecheung avatar Dec 03 '24 11:12 joyeecheung

I think I've addressed most of the comments so far, except the one about changing the documentation to make sync hooks go first, because I think that would lead to an overhaul of the documentation as most of the explanatory texts are talking about async hooks examples and this PR mostly just append something saying what's different in sync hooks and that is not a huge amount of changes. But actually talking about sync hooks first means all of the texts and examples need to be reworded otherwise it looks incorrect logically. Also this is just an API that's not yet released and I feel that we could do the reordering when we have it more battle tested.

Other than the nit about doc ordering I think this is good to go - incidentally in the past 24 hours I've already seem two glitches (1, 2) coming from incorrect CJS loader monkey patching after require(esm) is unflagged on v22. So I hope we can get it over with sooner than later and stop the spread of incorrect CJS loader monkey patching in the ecosystem with this new proper API.

PTAL again, thanks @legendecas, @JakobJingleheimer, @guybedford, @GeoffreyBooth, @jasnell

joyeecheung avatar Dec 04 '24 17:12 joyeecheung

Rebased to resolve conflicts in module.md (this is the second time now, kinda prove my point about what backporting nightmare it can be if the doc has to be overhauled in this PR..)

Can you take a look again? Thanks @legendecas, @JakobJingleheimer, @guybedford, @GeoffreyBooth, @jasnell

joyeecheung avatar Dec 09 '24 10:12 joyeecheung

CI: https://ci.nodejs.org/job/node-test-pull-request/63946/

nodejs-github-bot avatar Dec 09 '24 10:12 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/63947/

nodejs-github-bot avatar Dec 09 '24 10:12 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/63960/

nodejs-github-bot avatar Dec 09 '24 17:12 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/63963/

nodejs-github-bot avatar Dec 09 '24 19:12 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/63964/

nodejs-github-bot avatar Dec 09 '24 21:12 nodejs-github-bot

Landed in 2960a5954047f44b517855e57fb3a79f0a201fd3...1215a8bf124603a906c7bf748dee4f7f22cc6dd7

nodejs-github-bot avatar Dec 09 '24 23:12 nodejs-github-bot

The https://github.com/nodejs/node/labels/notable-change label has been added by @joyeecheung.

Please suggest a text for the release notes if you'd like to include a more detailed summary, then proceed to update the PR description with the text or a link to the notable change suggested text comment. Otherwise, the commit will be placed in the Other Notable Changes section.

github-actions[bot] avatar Dec 10 '24 15:12 github-actions[bot]

Note to releasers: please use the OP text (modulo background) for summary in the change log. The graph can be left out I suppose.

This depends on at least two refactoring PRs that landed before: https://github.com/nodejs/node/pull/54769 and https://github.com/nodejs/node/pull/55590 .

joyeecheung avatar Dec 10 '24 15:12 joyeecheung

This PR does not land cleanly on v22.x-staging and will need manual backport in case we want it in v22.x.

ruyadorno avatar Dec 20 '24 22:12 ruyadorno

I think it's worth backporting to v22 (since the whole point is to get more people off CJS monkey patching), though I am not 100% sure the refactoring here plays well with er, existing CJS monkey patching (I tried my best to make it play, but one never knows how crazy the patching can be). So we should wait and see how it goes in 23 for a bit before backporting.

joyeecheung avatar Dec 21 '24 00:12 joyeecheung

This needs to be backported together with https://github.com/nodejs/node/pull/56402, and https://github.com/nodejs/node/pull/56454

joyeecheung avatar Feb 04 '25 12:02 joyeecheung

@joyeecheung hey, I see that require.resolve is calling Module._resolveFilename, which does not have resolveWithHooks calling. Even the opposite, resolveForCJSWithHooks is calling Module._resolveFilename. Which means sync hooks will not intercept require.resolve, although docs implies it should work. It seems like resolveWithHooks logic should be implemented internally in Module._resolveFilename.

goloveychuk avatar Apr 16 '25 09:04 goloveychuk