node icon indicating copy to clipboard operation
node copied to clipboard

Trying to load two modules with the same URL throws an empty object

Open bpstrngr opened this issue 1 year ago • 8 comments

In a loader module I'm handling sources with omitted type import attribute on JSON imports, by assigning context.importAttributes.type in the load hook, while caching the outputs.

When the attributes are consistently omitted from JSON imports, or consistently aren't, this works well. But when the import signature changes on one occasion, the process breaks with an empty error object ({}) on returning the cached output.

I can't tell the reason for this, for as far as the loader is concerned, the load hook output should be idempotent whether the input context.importAttributes.type is defined or not:

[Object: null prototype] {
  format: 'json',
  responseURL: 'file:///home/ranger/loaded.json',
  source: <Buffer 7b 22 61 22 3a 31 7d 0a>,
  shortCircuit: true
}

Here's a loader module to reproduce:

// loader.js
export const address = new URL(import.meta.url).pathname;
export const file = address.replace(/.*\//, "");

// register loader
if (
  !globalThis.window &&
  process.execArgv.some((flag) =>
    new RegExp("^--import[= ][^ ]*" + file).test(flag),
  )
)
  Promise.all(
    ["module", "worker_threads"].map((module) => import(module)),
  ).then(
    ([{ register }, { MessageChannel, isMainThread }]) =>
      isMainThread &&
      [new MessageChannel(), import.meta.url].reduce(
        ({ port1, port2 }, parentURL) =>
          register(address, parentURL, {
            data: { socket: port2 },
            transferList: [port2],
          }) || port1.on("message", console.log),
      ),
  );

function initialize({ socket }) {
  socket.postMessage(" Registered loader module:\n " + import.meta.url + ".");
}

const load = function (source, context, next) {
  if (this[source])
    // return cached value.
    return console.log(this[source]) || this[source];
  if (source.endsWith("json")) context.importAttributes.type = "json";
  return next(source, context).then(
    (context) =>
      // cache output
      (this[source] = Object.assign(context, { shortCircuit: true })),
  );
}.bind({});

export { initialize, load };

and here's a file with inconsistent attribute signatures to load (in my real use case the inconsistency was in separate modules, one importing the other):

// loaded.js
 import json1 from "./loaded.json";
 import json2 from "./loaded.json" with {type:"json"};

console.log(json1===json2);

// loaded.json
{"a":1}

Result with node 21.1.0 (only notable difference is <Buffer > being empty in nextLoad's output already on the first import): node --watch --import=./loader.js ./loaded.js

 Registered loader module:
 file:///home/ranger/loader.js.
(node:1498) ExperimentalWarning: Importing JSON modules is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
[Object: null prototype] {
  format: 'json',
  responseURL: 'file:///home/ranger/loaded.json',
  source: <Buffer >,
  shortCircuit: true
}

node:internal/process/esm_loader:40
      internalBinding('errors').triggerUncaughtException(
                                ^
{}

Node.js v21.1.0
Failed running './loaded.js'

bpstrngr avatar Feb 07 '24 16:02 bpstrngr

May not seem like a huge deal at first, but if i intend to allow defaulting to json import attribute for imports of json extensions, which is arguably reasonable, if someone commits this inconsistency in the future, they would face this very hardly tracable error again (took a while for me to pinpoint where the problem occurs along the import chain).

bpstrngr avatar Feb 07 '24 16:02 bpstrngr

loosely related to nodejs/loaders#163 , in being my second encounter with an untracable error message from beyond the "load" hook in a load sequence.

bpstrngr avatar Feb 07 '24 17:02 bpstrngr

@nodejs/loaders

GeoffreyBooth avatar Feb 07 '24 17:02 GeoffreyBooth

I think that's the documented behavior: https://github.com/nodejs/node/blob/ee5cfcb5f7797a48e144a72148d81b2474600f37/doc/api/module.md#L529-L532

aduh95 avatar Feb 13 '24 09:02 aduh95

~~ah so the error actually is that import statements are statically analysed beyond "load", and it isn't matching the cached respective "resolve" entry?~~ or, more likely, i figure it rather happens not after the "load" hook at all, but before the respective subsequent "resolve" hook, where the new signature mismatches with the cached value (so no explicit static analysis involved).

does it have to throw an empty object at that point though? can't it spell out "import signature for {source} in {parentUrl} at {line:character} doesn't match the cached entry to resolve."?

or if the intention behind including it in the cache key is that it should resolve different (even if the same) files by the given specifier, shouldn't it just differentiate and proceed with the hooks? which will then determine if they want to re-equivocate them from their own cache or sthing?

bpstrngr avatar Feb 13 '24 13:02 bpstrngr

export const address = new URL(import.meta.url).pathname;
export const file = address.replace(/.*\//, "");

// register loader
if (
  !globalThis.window &&
  process.execArgv.some((flag) =>
    new RegExp("^--import[= ][^ ]*" + file).test(flag),
  )
)

You probably already know this, and I suppose it's your way for protesting lack of a good API in Node.js, but this is horrendous – and very fragile. Please, I'm begging you, don't do that in real code.

does it have to throw an empty object at that point though? can't it spell out "import signature for {source} in {parentUrl} at {line:character} doesn't match the cached entry to resolve."?

The empty object is definitely not supposed to happen, if someone wants to investigate that that'd be awesome.

or if the intention behind including it in the cache key is that it should resolve different (even if the same) files by the given specifier, shouldn't it just differentiate and proceed with the hooks? which will then determine if they want to re-equivocate them from their own cache or sthing?

Both imports should resolve to the same module, by spec, so loading a different module is not a supported use-case. Resolving modules is the mission of the resolve hook.

aduh95 avatar Feb 13 '24 13:02 aduh95

haha you may assume more of me, either in smarts that i know what you mean or in stubbornness that i'm protesting; i don't see the fragility there, is it about the regexp? maybe the [^ ]* could be a bit more accurate... not really sure how to improve on it from top of my head. sorry about the distress, i'm all ears if you can specify what else makes it "fragile". edit: i see the "+file" could also be problematic, with some crazy url provided... and it not being escaped in the first place lol... that's noted. i shall just match the file path from the flag and check for equality.

Thanks for confirming the rest.

ps. if the specifier alone should exhaustively indicate module identity, i wonder why the attribute is part of the key. maybe it's just a hash of the arguments.. still my interpretation of import attributes was that it wouldn't be impossible for a file to contain ambiguous format (in a possible future where it's used for more than json, to load ambiguous syntaxes such as binaries, archives, symbolic links, vanilla vs type-annotations when native support lands for that, etc). anyway that's wild speculation, it's perfectly reasonable to assume one file stays in one format. What's important is that in conclusion according to the spec you quoted then, the loader should let the resolve hook decide what importAttributes to return, and not fail before entering it. all clear.

bpstrngr avatar Feb 13 '24 14:02 bpstrngr

i don't see the fragility there, is it about the regexp?

Yes, file names can contain characters that have a special meaning in the context of a regex. It's also fragile in the sense it can only detect --import=./relative/path/to/module.js, and breaks in other scenario (e.g. it won't catch any of --import ./relative/path/to/module.js, --import=somePackage/subpath/export, --import=#subpath/imports) and if Node.js ever introduces a an alias for --import, it won't detect it either. All that to say: you should not try to detect if a file was --imported, it's not a supported use case. If you think you have a good use case, you should open an issue to propose adding a new API.

ps. if the specifier alone should exhaustively indicate module identity, i wonder why the attribute is part of the key. maybe it's just a hash of the arguments..

If you take the following code:

await Promise.all({
  import('someSpecifier'),
  import('someSpecifier', { with: { type: 'json' }}),
]);

You can imagine the race condition where we know someSpecifer can only resolve to a JSON module or a non-JSON module, but because load is async, we can't know which until one has succeeded, so we have to use the import attributes as part of the resolve cache.

aduh95 avatar Feb 13 '24 22:02 aduh95