node icon indicating copy to clipboard operation
node copied to clipboard

Provide some mechanism to conditionally and synchronously import modules (or just builtins) from ESM

Open jakebailey opened this issue 3 weeks ago • 15 comments

What is the problem this feature will solve?

Thanks to #51977, requiring ESM is looking to be a real possibility. As such, TypeScript is considering transitioning over to ESM in the near future (depending on when require(ESM) is unflagged, hopefully in time for TS 6.0?), as that sort of change would no longer pose compatibility problems for the vast number of downstream CJS users. This has a number of benefits, mainly that we could finally share code between tsc.js and our public API without a startup perf regression, and that we wouldn't be duplicating code in our package (thankfully only two copies remain as of TS 5.5, down from six copies in TS 4.9).

TypeScript's current public API bundle is intentionally "UMD-ish", detecting whether or not module.exports exists and using it (declaring a global otherwise), then later conditionally requiring built-in modules like fs if we believe to be running within Node. This allows us to ship one single bundle that works in Node, browsers, and bundlers alike.

However, the code that relies on conditional require is executed at the top-level as it's constructing the ts.sys object, the default "system" implementation for most of our APIs. Within CJS, this is fine, but within ESM, the only way to conditionally import something is by either:

  • Using top-level await (or doing it later asynchronously).
  • Importing createRequire from node:module.

Using top-level await breaks require(ESM), the whole reason we think we can use ESM in the first place, and TS is infamously not async and couldn't import it later. Importing node:module is a moot choice, since if we could safely import node:module, we could have just imported node:fs and so on.

So, we need some mechanism to synchronously import modules, or at least the builtins.

Given require(ESM) is now possible, it sure seems like there could be a way to safely synchronously import modules in ESM that are already require-able from CJS after #51977.

What is the feature you are proposing to solve the problem?

After discussing this in the TC39 Module Harmony meeting (with @guybedford @joyeecheung @JakobJingleheimer, others), there seemed to be a number of different paths forward:

  • Node or ECMAScript invent a new "weak" or "optional" import attribute (e.g. import fs from "node:fs" with { optional: true }) could be added; if the module fails to resolve, the imports are all undefined. This may require some sort of TC39 specing or proposal.
    • In the meeting this, this seemed palatable, though potentially had a too-long time horizon to be helpful for TS or other projects trying to do the same thing as us.
  • Node can add import.meta.require. This was previously proposed at https://github.com/nodejs/modules/issues/130, but unfortunately drags CJS into the ESM world (potentially no more than createRequire, I suppose).
    • In the meeting, this seemed "okay" in that it's something Node could offer "now", but less desirable than other options.
  • Node (+ whoever owns the import.meta spec) can add import.meta.importSync or import.now, which is effectively just await import(...) that only works on sync-loadable ESM.
    • In the meeting, this seemed less palatable than other options without a more general use case or more supporting examples.
  • Node can add import.meta.builtins or similar (e.g. on process), which just provide access to Node's builtin modules. Largely, TS only needs fs, path, os, etc, so this would sidestep the "sync import" problem altogether. TS also conditionally imports source-map-support, so that would not work, though only in development. Thankfully, since one could get node:module this way, you can also shim require via createRequire, which is pretty neat.
    • In the meeting, this seemed pretty palatable. There was mention of this being something useful for WinterCG or similar to WASM, but I'm not very qualified to fully understand that one. There was also discussion about whether or not it would be all-getters, since people do want to patch / mock builtins.

What alternatives have you considered?

TS could also use package.json import maps to achieve "conditional" imports of Node-specific code, e.g. have an import like #system which in the Node condition imports from Node, but is shims otherwise. This seems to have a number of downsides in my view, specifically:

  • TS is bundled and our outputs are not associated with our inputs, so actually writing said code may be pretty challenging.
  • There are platforms other than Node that implement enough of the Node builtins to be compatible. Would they set the node condition? Would TS need to explicitly add mappings for each of these? What happens if our code is bundled? These gotchas make me feel like this would be too difficult to deal with.
  • Do import maps work when TS is loaded via a browser? Our intent is that we can go down to having only one copy of our code, but I don't think browsers understand what to do with package.json import maps. In the meeting, @JakobJingleheimer mentioned that one could remap imports like node:fs to shims, even to data:... blobs, but I'm definitely not experienced enough in browser ESM to know how to do that.

jakebailey avatar Apr 19 '24 16:04 jakebailey

Given that it doesn't depend on the context, import.meta.builtins could also be process.builtins.

nicolo-ribaudo avatar Apr 19 '24 17:04 nicolo-ribaudo

Thanks for the summary here, in terms of timelines I just want to summarize my own sense of things:

  • import.now - there is committee discussion and interest here, so it isn't unviable, but there's a lot of implementation edges, so we're likely looking at at least a year or so before a viable proposal could even reach Stage 2, and another year or two to get to implementation, if it happens at all.
  • with { optional: true } - this would need to be tackled as a web or ecma spec, but could certainly be done in a timeline closer to within 4-8 months.
  • import.meta.getBuiltin - Node.js could implement and ship this any day.

Perhaps in time we do hit all three, or at least two of the above.

guybedford avatar Apr 19 '24 17:04 guybedford

process.builtins SGTM. I don't see the reason to put extra stuff in the import.meta namespace, which is shared across environments and runs the risk of collisions. Weak imports seem like a good medium-term improvement, but let's do those at the TC39 level (since this feature makes sense across environments).

littledan avatar Apr 19 '24 17:04 littledan

Something on process sounds totally fine to me; I also didn't realize when writing the above that you could also do:

const require = process.getBuiltin("node:module").createRequire(import.meta.url)

And be no worse off than before for conditionally loading anything outside of the builtins, which is very handy.

jakebailey avatar Apr 19 '24 17:04 jakebailey

I think the only concern I have for a process-based approach is how bundlers won't recognize these as being "imports" per se, so may not be able to shim them. But I suppose that bundlers would need to change no matter what option is implemented, since it's all new besides import.meta.require already being implemented in bun.

jakebailey avatar Apr 19 '24 17:04 jakebailey

I think whether the access is module-based depends on https://github.com/nodejs/node/issues/52575 (if policy is enabled, access control to builtins is different per module based on the policy. But then if policy is removed altogether...there is no need to worry about it. It's probably another reason to remove policy?)

joyeecheung avatar Apr 19 '24 23:04 joyeecheung

Thinking out loud:

Although, the removal of policy does not necessarily mean that there won't be any future pursuits of per-module access control - that is, if I have dependency A, B, C, D, I may want to be able to tell Node.js, only grant FS access to dependency A, and if B,C,D attempts to read from FS, they will error (it's more like so far people seem to agree that policy is not the right way to implement it). Or we can decide that per-module access control is already a dead end, and it’s fine to punch this hole to access shared builtins on the global now (it sort of is a dead end, for CJS, because the CJS loader is currently full of holes. But I am not sure about ESM? Or maybe since CJS is full of holes there's no point having a security feature that only works in ESM? Or maybe we still want to keep the door open for per-module access control many many years later when everyone moves to ESM or we fully deprecate CJS monkey patching and fill all the CJS holes?)

If this is global, I am not sure whether builtin IDs are the right identifiers for scope of capabilities (personally, they seem to be more wrong than right as scopes for capabilities). Another possibility is to add something that's more like process.capabilities or process.resources and make the properties there scoped better than just the builtins we have right now. That seems to be more complicated than necessary, we can just decide that IDs on this import.meta.builtins or process.builtins aren't how we are going to determine access control anyway, and we'll do something similar to the existing permission model - filter all the resource access in the native layer, and let users indicate the scope of control with an external mechanism, there will only be an error when the builtins used indirectly access the wrong resource, but builtins remain to be accessible to users.

My conclusion: probably could just do globalThis.process.builtins, but be aware that being a global this will make future pursuits of per-module builtin access control harder, though that's already difficult in itself for various other reasons which makes it not too non-realistic to be a blocker for something that satisfies an immediate need? But if moving it to import.meta.builtins doesn't cost us much compared to globalThis.process.builtins, might as well keep the door open instead of completely shutting it?

joyeecheung avatar Apr 19 '24 23:04 joyeecheung

Actually thinking through the implementation details of how to implement per-module access control for import.meta.builtins I realize that globalThis.process.builtins can still do it too, in the end something can be stored in the script context similar to the host-defined options used to implement import() (or at least that's what V8 plans to do), and in the native layer all the bindings can use this thing to determine resource control scopes for the module that invokes the builtin. That's going to be a lot more robust than doing it in the module loader anyway. So my conclusion is, globalThis.process.builtins won't shut the door for per-module access control, it's still possible to have both.

joyeecheung avatar Apr 20 '24 01:04 joyeecheung

I'm working on a proof of concept; my quick hack for now (until I can get all of it as ESM) is to use a --require preloaded script to patch process and a polyfill stuck on top of our bundle.

$ cat patchProcess.cjs 
process.createRequire = require("node:module").createRequire;
$ head ./built/local/tsc.mjs
var require, __filename, __dirname;
if (typeof process !== "undefined") {
  require = process.createRequire(import.meta.url);
  __filename = require("node:url").fileURLToPath(new URL(import.meta.url));
  __dirname = require("node:path").dirname(__filename);
}
$ node --require ./patchProcess.cjs ./built/local/tsc.mjs -p ./src/compiler --diagnostics
Files:              213
Lines:           245632
Identifiers:     412592
Symbols:         256043
Types:           103601
Instantiations:  189818
Memory used:    496843K
I/O read:         0.01s
I/O write:        0.00s
Parse time:       0.90s
Bind time:        0.38s
Check time:       7.15s
Emit time:        0.00s
Total time:       8.44s

Just throwing on createRequire is a good trick for now, given our code will just try and use require globally if it can. Rewriting everything to assume ESM is definitely possible, but will take a little more work than the few minutes I spent on this showing that it's possible to avoid TLA and yet still conditionally do things.

jakebailey avatar Apr 26 '24 04:04 jakebailey

So, that was a lot easier than I expected.

$ cat testRequireESM.cjs                                                                                                                                                                                                                       // Inject this into the process so that TS can synchronously require stuff.
process.createRequire = require("node:module").createRequire;

const ts = require("./built/local/typescript.js");
console.log(ts.version);
$ node --experimental-require-module ./testRequireESM.cjs
5.5.0-dev
(node:33109) ExperimentalWarning: Support for loading ES Module in require() is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)

image

The only change to TS I need to make is to change our import extensions and fix up our system implementation to require process.createRequire. Other stuff is broken (in fixable ways), but this seems to show that a TLA-free ESM TypeScript API is possible without breaking downstream users.

Code is at: https://github.com/jakebailey/TypeScript/tree/its-esm

jakebailey avatar Apr 30 '24 03:04 jakebailey

Opened https://github.com/nodejs/node/pull/52762

joyeecheung avatar Apr 30 '24 17:04 joyeecheung

@nodejs/loaders

GeoffreyBooth avatar Apr 30 '24 19:04 GeoffreyBooth

Want to throw in a couple of non-builtins use cases I've come across, in case it's helpful:

  • In @opentelemetry/resources, there is a conditional import based on process.platform, so that the correct function depending on perform can be exported while being contained in a different module.
  • Something I've done is in a shared eslint config, conditionally importing either a typescript shared config or a base config depending on whether or not I've "enabled" it, allowing the external TS config to be an optional dependency.

luxaritas avatar May 01 '24 03:05 luxaritas

In the first one, you are already importing process; is this actually a conditional?

For the latter, you can actually export from ESLint configs a Promise, such that you can conditionally resolve things in an async IIFE, which may also work for those cases.

No matter what, though, you can always achieve loading of external modules even with just getBuiltin just by writing:

const require = process.getBuiltin("module").createRequire(import.meta.url);
// ...

jakebailey avatar May 01 '24 03:05 jakebailey

In the first case, I was referring to the require calls (exceptionally poor choice of words, sorry!). In both cases it didnt register that createRequire should be able to solve this issue - though WRT larger standards discussion I imagine it would be nice to have a way to do this in ESM outside of node.

luxaritas avatar May 01 '24 03:05 luxaritas