node icon indicating copy to clipboard operation
node copied to clipboard

module: add support for `nodeEntryPointConfig.loaders` in `package.json`

Open aduh95 opened this issue 1 year ago • 7 comments

Currently, starting a .mjs or .cjs file allow to skip the lookup and the parsing of a package.json file, I'm reluctant to change that but I'm curious what others think.

I had to make the default resolve function sync – but we don't use anything async inside anyway, I assume it's fine. If we move the loaders off-thread (or even to make the loading of loaders itself go through non-default loaders), we might need to think of an async way of resolving those loaders, not sure.

Opening as draft because it's just at the PoC stage, and I would like to get some feedback on the initial implementation before going further.

/cc @nodejs/loaders

aduh95 avatar Jul 24 '22 20:07 aduh95

Review requested:

  • [ ] @nodejs/modules

nodejs-github-bot avatar Jul 24 '22 20:07 nodejs-github-bot

Thanks for picking this problem up. Agreed having it not apply to .mjs and .cjs would be odd, and that needs to be addressed.

I do still think there might be confusion in the package.json if it will apply when loading a dependency. I tend to think a separate Node.js environment configuration manifest might make more sense.

Another possibility could be a special comment in the entry point file only. Similarly to what we have in the test files. That would then also solve the issue of standalone executions. That Node.js itself needs it for tests demonstrates it's a relatively useful feature.

guybedford avatar Jul 24 '22 20:07 guybedford

Currently, starting a .mjs or .cjs file allow to skip the lookup and the parsing of a package.json file, I’m reluctant to change that but I’m curious what others think.

Yes I think we need to change this. I assume the prior behavior was for performance? Doing a single stat (or stat lookup chain) for the entrypoint seems like a very minor hit.

I had to make the default resolve function sync – but we don’t use anything async inside anyway, I assume it’s fine.

Not opposed, but what made this necessary? This makes me think we should land the “move off thread” PR before we tackle this part.

Opening as draft because it’s just at the PoC stage, and I would like to get some feedback on the initial implementation before going further.

I feel like we should support all of Node’s configuration (or at least, as much of it as possible) rather than just a field for loaders. Within the configuration file we also need a way to define different configurations per environment, like production vs development configuration (perhaps aligning with --conditions?) since it’s very likely that users will have different loaders and other settings between prod and dev.

GeoffreyBooth avatar Jul 25 '22 01:07 GeoffreyBooth

@guybedford

I do still think there might be confusion in the package.json if it will apply when loading a dependency.

Note that the package.json field uses the wording "node entry point" to try to avoid that confusion. Do you think that's good enough?

I tend to think a separate Node.js environment configuration manifest might make more sense.

One reason to use package.json is that we have to parse it anyway, and it seems easier to explain that it follows the same rules as "type" field if it's in the same file.

Another possibility could be a special comment in the entry point file only. Similarly to what we have in the test files. That would then also solve the issue of standalone executions. That Node.js itself needs it for tests demonstrates it's a relatively useful feature.

That's smart, that would not work for entry point files that do not follow the same comment convention as JS though (e.g. a WASM file).


@GeoffreyBooth

I feel like we should support all of Node’s configuration (or at least, as much of it as possible) rather than just a field for loaders.

To be clear, that's very much not a goal for this PR, I intend to focus on resolving the loaders use-case only. Adding gradually more options should be possible though, and can happen in follow up PRs.

Within the configuration file we also need a way to define different configurations per environment, like production vs development configuration (perhaps aligning with --conditions?) since it’s very likely that users will have different loaders and other settings between prod and dev.

It's already possible to do the following:

{
  "name": "pkg",
  "imports": {
    "#loader": {
      "development": "typescript-loader/with-type-checks",
      "default": "typescript-loader/no-type-checks"
    }
  },
  "nodeEntryPointConfig": {
    "loaders": ["#loader"]
  }
}

I'm tempted to say it's good enough, wdyt?

Yes I think we need to change this. I assume the prior behavior was for performance? Doing a single stat (or stat lookup chain) for the entrypoint seems like a very minor hit.

It's more than a stat, we also have to parse that JSON, so the time it takes depends on the size of the package.json. But yeah, that might be fine in the end, users can still opt-out of that by using a command-line loader.

aduh95 avatar Jul 25 '22 09:07 aduh95

I’m tempted to say it’s good enough, wdyt?

No, that only works for --loader and other flags that expect a file path as an argument. Once we support defining --inspect in configuration, and we want that changed by environment, how would that be handled?

I think this API needs to be designed to eventually support all of Node’s config options (or at least, all the ones that make sense to define in a config file; probably all the ones supported by NODE_OPTIONS). It also needs to support sets of configurations such as per environment. We don’t need to support all 100+ options right away, but the API needs to be designed such that all the others can be added without causing breaking changes.

I think this should also be an experimental API, possibly even needing its own flag to enable. There’s no versioning in package.json, so we need to be very careful what we start telling users that we support in there.

I also think that we need to look at what flags get parsed by C++ on startup, long before we get to module loading. I think users will consider this feature broken if it doesn’t support most of the options they expect (such as everything in NODE_OPTIONS) and so we might need to move the “find the controlling package.json for the entry point” logic to a much earlier part of the bootstrap than the module loader. We should probably figure that part out first before implementing a solution that relies on the module loader and eventually needs to get removed when we add support for the other flags.

GeoffreyBooth avatar Jul 25 '22 16:07 GeoffreyBooth

I'm -1 on the specific name nodeEntryPointConfig as it's significantly too verbose.

This seems to be a part of a bigger/longer-term trend of Node.js begnning to use package.json for configuration more. If we're going down that path I'd strongly prefer we do something like node or nodeOptions or noderc in package.json and then having entrypoint with the rest of the API under that.

bnb avatar Jul 28 '22 15:07 bnb

If we're going down that path I'd strongly prefer we do something like node or nodeOptions or noderc in package.json and then having entrypoint with the rest of the API under that.

i wonder what number of people would expect, based on one or more of those names, that this configuration object would support nested configurations like exports. if it did, that would allow you to specify different environment variables for different sections of the program. that either sounds to me like a really cool or a really terrible idea.

zackschuster avatar Jul 28 '22 21:07 zackschuster

I'm not sure what to do with this, I'm not happy with the solution I came up, and it doesn't look like any other solution make consensus so far. Also, I feel trying to come up with a JSON representation that pleases everyone of any value that could configure Node.js is a doomed project, it doesn't seem realistic to set this goal.

I'm thinking a better approach would be to use .env file (it's already a widely use syntax, users are more likely to understand them without much effort from us, there are a ton of information online about it already, it is useful for more than just node config for the end user, etc.), the problem is that we don't have a parser to parse the value and use them, and I assume this would need to be parsed before V8 has started to allow its configuration, therefor happen in C++ land, which is outside of my capacity.

Anyway closing this for now, if someone else to give it a shot, you'll be very welcome. If you are looking for a solution to do that and don't care about Windows support, create your own executable:

#!/bin/sh
exec node --loader '#ts-loader' "${@}"

aduh95 avatar Sep 15 '22 12:09 aduh95

I’m thinking a better approach would be to use .env file

I’m leaning toward this as well. For comparison, Bun supports this approach:

bun automatically reads configuration from .env.local, .env.development and .env (in that order)

In our case, a .env file (or .env.local etc. file) would map to process.env, and so Node flags would be specified via NODE_OPTIONS='--whatever'. I feel like this should be relatively simple to implement, as it’s just breaking the string of the file into lines and then breaking the line into before/after the =, then adding the key and value to the environment. If we do this before the C++ code that reads NODE_OPTIONS, then I’d think we should be done?

We can always support .env and configuration files, as Bun does, so this doesn’t foreclose other solutions in the future.

GeoffreyBooth avatar Sep 16 '22 16:09 GeoffreyBooth

I mean, if .env support is on the table, I'm wholly +1 to that

bnb avatar Sep 17 '22 05:09 bnb