Make node_modules read-only for slightly better hermeticity?
This is an enhancement idea for yarn_install and npm_install in rules_nodejs.
While using sandboxfs, I was surprised to find that some Node code, including our beloved Angular CLI (or rather one of its dependencies), writes into node_modules at runtime (!!). I had naïvely assumed that ~all Node code treats node_modules as pure input.
(For Angular CLI, there is a change already merged to make it possible to configure this (ab)use of node_modules to not happen (https://github.com/angular/angular-cli/issues/15865). I will likely open an issue over there requesting that this be the default rather than only a configurable possibility.)
For the sake of hermeticity and clear inputs/outputs to build steps, perhaps yarn_install and npm_install should mark its output files read-only after postinstall scripts finish. This would enforce a bit more hermeticity even without sandboxfs. Moreover, it would mean users find out right away about their non-hermetic node scripts, rather than only when picking up sandboxfs.
The main alternative to this idea is the current situation. We all mostly think of node_modules as an input to various scripts/build steps, but it is actually a potential shared global variable.
(There is a use case that had me running Angular CLI inside Bazel - of course I am well aware of, and mostly use, the Angular Bazel rules.)
I think the bazel sandbox, at least on Linux, has already take care of that? Writing to the source tree should just fail.
I filed https://github.com/angular/angular/pull/33366 about a case where ng build writes to node_modules.
It fails when run under the sandbox, yes. Note that there are different sandboxes depending on worker, strategy, and OS https://github.com/bazelbuild/bazel/issues/6111
I don't think it's our job as integrators of node and Bazel to determine who is doing something wrong. You may have great arguments for why writing to node_modules is a bad practice, but if software does that, we should try to make it work under Bazel.
@soldair had the idea of capturing such writes in our --require shim and saving them until the action/build is complete, print a warning letting the user know that some writes weren't applied, and print a command they can run to apply those edits after the build (like a --fix model)
Another possibility is to have the node_modules transformation in a proper action, so that a different node_modules tree is available to subsequent actions. But this requires modeling your build steps around which programs want to write to the node_modules so it's too much buy-in for most users.
@alexeagle How about a setting for the Bazel sandboxfs support that adds a mutable layer atop the readonly projection of inputs. The changes could be either discarded, or could create a target output to do as you described, model the flow of node_modules changes explicitly in Bazel. Could this accomplish the same as the @soldair idea above, in a way compatible with essentially anything in the ecosystem that writes to node_modules?
It would certainly be a pain to use; my sense is that Bazel users are potentially willing to deal with it (a few macros might make it pretty minor), to escape from the mutable-everywhere world they live in now.
No longer in scope for rules_nodejs which only supplies the Node.js toolchain as of v6.0.0.
Downstream canonical JavaScript + Node.js ruleset is now https://github.com/aspect-build/rules_js.