Environment variable and CLI argument handling in user-land startup snapshots
During the snapshot building process, process.env and process.execArgv can be queried and their values captured in the snapshot. Those captured values may not match the actual environment variables or CLI arguments when the snapshot is deserialized.
For the environment variables queried by Node.js internals, we currently work around this problem by not caching these values in JS land and by always running the pre-execution code to refresh any state that depends on environment variables or CLI arguments. After the snapshot is deserialized, any code therefore gets fresh values from process.env and process.execArgv that reflect the environment where the deserialized application is run. For user-land access to this state in the builder script, the expectation is that users do something similar: either avoid accessing it in the builder script so that it is not captured in JS, or refresh any cached state using the deserialize callbacks.
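The "refresh cached state in a deserialize callback" approach can be sketched as follows. This is a minimal illustration, not a recommended pattern from the Node.js project: `config`, `refreshConfig`, and the `LOG_LEVEL` variable are hypothetical names chosen for the example, while `isBuildingSnapshot()`, `addDeserializeCallback()`, and `setDeserializeMainFunction()` are the existing `v8.startupSnapshot` APIs.

```javascript
const v8 = require('node:v8');

// Hypothetical application state derived from the environment.
const config = { logLevel: undefined };

// Recompute everything that depends on environment variables.
// `env` defaults to process.env, read at call time, not at definition time.
function refreshConfig(env = process.env) {
  config.logLevel = env.LOG_LEVEL ?? 'info';
  return config;
}

// This call captures the build-time value when run in a builder script.
refreshConfig();

if (v8.startupSnapshot.isBuildingSnapshot()) {
  // Runs after deserialization, so the cached state reflects the
  // environment of the deserialized application, not the build machine.
  v8.startupSnapshot.addDeserializeCallback(() => refreshConfig());

  v8.startupSnapshot.setDeserializeMainFunction(() => {
    console.log(`log level: ${config.logLevel}`);
  });
}
```

When run as a regular script, the guarded block is skipped and only the initial `refreshConfig()` call takes effect.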
During discussions with @acutmore, it seemed that while this can be worked around using the approach described above, it would be better if Node.js provided more utilities for working with environment variables. I think some APIs may be useful:
- Some CLI flags to trace access to environment variables, so that users can have a better idea of which variables are used and potentially cached during the snapshot building process, and whether the access needs to be delayed to run time or at least refreshed during deserialization. This seems useful in general, not necessarily limited to the startup snapshot feature.
- An API that passes a list of the environment variables accessed by the builder script to a user-provided serialization callback. Users can, for example, store a copy of those (encrypting sensitive values if necessary) as part of the snapshot, and specify a deserialize callback to check for mismatches against that copy or adjust accordingly. They could also use that API to generate a dotenv file to ensure that some important environment variables are always configured consistently with what the builder script used. A rough sketch looks like this (more ideas welcome!):
    v8.startupSnapshot.getAccessedEnvironment((envVars) => {
      // envVars is an array containing the environment variables accessed
      // during snapshot building
      const valuesFromBuild = {};
      // Users can special-case certain keys and encrypt the values if necessary.
      for (const key of envVars) {
        valuesFromBuild[key] = process.env[key];
      }
      v8.startupSnapshot.addDeserializeCallback(() => {
        if (process.env.IMPORTANT_KEY !== valuesFromBuild.IMPORTANT_KEY) {
          // Mismatch, refresh the states, or throw?
        }
      });
      // Users can also instruct the deserialized main function to go another route if
      // there are important mismatches.
    });
- Some field in the snapshot configuration that can be used to allow-list access to environment variables in the builder script, or some field that can be used to run an alternative main script as a fallback if certain environment variables mismatch (this should be opt-in, since serializing environment variables can store sensitive information in the blob). This is also a rough sketch of what the general idea may look like:
    {
      "envInBuilder": {
        "allowed": [...],
        // Once specified, if the builder script accesses any environment variable
        // that's not in the list, the building process exits with a non-zero exit code.
      }
    }
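Until something like the proposals above exists, the tracing and allow-list ideas can be roughly approximated in user land with a `Proxy`. The sketch below is only an illustration under stated assumptions: `traceEnv` is a hypothetical helper, not a Node.js API, it only sees reads made through the wrapper (not reads by Node.js internals or by code that touches `process.env` directly), and it throws on a disallowed key instead of exiting with a non-zero exit code.

```javascript
// Hypothetical helper: wraps an env-like object and records which keys are read.
// If `allowed` is an array, reading any key outside it throws.
function traceEnv(env, allowed = null) {
  const accessed = new Set();
  const proxy = new Proxy(env, {
    get(target, key) {
      if (typeof key === 'string') {
        if (allowed !== null && !allowed.includes(key)) {
          throw new Error(
            `Access to environment variable "${key}" is not allowed in the builder script`
          );
        }
        accessed.add(key);
      }
      return target[key];
    },
  });
  return { env: proxy, accessed };
}

// Work on a copy so the example does not depend on the real environment.
const { env, accessed } = traceEnv({ ...process.env, IMPORTANT_KEY: 'from-build' });

// The builder script reads variables through the wrapper ...
const important = env.IMPORTANT_KEY;

// ... and the recorded keys can later be stored or compared at deserialization.
console.log([...accessed], important);
```

A builder script could pass the recorded keys to a deserialize callback to compare build-time and run-time values, in the spirit of the `getAccessedEnvironment` sketch above.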