security-wg
Permission Model
Permission Model initial issue
Hello everybody!
Following up on the Security Model initiative and the Mini Summit (Next 10) in April, there seems to be a consensus that Node.js should have a permission system, preventing third-party libraries from accessing machine resources without user consent.
This system was previously researched by James Snell and Anna Henningsen [1], which resulted in excellent material to use as a starting point.
For context, the material is available through those links:
- Adding a permission system to Node.js blog post
- https://github.com/nodejs/node/pull/33504
- https://github.com/nodejs/node/pull/22112
Constraints
This security model is not bulletproof; there are constraints we should agree on before implementing this system:
- It's not a sandbox; we assume the user trusts the running code.
  - Example: the user sets the flag `--allow-fs-read` without specifying a scope. Any external library can then make use of it, leading to a potential exploit (in this case, with the user's consent).
- Ideally, no breaking changes.
- It must add a low/no overhead when disabled and low overhead when enabled.
Points to be discussed
This is a big topic that could lead to several discussion topics. To avoid unnecessary discussion at this moment, let's use the following three topics as boundaries for this first iteration.
- Should it be process or module scoped?

Is the effort to make it work with modules worth it? Example:
```json
{
  "name": "package-name",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "permissions": [
    "fs"
  ]
}
```
Then the user would provide permissions for each module (as you do for smartphone apps).
- Should the user be able to change the permissions in runtime?
If yes, how does it behave in an Asynchronous Context? Example:
```js
const { setTimeout } = require('timers/promises');

console.log(process.policy.check('net')); // true

process.nextTick(() => {
  process.policy.deny('net');
  setTimeout(1000).then(() => {
    console.log(process.policy.check('net')); // false
  });
});

process.nextTick(() => {
  setTimeout(1000).then(() => {
    console.log(process.policy.check('net')); // ???
  });
});
```
- What would be the desired granularity level?
`--policy-deny=fs` or `--policy-deny=fs.in`? `--policy-deny=net` or `--policy-deny=net.tcp.in`?
- Should the user be capable of restricting specific TCP ports, for instance, denying all inbound packets on port 3000?
- Is Deno's granularity enough? (`--allow-read`, `--allow-net`)
cc/ @jasnell @Qard @mcollina @mhdawson
Thanks for opening the topic. For questions 1 and 2, I have two concerns about the direction of the answers:
- Should it be process or module scoped?
tbh, I am more scared by the performance hit or feasibility of a module-scoped approach. Do we have guidelines on how to reach that?
- Should the user be able to change the permissions in runtime?
If so, we need to find a way to make sure potential malicious code does not rewrite policies to its own interests at runtime.
tbh, I am more scared by the performance hit or feasibility of a module-scoped approach. Do we have guidelines on how to reach that?
Yes, being module-scoped would certainly add significant complexity. I'm not sure about the performance impact, though. I think it's feasible to look at a possible implementation and see whether it's at least reasonable to do in core.
If so, we need to find a way to make sure potential malicious code does not rewrite policies to its own interests at runtime.
Yes, good point.
Maybe we can start with an MVP and then iterate on it as needs arise. What do you think? I mean, I can start an MVP with:
- Process scoped
- No API to change permissions in runtime
- Broad granularity level (same as Deno does)
Then, as this discussion evolves, we can assess the viability of further updates.
Thanks for the thread @RafaelGSS! 🙌
Should it be process or module scoped?
I suggest the process model; it keeps a simpler scope for the MVP.
There are some complex scenarios if we want to support modules, as it seems hard to deal with a big dependency tree. For example: my project depends on `package-a` and `package-b`. I want `package-a` to be able to modify files but not `package-b`, so my module settings are:
```json
[
  {
    "name": "package-a",
    "permissions": [ "fs" ]
  },
  {
    "name": "package-b",
    "permissions": null
  }
]
```
But then suppose `package-a` and `package-b` both depend on `package-c`: how does the module policy work in this scenario? Using `fs` from `package-a` should be fine, but not from `package-b`.
Should the user be able to change the permissions in runtime?
I agree with @vdeturckheim that malicious packages can make modifications. On the other hand, we might need that flexibility in some cases, so maybe we can assume that by default it is not possible to modify the policies without restarting the application, unless you specifically allow that behaviour (and assume the risk) with a specific flag like `--allow-policy-changes`.
What would be the desired granularity level?
+1 to the Deno model (at least for the first iteration); then we can see whether there is solid feedback justifying more complex features.
For 1, I think process-scoped is the only viable answer. Module-scoped is going to be far too complex and disruptive to do, and there's really no clear must-have use case that would justify it.
For 2, I would argue that a model where permissions can only be restricted at runtime would be appropriate. That is, the process starts with a given set of permissions, and within the code there are options for restricting permissions further: either something like process.permissions.deny('net'), which immediately turns it off for the entire process, or something like process.permissions.deny('net', () => { ... }), where the callback passed in runs with the more restrictive permissions. My preference is the former.
For 3, keeping the model similar to Deno's is great but it's going to be very difficult to change later if we don't make it granular enough. Let's be sure to give this particular bit a good deal of thought.
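The drop-only model described above can be sketched in plain JavaScript. This is purely illustrative: `PermissionSet`, `check`, `deny`, and `grant` are hypothetical names, not a proposed Node.js API.

```javascript
// Sketch of a drop-only (monotonic) permission set: privileges can be
// denied at runtime but never re-granted. All names are hypothetical.
class PermissionSet {
  constructor(granted) {
    this.granted = new Set(granted); // permissions the process started with
  }
  check(name) {
    return this.granted.has(name);
  }
  deny(name) {
    // Dropping a privilege takes effect immediately, process-wide.
    this.granted.delete(name);
  }
  grant(name) {
    // Re-granting at runtime is deliberately unsupported.
    throw new Error(`cannot grant "${name}" at runtime`);
  }
}

const policy = new PermissionSet(['fs', 'net']);
console.log(policy.check('net')); // true
policy.deny('net');
console.log(policy.check('net')); // false
```

The key property is monotonicity: once a permission is dropped, no code path can restore it, which is what makes runtime mutation safe against the malicious-rewrite concern raised earlier in the thread.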
- Should it be process or module scoped?
Multiple threads could run with different policies (with caveats), but that's not terribly different from process-level policies. So yes, process scoped. Module-level policies aren't possible to enforce without non-trivial barriers between modules, and even at that point you run into logical nightmares, as pointed out by @UlisesGascon.
- Should the user be able to change the permissions in runtime?
Much like with other well-known and well-used permissions systems, code ought to be able to decide it can drop privileges, but never be able to grant itself any expanded privileges.
If yes, how does it behave in an Asynchronous Context?
It should be applied synchronously, much like other things that modify process-level behaviour do today. For example, changing environment variables or using process.setuid() are synchronous operations that take effect immediately. Attempting to have policies follow through with async context (like AsyncLocalStorage does) means invoking async_hooks and the performance hit from that, and ensuring the correctness of async_hooks, even in weird security scenarios, which is non-trivial at the very least.
If I'm reading the example correctly, both calls to check() happen at least 1000ms after deny() has been called, so under my suggestion, the second check would also return false.
- What would be the desired granularity level?
Filters on any given API should eventually give the ability to filter on individual operations and individual resources. For prior art on this, see https://web.archive.org/web/20190821102906/https://intrinsic.com/docs/latest/index.html, which describes a (now-defunct, but previously working) policy system for Node.js.
Blocking entire subsystems is fine for an MVP, so long as the configuration format doesn't preclude more detailed filtering later.
I think we must provide some level of restrictions on the file system or what host/port a process can connect or listen to.
As an example, I might want to limit the folders in which a process writes files to the tree of folders under the nearest package.json. This would severely limit the attack surface of a malicious dependency.
What were the points that actually blocked the previous iterations? Given the current discussion, I am under the impression that at least one prior PR did most of what we are heading toward?
The thing that stopped the prior effort was lack of engagement... We couldn't get anyone else to engage in the conversation enough to move it forward. Hopefully the timing is better this time
What would happen if we start node with --policy-deny=net and the main process spawns a child process? Will the child process be able to access the net?
@RaisinTen ... see the discussion on that point here:
A core part of Node.js is the ability to spawn child processes and load native addons. It does no good for Node.js to restrict a process’s ability to access the filesystem if the script can just turn around and spawn a separate process that does not have that same restriction! Likewise, given that native addons can directly link to system libraries and execute system calls, they can be used to completely bypass any permissions that have been denied.
The assumption, then, is that explicitly denying any permission should also implicitly deny other permissions that could be used to bypass those restrictions. In the above example, invoking the node binary with --policy-deny=net would also restrict access to loading native addons and spawning child processes. The --policy-grant would be used to explicitly re-enable those implicitly denied permissions if necessary.
In other words, a process that has --policy-deny=net will not be able to spawn a child process by default. If the user explicitly allows it to spawn a child process, then it will be the user's responsibility to pass along the correct arguments.
Personally I would be really interested in some sort of module-scoped solution, though I feel like that would likely be very complicated to manage given you'd have to handle permission delegation across the dependency graph. You'd probably need not only permissions to use certain things in a module but then also permission to delegate those permissions to its dependencies, and the config would likely get super complicated.
I feel like process-scoped permissions are too simplistic though. A user might turn on net access because one module needs it but then some other unrelated module does something nefarious and doesn't get blocked because net was allowed. 🤔
Perhaps an import/require-based system where a module needs permission to load another module? So it would just be unable to attempt to load a module it's not supposed to be able to use, throwing a permission error on import/require. It wouldn't be too terribly difficult to specify something like --allow=http:fastify,ws to specify that the http module should be allowed to be used directly by the fastify or ws modules, but no others.
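As a rough illustration of what parsing such a flag might look like: the `--allow=http:fastify,ws` syntax is the hypothetical proposed above, and `parseAllowFlag` is an invented helper, not an existing Node.js option.

```javascript
// Sketch: parse a hypothetical "--allow=http:fastify,ws" style value into a
// map of core module -> set of packages allowed to require it directly.
// Multiple rules could be separated by semicolons, e.g. "http:fastify;fs:tap".
function parseAllowFlag(value) {
  const map = new Map();
  for (const entry of value.split(';')) {
    const [mod, pkgs] = entry.split(':');
    map.set(mod, new Set(pkgs ? pkgs.split(',') : []));
  }
  return map;
}

const allowed = parseAllowFlag('http:fastify,ws');
console.log(allowed.get('http').has('fastify')); // true
console.log(allowed.get('http').has('left-pad')); // false
```

The hard part is not the parsing but reliably attributing a `require`/`import` call to a specific package, which runs into the stack-inspection and capability-leak problems discussed below.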
I feel like process-scoped permissions are too simplistic though. A user might turn on net access because one module needs it but then some other unrelated module does something nefarious and doesn't get blocked because net was allowed. 🤔
Perhaps an import/require-based system where a module needs permission to load another module? So it would just be unable to attempt to load a module it's not supposed to be able to use, throwing a permission error on import/require. It wouldn't be too terribly difficult to specify something like --allow=http:fastify,ws to specify that the http module should be allowed to be used directly by the fastify or ws modules, but no others.
As described by others, I'm not sure the complexity of handling module-scoped permissions is worth it. It seems the user can prevent the above behavior by specifying the allowed net port. For instance, with --allow-http=3000, even if another unrelated module does something nefarious, it wouldn't be allowed, since the port is already in use and only 3000 is allowed.
It seems we have a consensus on an MVP with:
- Process scoped
- Only drop privileges in runtime
- Flexible granularity (`--policy-deny-net` or `--policy-deny-net=3000`; `--policy-deny-fs` or `--policy-deny-fs=/usr/sbin`)
Are we all in agreement on that? If so, I'll create a fresh PR (probably reusing most of the work @jasnell did in https://github.com/nodejs/node/pull/33504) and then we can iterate on it.
EDIT: We can discuss the nomenclature (`policy-deny=` vs. `policy-deny-net`) in the PR.
@qard
Any permission system with support for multiple sets of permissions requires sufficient separation between the permission sets and the units of isolation they apply to. This means the units of isolation cannot communicate with each other except through strictly controlled interfaces (meaning no shared objects, not even globals). Without this, privileges can easily leak between the units of isolation. We simply do not have this between modules, and any sensible approach to having this between modules requires massive changes in the way we both import and call code from other modules.
To illustrate this, consider a module a.mjs and module b.mjs. Assume that through whatever permission system we create, we give a access to net, but deny that to b. A poorly written or malicious a could just assign all the relevant net methods to the global, and then b can use it.
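A minimal runnable sketch of that leak, with the module boundary simulated by plain functions in a single file and `connect` standing in for a privileged net API (all names illustrative):

```javascript
// Illustration of the capability leak: even if "b" is denied direct access
// to a privileged function, "a" can leak it through shared global state.
// The permission system itself is not modeled here; only the leak is.

function connect(host) { // stands in for a privileged net API
  return `connected to ${host}`;
}

function moduleA() {
  // a.mjs has 'net' permission, but (carelessly or maliciously)
  // publishes the privileged function on the shared global object.
  globalThis.leakedConnect = connect;
}

function moduleB() {
  // b.mjs was denied 'net', yet can reach the API through the global.
  return globalThis.leakedConnect('example.com');
}

moduleA();
console.log(moduleB()); // "connected to example.com"
```

Because both modules share one global object (and one object graph generally), nothing stops the privileged reference from crossing the module boundary, which is the core obstacle to module-scoped enforcement.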
A common suggestion is to track module usage via the stack, but this is not performant at all. Even if it were, there are other ways to share functionality and have the calling module appear to be the one with adequate permissions. You'd have to block transitive access, and then the number of related problems here starts expanding.
Sure, there's likely always going to be ways to get around it. But process-scoped just means rather than needing to do some environment hacking if you aren't the specific module the user intended to have access, you will just have access automatically by inheriting it from the process-wide config needed for something else. So you really aren't protecting much of anything if users are just always turning on net access. It seems to me that to adequately control I/O you need either module-level blocking of use in some way or super granular process-wide control of interfaces like saying http requests can only be made to this specific list of URLs or file system access only allows reading these specific files and writing these other specific files.
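The second option, resource-level filtering, could look something like an origin allowlist consulted before any outgoing request. The names and the allowlist contents here are illustrative only; nothing like this is part of the actual proposal yet.

```javascript
// Sketch of resource-level filtering: outgoing requests are checked against
// an allowlist of URL origins before being performed. Hypothetical names.
const ALLOWED_ORIGINS = new Set([
  'https://api.example.com',
  'https://registry.npmjs.org',
]);

function requestAllowed(url) {
  try {
    // Compare origins (scheme + host + port), not raw string prefixes.
    return ALLOWED_ORIGINS.has(new URL(url).origin);
  } catch {
    return false; // unparsable URLs are denied
  }
}

console.log(requestAllowed('https://api.example.com/v1/users')); // true
console.log(requestAllowed('https://evil.example.net/exfil'));   // false
```

Checking the parsed origin rather than a string prefix avoids bypasses like `https://api.example.com.evil.net`.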
I don't think we can escape significant complexity without making any sort of policy system essentially useless.
To deal with more granular levels like per-module, what we would need is proper sandboxing of isolates and global scopes. We could set it up so that worker_threads can be launched with either the same or more restrictive permissions than the current thread; and if we ever introduced a properly sandboxed isolate mechanism, then we could launch those with their own permissions as well.
@Qard
It seems to me that to adequately control I/O you need either module-level blocking of use in some way or super granular process-wide control of interfaces like saying http requests can only be made to this specific list of URLs or file system access only allows reading these specific files and writing these other specific files.
Yep! And I'm saying the latter is the only actually feasible one. The approach currently suggested by @RafaelGSS seems less granular than the ideal, but it keeps room open for it.
@jasnell We could do that, but to then have it be per module we'd have to consider:
- The weight of all those isolates.
- Providing and securing the interfaces between the isolates (non-trivial).
Yep, which is not a level of complexity we should take on initially.
Could probably make some changes to the vm module to make contexts more isolated and attach security controls to that. I'm definitely getting a sense though that whatever we do we can't really escape that it's going to be a whole lot of work and require substantial changes to many parts of the platform to reach any level of maturity. For sure we can start with some MVP, but I suspect that won't actually be all that useful short of serving as a proof-of-concept.
This is a great discussion, and this effort is a great way to raise the overall security bar in the Node.js ecosystem.
I noticed that this discussion became very detailed very quickly. I'd like to highlight some areas that would be good to address in parallel with the implementation work.
Community outreach and documentation
Introducing fundamental security changes to a mature technology is a delicate task. The new model may be surprising to a lot of Node.js developers. I think that in parallel with implementing the feature we should be thinking about how to explain the changes to developers, how to convince them it is safe to use and valuable for their applications. We should have a good idea about how to guide them when to use the new mechanisms and how to introduce them gradually into existing applications.
Threat model
I wonder if there has been any prior work done in documenting specific threats and attacks this work would prevent? I feel having a well developed threat model would allow us to better communicate the value proposition as well as the limitations of the solution.
Security testing
The threat models should clearly indicate what types of attacks will be prevented when this work has landed in Node.js. I think it would be super valuable if we had a plan on how to engage the security research community during development of these changes to help us find weak spots and identify limitations of the approach and the implementation.
I know these issues are somewhat tangential to the actual implementation work, and I'll be more than happy to start separate issues to discuss them in more depth without disrupting this discussion.
Great approach @MarcinHoppe! 🙌
We are running a parallel discussion in Slack regarding the documentation of the Security Model, threat models, and the current state of the art. I prepared a proposal in web format to start the discussion in the Security WG (how to structure the documentation, what the content should be...). It would be fantastic if you could join the discussion and help us validate and evolve the proposal 🙏
I'll chime in on that discussion on Slack!
@MarcinHoppe Thank you! This strategy is also documented by https://github.com/nodejs/node/pull/42709.
IMO this is security theatre. There are two audiences for runtime security: people running workloads on the cloud, and people running local dev tools on their machines. With the cloud, we can't rely on runtime security for governance. Locking down Node won't do anything for the overarching infrastructure requirements, so in most cases the vendor will have a concept called IAM, which enables completely granular security surrounding the runtime. Said another way, this feature won't make sense for the cloud. That leaves devs running local dev tools. This audience is even less likely to use granular permissions to get the job done and will more likely run with --allow-all or the equivalent.
Overall I'm against adding more complexity without much concrete benefit, especially in the code-loading paths, which are already too slow compared to other JS runtimes.
That's a fair point. I think the parallel discussion on Slack will bring relevant points to this point of view. Still, I don't think we can assume that every Node.js app running in the cloud secures itself with IAM, nor that local dev tools would always run with --allow-all.
The developer environment is a relevant security threat, and Node.js doesn't provide a security mechanism to prevent malicious packages from accessing sensitive host data.
Threat model
This was discussed in the mini-summit, and I think defining/documenting the security model as captured in https://github.com/nodejs/node/blob/master/doc/contributing/security-model-strategy.md#document-the-security-model, and then the follow-on https://github.com/nodejs/node/blob/master/doc/contributing/security-model-strategy.md#document-threat-models-and-current-state-of-the-art, are steps that benefit the Node.js project even if we don't enhance functionality. They might also be the foundation for what @MarcinHoppe mentions as important for motivating/explaining any enhanced functionality. I think focusing on those to start will deliver value and make future discussion about specific enhancements easier.
Hey there!
To follow up on this, I've been working on implementing an MVP for the FileSystem access in the past weeks.
This is the roadmap I have in mind (feel free to edit in case I'm missing something):
- [x] Create `process.policy.check` API
- [x] Create `process.policy.deny` API
- [x] Restrict the FileSystem API when an explicit deny option is set
- [x] Make it more granular, e.g. `fs.in=/home/user/allowed-read-folder,fs.out=/home/user/allowed-write-folder`
This is the diff so far: https://github.com/nodejs/node/compare/master...RafaelGSS:feat/permission-system?expand=1.
Along with this implementation, I have created a document that aims to drive this API design. It's currently discussed in every Security WG Meeting (feel free to join if you are interested in this feature) - This is still under development.
Remember: no decisions on nomenclature have been made, so please don't pollute the issue with concerns about that. The MVP scope is: https://github.com/nodejs/security-wg/issues/791#issuecomment-1106581564
See: https://github.com/nodejs/node/pull/44004
It’s not a sandbox, we assume the user trusts in the running code
@RafaelGSS if we assume the user trusts in the running code, what problem is the permission model trying to solve?
Also, many of the comments discuss protecting against malicious code. Isn't that out of scope based on the assumption the user trusts in the running code?
Perhaps the user partially trusts the running code, and the feature is to let the user specify what the running code should or should not be trusted with (ex: trust it to read a document to process but not to access ~/.ssh or the network). In which case escape from the restrictions imposed should be impossible, even for malicious code, but granting the ability to launch other processes is granting the ability to escape restrictions. Is that the goal?
FWIW, @arhart's question was answered in the Node.js Security WG.
@GeoffreyBooth I'm moving the concern you raised to this issue. It seems an appropriate place to have this discussion.
Thanks. I would also take a look at https://deno.land/manual/getting_started/permissions, there are a lot of good ideas there. For example, they separate the permissions for filesystem read and filesystem write; that's probably something we should do too. For network access, if it's possible to create separate permissions for full network access versus only responding to incoming requests, that would be a good distinction too. Such a permission would let you spin up a webserver that wouldn't be capable of exfiltrating data, like if one of your dependencies stole your environment variables on startup and posted them somewhere. (It would still be vulnerable to data exfiltration as part of responding to a request, but at least the other attack vector would be denied.)
Before creating this PR, I did a long study leveraging other resources as part of the foundation (it is described in this issue, and the Deno permission system was also mentioned). Other resources, such as inbound network access, will be handled in further pull requests. The way the code was designed, any additional module should be easy to implement.
Honestly, I don’t see the existing feature overlapping with this one. As said above, they might have a similar behaviour, but the purpose is different.
I think while they’re both experimental it’s not urgent to resolve the differences between the two; but I think a coherent user experience for users using both features together needs to be something that we prioritize as you’re developing this. It feels to me like something that should be worked out early on, in a design phase, not after you’ve already landed the first PR; but I won’t block on those grounds.
In general I think it’s bad UX to have two features that achieve the same result, and have similar-seeming intentions, but with completely different methods (a flag versus a config file). I understand that the two features don’t fully overlap, that each one does some things that the other doesn’t, but in a way that makes this even harder to resolve because you can’t just replace one with the other. I want to resolve the conflict of how to handle what overlap they do have as early as possible so that development can continue on both without significant breaking changes or refactoring needed by either team. I’m not sure I can attend the next security meeting; perhaps just open a discussion issue and we can hash it out async?
It's extremely important to mention that those features don't achieve the same result. The policy system acts at application bootstrap, letting you deny access to modules; the permission system instead allows the user to deny access to resources. Restricting which modules can be loaded is a separate problem from limiting what actions they can take on the operating system. To enhance this discussion, I would love to hear @bmeck's thoughts too.
IMO this feature (the permission system) doesn't overlap with the policy system at all. I see it as a new security mechanism for developers. Both features can be used in a single application, each with its own purpose.