rfcs [RRFC] Clean up file ownership story

[RRFC] Clean up file ownership story

Open ruyadorno opened this issue 2 years ago • 30 comments

We currently have an entire category of issue reports around unexpected changing of file ownership. We need a place to start the discussion on what we can do to improve the story around file ownership management: Maybe we need less of it these days?

Opening this RRFC so that we can have a place to share thoughts, link related issues to and hopefully eventually evolve towards an RFC if we come up to the decision that changes are necessary.

cc @nlf @fritzy

Mar 03 '22 21:03 ruyadorno

Auditing and aligning this sounds great.

I would expect that npm has full ownership (and thus, full freedom to change whatever it likes) of any node_modules, and $HOME/.cache, and things inside npm root -g, and its own installation folder.

Things a user is explicitly asking npm to change - .npmrc, package.json, lockfiles - npm of course can change, but shouldn't change ownership (it should error, if that is required; warn if not)

I would also expect that any other file is something npm is not "allowed" to change in any way.

Mar 03 '22 21:03 ljharb

AFAIK the existing file ownership changes are so that things like sudo npm install in a directory that's owned by your user doesn't write a bunch of files owned by root, both to your project and potentially to your cache.

This doesn't feel like something we should be attempting to forcibly fix on a user's behalf. The stat calls to determine ownership, and the chown calls to modify the ownership when it differs from what we infer are not free. We are most certainly taking a performance hit with these, especially in environments like Docker where filesystem i/o can be significantly slower.

Attempting to change permissions of things like the directory global bin scripts are located in can be absolutely disastrous.

Is there prior art for the type of permissions changes we attempt? I suspect there's not, but personally I also pretty strongly believe we shouldn't be attempting to change ownership ourselves.

Mar 03 '22 21:03 nlf

Is it possible to make sudo npm install immediately exit with an error?

Mar 03 '22 21:03 ljharb

potentially, but that wouldn't stop things like a user elevating their permissions to root manually and installing there. we absolutely can't explode when npm is run as root unconditionally, as that's often the case in a containerized environment.

Mar 03 '22 21:03 nlf

a suggestion: what if npm doctor can help diagnose this class of problem? i.e. if some other npm command errors with something related to filesystem permissions, our error output can suggest the user runs npm doctor. that would then do things like examine the permissions of files in node_modules and the cache directory and either link to docs or display in-line how the user can correct the permissions themselves

Mar 03 '22 22:03 nlf

after actually running npm doctor myself for the first time in quite a while, it already does permissions checks

Perms check on cached files         ok
Perms check on local node_modules   ok
Perms check on global node_modules  ok
Perms check on local bin folder     ok
Perms check on global bin folder    ok

so all that's missing is making our error output suggest running doctor, and doctor telling you how to fix the problem

Mar 03 '22 22:03 nlf

We could have a --no-root-user default for non-global installs, and then ci users would have to run --root-user. In the same release, we'd drop all ownership checks, and just write files as the current user. This would, of course, be a major breaking change.

ENOROOT: running `npm` as root is discouraged. Use `--root-user`to override. [link to docs]

Mar 04 '22 00:03 fritzy

adding this as another data point https://github.com/npm/cli/issues/4312

Mar 08 '22 20:03 nlf

Rather than dropping perms to nobody in this circumstance (which breaks compilation or yields surprising ownership issues), can we drop perms to $SUDO_USER instead?

Sent from my iPhone

On Mar 3, 2022, at 19:08, Nathan Fritz @.***> wrote:

We could have a --no-root-user default for non-global installs, and then ci users would have to run --root-user. In the same release, we'd drop all ownership checks, and just write files as the current user. This would, of course, be a major breaking change.

ENOROOT: running npm as root is discouraged. Use --root-userto override. [link to docs]

— Reply to this email directly, view it on GitHub https://github.com/npm/rfcs/issues/546#issuecomment-1058686129, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABTIUCYLNTMJ6UH4VVY2R3U6FH7LANCNFSM5P3WILHA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you are subscribed to this thread.Message ID: @.***>

Mar 08 '22 21:03 wesgarland

Rather than dropping perms to nobody in this circumstance (which breaks compilation or yields surprising ownership issues), can we drop perms to $SUDO_USER instead?

I think that's not a good idea, not everyone is using sudo. If I ssh root@myserver this var won't be set because I'm "really" root. This is not a good practice, but it's the default on many hosters

Mar 15 '22 16:03 theredcat

Adding two more references :

https://github.com/npm/cli/issues/4010
https://github.com/npm/cli/issues/3705

Mar 15 '22 16:03 theredcat

how about this for a default behavior:

flowchart TD
A[root runs command] --> B{SUDO_UID/SUDO_GID are set?};
B -- Yes --> C[setuid/setgid to SUDO_UID/SUDO_GID] --> E;
B -- No --> D{--allow-root passed?};
D -- Yes --> Z[log warning] --> E[proceed running command];
D -- No --> F[log error, exit];

when logging an error and exiting, we could use a message along the lines of

npm ERR Running npm commands as root is considered dangerous.
To run as a different user, you may provide the '--uid' and '--gid' flags.
If you're absolutely sure you want to run this command as root use the '--allow-root' flag.
For more information: https://some/link/to/the/docs

and the warning could read something like

npm WARN Running npm as root. For more information: https://some/link/to/the/docs

May 02 '22 23:05 nlf

@nlf You didn't wrote it, but I assume that you are suggesting the above default behavior + removing npm chmod ?

If that's the case it seems pretty nice ! It will break a lot of misconfigured docker containers but I think this is not a problem since no use application should be run as root, even in a container.

People will discover the USER docker directive and that's for the best :)

May 03 '22 06:05 theredcat

IMO a flag would be a simple yet effective solution to this problem. I'm thinking about something like --no-infer flag that simply skip the inferOwner call here

May 04 '22 06:05 robertsLando

Leaving this on the agenda for today, @nlf would be great if you could speak bit to your diagram/approach discussed here in the call but understand if you can't make it.

May 11 '22 15:05 darcyclarke

I'm extremely frustrated with this permission decision. Docker volume mounts use the UID:GID of the host machine. The changes npm has made to infer the execution user from the UID:GID effectively break docker setups where the host users UID:GID does not match the node user on the container. Setting UID:GID in a .env file for each developer in our application is cumbersome and overall ridiculous. Stop trying to infer best security practices when it comes to running scripts, your job is to be a package manager

May 12 '22 22:05 javabudd

I'm extremely frustrated with this permission decision. Docker volume mounts use the UID:GID of the host machine. The changes npm has made to infer the execution user from the UID:GID effectively break docker setups where the host users UID:GID does not match the node user on the container. Setting UID:GID in a .env file for each developer in our application is cumbersome and overall ridiculous. Stop trying to infer best security practices when it comes to running scripts, your job is to be a package manager

to be very clear this permissions decision was made many years ago, it is by no means a new thing, and that's exactly why this issue exists. this is something that we know to cause problems, so we're trying to determine the best way to fix that without immediately causing builds to break for people who have worked around the problem already. we'll be starting to roll out some changes to help folks in your situation in the near future.

May 12 '22 22:05 nlf

Actions

[ ] @nlf / @ruyadorno to make an RFC for these changes

May 18 '22 18:05 darcyclarke

Hey all,

having spent a whole day debugging an issue caused by silently setting the effective UID of npm subprocesses to <owner of the current directory> yesterday I'd like to propose an alternative to this behavior:

we absolutely can't explode when npm is run as root unconditionally, as that's often the case in a containerized environment

That's fair, but how about exploding when npm is run as root and the current directory is owned by non-root? This seems strictly preferable to silently switching users.

Jun 09 '22 12:06 jstasiak

I also agree that silently switching users is a great source of pain. I for example lost about one hour because of this. Please find a solution that stops doing this, thank you all very much :)

Jul 28 '22 14:07 papb

I think npm shouldn't be looking at its own UID, it should not modify file ownerships and permissions, and it should specially not switch users after it's called. That last point, when run as root, breaks long standing assumptions of a process behavior.

It's up to the caller of npm to decide what user they want to run it as, even if their decision is bad.

If I run npm install as root, npm will create files in node_modules and in my cache that belong to root. That is not a bug in npm, and it's not something for npm to fix. If I don't want to have those files owned by root, then I shouldn't be running npm install as root to begin with.

Aug 12 '22 17:08 ateijelo

@javabudd We face(d) a similar issue.

I'm extremely frustrated with this permission decision. Docker volume mounts use the UID:GID of the host machine. The changes npm has made to infer the execution user from the UID:GID effectively break docker setups where the host users UID:GID does not match the node user on the container. Setting UID:GID in a .env file for each developer in our application is cumbersome and overall ridiculous. Stop trying to infer best security practices when it comes to running scripts, your job is to be a package manager

echo "module.exports.sync = function() { return { uid: 0, gid: 0 } };" >> $(npm root -g)/npm/node_modules/infer-owner/index.js

This is the workaround we cooked up on a ubuntu based docker (so your path may be different)

It modifies the behavior of how npm detects which user it should run as. We this issue when using webpack in a VOLUME mount, while writing the output to a another, in-container only folder owned by root. In our case, forcing UID=0 is enough, so we echo append that to the module.

(We are still evaluating this solution, but looks workable)

Aug 25 '22 13:08 slickss-nsul

Still unclear, why NPM tries to lchown files that have nothing to do with Node. In my case, it is a link in /usr/local/bin to Microsoft Defender Antivirus. Please fix it.

% sudo npm install -g [email protected] Password: npm ERR! code EPERM npm ERR! syscall lchown npm ERR! path /usr/local/bin/mdatp npm ERR! errno -1 npm ERR! Error: EPERM: operation not permitted, lchown '/usr/local/bin/mdatp' npm ERR! [Error: EPERM: operation not permitted, lchown '/usr/local/bin/mdatp'] { npm ERR! errno: -1, npm ERR! code: 'EPERM', npm ERR! syscall: 'lchown', npm ERR! path: '/usr/local/bin/mdatp' npm ERR! }

% ls -l /usr/local/bin/mdatp lrwxr-xr-x 1 root admin 78 Aug 23 13:37 /usr/local/bin/mdatp -> /Applications/Microsoft Defender.app/Contents/Resources/Tools/wdavdaemonclient

Sep 23 '22 20:09 gnida-rada

From what I can see, it goes back to the times when it was suggested that npm should be run as root. The idea was that then npm can switch to nobody when running package scripts. A couple more of relevant links.

Was that a good idea or not?.. I personally think that npm shouldn't decide under which user to run whatever a user wants to run (or at least provide an escape hatch, at some point it existed). That makes npm more complex, its behavior more nuanced, and it's not npm's responsibility (npm is not in a position to do that).

I was beaten by it a couple of times, and it always took me a while to figure out what's going on. Today it was like this (in a docker container):

$ npx serve
sh: serve: Permission denied

sh? Permission denied?

If that's the case it seems pretty nice ! It will break a lot of misconfigured docker containers but I think this is not a problem since no use application should be run as root, even in a container.

People will discover the USER docker directive and that's for the best :)

So let's say I have a docker-compose project that bind mounts . into a container (to sync what's on the host and what's inside a container). And to make it work I create a user in Dockerfile with UID 1000 (which happens to be my UID on the host), give it to my colleague (which has UID 1001 on his/her host), and now what? I have to make people specify their UID to run a project? Doesn't it look like npm takes too much responsibility only to make peoples' lives harder? Compare it to the case where npm doesn't switch users and I have nothing to do on my part.

how about exploding when npm is run as root and the current directory is owned by non-root? This seems strictly preferable to silently switching users.

see the above case

echo "module.exports.sync = function() { return { uid: 0, gid: 0 } };" >> $(npm root -g)/npm/node_modules/infer-owner/index.js

That's a rather radical way to deal with it, but in a way I like it :)

we're trying to determine the best way to fix that without immediately causing builds to break for people who have worked around the problem already

Do you have an idea how to work around it? (Or you're just trying to make it as close to the previous behavior as possible?) After dropping unsafe-perm the only way is probably the one above. So, if we do it quick, noone will notice :) Or why not just drop the whole infer-owner/switch-users thing in the next release? I bet the builds that'll break... that'll make people happy that one workaround less is needed.

Oct 15 '22 07:10 x-yuri

Or why not just drop the whole infer-owner/switch-users thing in the next release?

This is going to be the behavior in npm 9. Wednesday's prerelease should have this available for testing.

Oct 17 '22 14:10 wraithgar

we have stopped altering file ownership at runtime on the user's behalf in npm@9 entirely.

i think we have some follow up work to do here to try to help users not end up in a situation where they have files unexpectedly owned by root. things like disabling the cache and log files if the cache directory is not owned by the same user that's running npm would go a long way towards not making users have to keep running chown -R to fix ownership

Dec 13 '22 22:12 nlf

Thanks for implementing this long overdue change. At last npm is more predictable if ran from root.

However, this is caused us a lot of pain. :) Indeed, our docker builds started failing as soon as we started using npm v9. The official Node documentation does not work any more in some scenarios: https://nodejs.org/en/docs/guides/nodejs-docker-webapp/

For others, who stumbled upon Docker build errors like

npm WARN tar TAR_ENTRY_ERROR EINVAL: invalid argument, fchown

or (for multi stage builds)

failed to copy files: failed to copy directory: Error processing tar file(exit status 1): Container ID 166037 cannot be mapped to a host ID

here is how I fixed it.

The build container should start from this:

FROM node:18 as builder

# Where all the commands are going to be executed
WORKDIR /usr/src/app

# Create user `node`
USER node

# The npm v9 does not automatically chown directofy for us, so we have to do it ourselves
USER root
RUN chown -R node /usr/src/app

# This would run any following npm commands NOT from root. It is highly recommended to never run npm as the root user.
USER node

# Copy all the files from your host (laptop?) current directory (except .dockerignore file matches)
COPY . .

RUN npm ci

And end with this:

# This will make sure the the next COPY --from=build docker command can be successfully executed
USER root
RUN chown -R root:root /usr/src/app

The next container in the multi-stage docker build should be like this:

FROM node:18-alpine

WORKDIR /usr/src/app

# ... other commands go here ...

# Copy files from previous build stage to this (last) build stage
COPY --from=builder /usr/src/app .

# Run the process under this user (because it is root by default)
USER node

Hope this helps someone in the future.

Feb 14 '23 11:02 koresar

@koresar Copying files from Debian to Alpine? That won't work if any of your packages contain anything binary.

USER node doesn't create a user, it switches to a user if it exists. Try USER non-existent-user.

A lot of comments (Clean Code, chapter 4) but the really unclear part is uncommented:

# This will make sure the the next COPY --from=build docker command can be successfully executed
USER root
RUN chown -R root:root /usr/src/app

That is, there's a comment but it doesn't tell why one should avoid doing that. From what I can see, there should be no problem with copying files owned by non-root. And the owner of the destination files is determined on the receiving end.

The way I'd do it (a gist):

FROM node:18-alpine as node_modules
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm install

FROM node:18-alpine
WORKDIR /app
RUN chown node: .
COPY --chown=node --from=node_modules /app/node_modules node_modules
COPY --chown=node . .
USER node

But okay. That's npm@8. It however still works with npm@9:

FROM alpine:3.17 AS node_modules
RUN apk add nodejs npm
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm install

FROM alpine:3.17
RUN apk add nodejs shadow && useradd node
WORKDIR /app
RUN chown node: .
COPY --chown=node --from=node_modules /app/node_modules node_modules
COPY --chown=node . .
USER node

All in all, I don't rule out that I'm missing something here, but the way it looks to me at the moment, in addition to the first issue (Debian -> Alpine), there's a lot of unnecessary movement in your solution. And mine is a better starting point. But if I'm missing something, can you possibly tell under which conditions it'll fail? How to reproduce the issues you mentioned? Imagine a person running into them. Should he/she change the whole file to fix them? Instead of "add this line," "change this line", or "don't do this because..."?

Feb 15 '23 03:02 x-yuri

For past 9 years of using Dockerfile(s) I thought USER node would crate me a user and a group.

I tried USER nonexistentuser as you suggested. And it worked flawlessly. Did you try it yourself? :)

My base image is node:18 (plus its alpine version) because it's the one recommended by OpenJS Foundation. I'm not going to bake my own images with apk and whatnot. Too laborious.

Thank you for your COPY --chown=node tip. That's handy.

Let's not drive away the discussion in this issue from the npm to my coding challenges.

I appreciate your help.

Feb 16 '23 06:02 koresar

@koresar

I tried USER nonexistentuser as you suggested. And it worked flawlessly. Did you try it yourself? :)

FROM node:18-alpine
USER nonexistentuser
RUN set -x && whoami && id

$ docker build .
...
Step 3/3 : RUN set -x && whoami && id
 ---> Running in 5eb31625a61b
unable to find user nonexistentuser: no matching entries in passwd file

Also, from the official docs:

The USER instruction sets the user name (or UID) and optionally the user group (or GID) to use as the default user and group for the remainder of the current stage.

My base image is node:18 (plus its alpine version) because it's the one recommended by OpenJS Foundation. I'm not going to bake my own images with apk and whatnot. Too laborious.

As you probably noticed, my main suggestion was to use node:18-alpine. But apparently my node:18-alpine image had npm@8, and I didn't think to try anything more specific (like node:18.14.0-alpine), so for the sake of the argument I used alpine:3.17. But generally yes, stick to node:... until you can't.

Let's not drive away the discussion in this issue from the npm to my coding challenges.

Coding habits probably? Actually, I was planning to give a more detailed reply back than, but then decided to focus on what I thought was important. Also, I think it's often hard to draw the line between what one thinks is important and what is considered important. So everything here should be taken as my personal opinion, which you might share or not.

One more thing that came to mind. You said, "For those who stumbled into the following issues... Here's how I fixed it: ..." And then you provide what looks like a Dockerfile to start from. Not a diff.

The point being, if you wanted to help people who stumbled into specific issues, you should have told what you changed. Like, "I had this line here, and after I changed it to this, it resolved." Ideally, you should have provided "steps to reproduce the issues," or something along those lines.

If you wanted to provide a Dockerfile to start with (a starting-point Dockerfile), then that is a different story.

And to me it looks like you more or less failed at both (no offense meant). The solution is not clear (what exactly solves the issues). And it's not clear what's missing in my version of Dockerfile. But well, it can probably provide some ideas to both ends.

Feb 21 '23 19:02 x-yuri

rfcs rfcs copied to clipboard

[RRFC] Clean up file ownership story

Actions

rfcs
rfcs copied to clipboard