
Why move node_modules to parent directory?

Open nunovieira opened this issue 4 years ago • 10 comments

https://github.com/microsoft/vscode-docker/blob/018c92815ff2eeed50ec9e9af1d842ebf7a7a329/resources/templates/node/Dockerfile.template#L5 This line intrigues me. Why the mv node_modules ../ part? Why move node_modules up a directory? Should this be in a generic Node.js Dockerfile template?
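For context, the linked template looks roughly like this (a sketch, not the exact file at that commit):

```dockerfile
FROM node:lts-alpine
ENV NODE_ENV=production
WORKDIR /usr/src/app
COPY ["package.json", "package-lock.json*", "./"]
# The line in question: install dependencies, then move node_modules
# one level above the working directory.
RUN npm install --production --silent && mv node_modules ../
COPY . .
EXPOSE 3000
CMD ["npm", "start"]
```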

nunovieira avatar Jun 16 '20 14:06 nunovieira

@philliphoff do you know why we do this?

bwateratmsft avatar Jun 16 '20 14:06 bwateratmsft

It's from before my time on the extension, but my guess is that it was (at least an attempt at) a Docker image build layer optimization. It may also have been done to allow the flexibility to bind-mount the local application source into the container (rather than always build within the container); by keeping the node_modules folder in the parent (in the container as well as locally), you no longer have to worry about platform-specific libraries on the local machine not matching the platform of the container (as described here).

A downside is that there are some (even fairly popular) Node libraries that do not work properly in that arrangement.

philliphoff avatar Jun 16 '20 16:06 philliphoff

I looked through the Git history; it's been this way for a very long time.

@lostintangent was more or less the first to make it this way (up one level), albeit while also changing from npm to Yarn: https://github.com/microsoft/vscode-docker/pull/85 Then it was immediately changed back to npm: https://github.com/microsoft/vscode-docker/pull/86

Since then it's been moved a few times but the moving of node_modules up a level has persisted.

@lostintangent Do you remember the history behind this?

bwateratmsft avatar Jun 16 '20 17:06 bwateratmsft

As @philliphoff mentioned, this was done to allow users to mount a local copy of their codebase into the container, without the local copy needing to include the restored node_modules (which might also be built for an entirely different operating system).

That said, it’s been quite a while, so I’m not sure if this strategy is still appropriate 😁

lostintangent avatar Jun 17 '20 02:06 lostintangent

@nunovieira Does this arrangement affect one of your projects?

bwateratmsft avatar Jun 17 '20 13:06 bwateratmsft

This doesn't affect me because I don't use this template anymore. I start the "dockerization" by taking the Dockerfile from the Node.js guide, which is simpler and better explained. I never move node_modules. For development, I bind-mount the application directory and use an anonymous volume for node_modules. In a compose file, it looks something like:

    volumes:
      - ./app:/usr/src/app:delegated
      - /usr/src/app/node_modules

To get IntelliSense in VS Code, I copy node_modules from the container to the host. That code will never be executed on the host; it's just there so the editor can pick up type definitions and other introspection data.
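Put together, a compose file for that setup might look like this (service name, port, and paths are illustrative):

```yaml
services:
  app:
    build: ./app
    command: npm run dev
    ports:
      - "3000:3000"
    volumes:
      # Bind-mount the source for hot reload...
      - ./app:/usr/src/app:delegated
      # ...but shadow node_modules with an anonymous volume, so the
      # container keeps the copy installed at image build time.
      - /usr/src/app/node_modules
```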

IMO, this template is meant to kick-start initial development. It could have some comments to help first-timers get it to a production-ready state.

nunovieira avatar Jun 17 '20 18:06 nunovieira

Yes, we can look at ways to improve the default Node Dockerfile with additional comments. @uchenkadicode has done quite a bit of that for the Python Dockerfile.

bwateratmsft avatar Jun 17 '20 19:06 bwateratmsft

So according to this blog post, it looks like it's so you can develop for two different systems at once (such as developing for a Windows host and a Linux container). But if you .dockerignore the node_modules/ folder, then moving it up one won't matter anyway? Even if a .dockerignore isn't present, according to the Docker reference, COPY only "copies new files or directories". The word "new" stands out to me and makes it sound like if the file to be copied already exists in the container, it won't be copied, meaning the entire node_modules directory would be skipped anyway?
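For reference, the .dockerignore being described would typically contain entries like these (illustrative, not from the template):

```
node_modules
npm-debug.log
.git
```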

I'm no docker expert, though. I was just curious as to why && mv node_modules ../ is present, and my hunting landed me here...

TheBrenny avatar Jun 29 '21 03:06 TheBrenny

The problem process is:

  1. It's common to bind-mount your host workspace into your container workspace to allow hot reloading (e.g., - ./my-app:/app)
  2. The image is built and node modules are installed
  3. The bind mount from Step 1 is attached, but some or all dependencies are missing if the host node_modules differs from the container's
  4. A common workaround is to use a data volume to exclude node_modules from the Step 1 mount (e.g., - ./my-app/node_modules/). However, an empty node_modules will still exist in the container.
  5. Data volumes (Step 4) are created once, meaning installing dependencies on the host (maybe for the benefit of auto-writing to package.json) will not transfer to the container.
  6. Solutions for Step 5 are:
  7. Step 6 Solution 3 means modules used in the scripts of package.json and installed via node_modules will not work. E.g., nodemon server.js
  8. The solution to Step 7 is to use npx, which (I believe) uses the same node_modules resolution pattern as Step 6 Solution 3

These steps result in a host-container environment with hot-reloading and detection/installation of dependency changes. Yay!

The irony of this is - when starting my new project:

  1. I immediately commented out the line RUN npm install --production --silent && mv node_modules ../ because I didn't understand "Why move node_modules to parent directory?"
  2. I spent hours understanding the problem
  3. In the end, I uncommented the line

I think y'all should leave it. It helps many people without them knowing the benefit.

The only thing I would maybe change is mv node_modules ../ to mv node_modules /, so that an absolute path is used instead of a relative one; the relative path can trip up package.json execution in services like Google Cloud Run

grokpot avatar Jul 27 '22 17:07 grokpot


First of all, thank you for the explanation, @grokpot!

But I'm sorry, I still don't understand why. Can you give me a simple example?

zq0904 avatar Sep 02 '22 06:09 zq0904