rancher-desktop
rancher-desktop copied to clipboard
Long wait and network errors on npm install in Rancher container
Actual Behavior
Running npm ci
with a large number of packages results in over 5 minutes of wait and EHOSTUNREACH
or EAI_AGAIN
errors.
Steps to Reproduce
- Clone the reproduction repo: https://github.com/hdodov/mcve-rancher-npm-hang
- Run
npm run dev
- You should see one of the errors I got
Result
First attempt:
[+] Building 308.6s (8/9) docker:default
=> [payload internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 182B 0.0s
=> [payload internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [payload internal] load metadata for docker.io/library/node:18-bullseye-slim 2.6s
=> [payload internal] load build context 0.0s
=> => transferring context: 72B 0.0s
=> [payload 1/5] FROM docker.io/library/node:18-bullseye-slim@sha256:f06f9ff9dcc5cfec936a7bf085 0.0s
=> CACHED [payload 2/5] WORKDIR /app 0.0s
=> CACHED [payload 3/5] COPY package-lock.json . 0.0s
=> CACHED [payload 4/5] COPY package.json . 0.0s
=> [payload 5/5] RUN npm ci 306.0s
failed to solve: Unavailable: error reading from server: EOF
Second attempt:
[+] Building 624.4s (9/9) FINISHED docker:default
=> [payload internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [payload internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 182B 0.0s
=> [payload internal] load metadata for docker.io/library/node:18-bullseye-slim 3.4s
=> [payload 1/5] FROM docker.io/library/node:18-bullseye-slim@sha256:f06f9ff9dcc5cfec936a7bf085 0.0s
=> [payload internal] load build context 0.0s
=> => transferring context: 72B 0.0s
=> CACHED [payload 2/5] WORKDIR /app 0.0s
=> CACHED [payload 3/5] COPY package-lock.json . 0.0s
=> CACHED [payload 4/5] COPY package.json . 0.0s
=> ERROR [payload 5/5] RUN npm ci 558.7s
------
> [payload 5/5] RUN npm ci:
549.4 npm WARN deprecated [email protected]: Please upgrade to version 7 or higher. Older versions may use Math.random() in certain circumstances, which is known to be problematic. See https://v8.dev/blog/math-random for details.
549.9 npm WARN deprecated [email protected]: Please upgrade to version 7 or higher. Older versions may use Math.random() in certain circumstances, which is known to be problematic. See https://v8.dev/blog/math-random for details.
558.5 npm notice
558.5 npm notice New minor version of npm available! 10.2.3 -> 10.3.0
558.5 npm notice Changelog: <https://github.com/npm/cli/releases/tag/v10.3.0>
558.5 npm notice Run `npm install -g [email protected]` to update!
558.5 npm notice
558.5 npm ERR! code EAI_AGAIN
558.5 npm ERR! syscall getaddrinfo
558.5 npm ERR! errno EAI_AGAIN
558.5 npm ERR! request to https://registry.npmjs.org/test-exclude/-/test-exclude-6.0.0.tgz failed, reason: getaddrinfo EAI_AGAIN registry.npmjs.org
Here's an error I got when manually entering the container shell and running npm ci
:
root@fe61caa23d72:/app# npm ci
npm WARN deprecated [email protected]: Please upgrade to version 7 or higher. Older versions may use Math.random() in certain circumstances, which is known to be problematic. See https://v8.dev/blog/math-random for details.
npm WARN deprecated [email protected]: Please upgrade to version 7 or higher. Older versions may use Math.random() in certain circumstances, which is known to be problematic. See https://v8.dev/blog/math-random for details.
npm WARN deprecated @types/[email protected]: This is a stub types definition. sharp provides its own type definitions, so you do not need this installed.
npm notice
npm notice New minor version of npm available! 10.2.3 -> 10.3.0
npm notice Changelog: https://github.com/npm/cli/releases/tag/v10.3.0
npm notice Run npm install -g [email protected] to update!
npm notice
npm ERR! code EHOSTUNREACH
npm ERR! syscall connect
npm ERR! errno EHOSTUNREACH
npm ERR! request to https://registry.npmjs.org/async/-/async-2.6.4.tgz failed, reason: connect EHOSTUNREACH 104.16.25.34:443
This is the complete npm log of this run: npm-EHOSTUNREACH.log.
However, when I manually ran it later, it worked:
root@5e7e6a9d2f90:/app# npm ci
npm WARN deprecated @types/[email protected]: This is a stub types definition. sharp provides its own type definitions, so you do not need this installed.
npm WARN deprecated [email protected]: Please upgrade to version 7 or higher. Older versions may use Math.random() in certain circumstances, which is known to be problematic. See https://v8.dev/blog/math-random for details.
npm WARN deprecated [email protected]: Please upgrade to version 7 or higher. Older versions may use Math.random() in certain circumstances, which is known to be problematic. See https://v8.dev/blog/math-random for details.
added 1423 packages, and audited 1424 packages in 43s
224 packages are looking for funding
run `npm fund` for details
found 0 vulnerabilities
Expected Behavior
The whole container build should take about 50 seconds, as it does with OrbStack and Docker Desktop.
Additional Information
No response
Rancher Desktop Version
1.12.1
Rancher Desktop K8s Version
deactivated
Which container engine are you using?
moby (docker cli)
What operating system are you using?
macOS
Operating System / Build Version
macOS Sonoma 14.3
What CPU architecture are you using?
x64
Linux only: what package format did you use to install Rancher Desktop?
None
Windows User Only
No response
This might be related to #6358, which also appears to occur due to network errors in the container.
I can reproduce this behaviour with QEMU. I tried giving the VM more resources by running with 8 CPUs and 16GB of memory, but the result was unchanged.
It seems to work fine when using VZ emulation (and virtiofs
for the volume sharing), but the build tooks still over 150s instead of the expected 50s:
$ npm run dev
> dev
> docker compose up --build
[+] Building 153.2s (11/11) FINISHED docker:rancher-desktop
=> [payload internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [payload internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 219B 0.0s
=> [payload internal] load metadata for docker.io/library/node:18-bullseye-slim 0.9s
=> [payload auth] library/node:pull token for registry-1.docker.io 0.0s
=> [payload internal] load build context 0.0s
=> => transferring context: 146B 0.0s
=> [payload 1/5] FROM docker.io/library/node:18-bullseye-slim@sha256:f06f9ff9dcc5cfec936a7bf08580c5d9cf0f424fbbd857d36 0.0s
=> CACHED [payload 2/5] WORKDIR /app 0.0s
=> CACHED [payload 3/5] COPY package-lock.json . 0.0s
=> CACHED [payload 4/5] COPY package.json . 0.0s
=> [payload 5/5] RUN npm ci 149.7s
=> [payload] exporting to image 2.6s
=> => exporting layers 2.6s
=> => writing image sha256:7dfde214f314799120a5e9894e9188db49ce7130b2f5077a070eab65ffc8e0cd 0.0s
=> => naming to docker.io/library/mcve-rancher-npm-hang-payload 0.0s
[+] Running 2/2
✔ Network mcve-rancher-npm-hang_default Created 0.0s
✔ Container mcve-rancher-npm-hang-payload-1 Created 0.0s
Attaching to payload-1
I can reproduce this behaviour with QEMU. I tried giving the VM more resources by running with 8 CPUs and 16GB of memory, but the result was unchanged.
I forgot to mention: the Docker socket forwarding seems to have stopped while running npm run dev
with QEMU, so the result is essentially just a timeout after 5min.
I could still get a shell inside the VM, and the moby engine was still running, but the image had not yet been built.
It looks like buildkit
is not properly enabled by default in Rancher Desktop; enabling it explicitly in the compose file brings the build time down to 31s (after a factory reset, but again using VZ and virtiofs
):
--- docker-compose.yml
+++ docker-compose.yml
@@ -3,5 +3,7 @@ services:
payload:
build:
context: ./
+ args:
+ DOCKER_BUILDKIT: 1
ports:
- 3000:3000
and
$ npm run dev
> dev
> docker compose up --build
[+] Building 37.7s (11/11) FINISHED docker:rancher-desktop
=> [payload internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [payload internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 219B 0.0s
=> [payload internal] load metadata for docker.io/library/node:18-bullseye-slim 1.6s
=> [payload auth] library/node:pull token for registry-1.docker.io 0.0s
=> [payload internal] load build context 0.0s
=> => transferring context: 1.11MB 0.0s
=> [payload 1/5] FROM docker.io/library/node:18-bullseye-slim@sha256:f06f9ff9dcc5cfec936a7bf08580c5d9cf0f424fbbd857d36 2.3s
=> => resolve docker.io/library/node:18-bullseye-slim@sha256:f06f9ff9dcc5cfec936a7bf08580c5d9cf0f424fbbd857d3677cff575 0.0s
=> => sha256:5ae20c369ec182f5275f874054411f9bfb3f8132ff52a9c1d08d4ae15494d01c 30.06MB / 30.06MB 0.8s
=> => sha256:01274fed796f4fcde21eb139c21923f0577bf03c6aca3de5fb6cca7997c026c7 4.19kB / 4.19kB 0.2s
=> => sha256:66f24c6ac6abeb092db3801126aeedab6af5fb7289ad2c67fdf187e36b658e1d 38.67MB / 38.67MB 0.7s
=> => sha256:f06f9ff9dcc5cfec936a7bf08580c5d9cf0f424fbbd857d3677cff575ff340db 1.21kB / 1.21kB 0.0s
=> => sha256:9440185cd6578a314ca5d32b315bea09d55c2050442590ace64c121170c009e8 1.37kB / 1.37kB 0.0s
=> => sha256:5b233b1f1a3b8a53bebaf24ca9d537508f23e4a1f5bf6d77e5aa334e8b42a7fc 7.64kB / 7.64kB 0.0s
=> => sha256:e8d32e99fce4825fc4fa6badbf7514943681c2fc1bcdab3f10c0b3047541122b 2.70MB / 2.70MB 0.5s
=> => sha256:1a8818020ea28916649820eb93d259bce8ee9113797d5d2dbbbc6378612fabe1 448B / 448B 0.6s
=> => extracting sha256:5ae20c369ec182f5275f874054411f9bfb3f8132ff52a9c1d08d4ae15494d01c 0.6s
=> => extracting sha256:01274fed796f4fcde21eb139c21923f0577bf03c6aca3de5fb6cca7997c026c7 0.0s
=> => extracting sha256:66f24c6ac6abeb092db3801126aeedab6af5fb7289ad2c67fdf187e36b658e1d 0.8s
=> => extracting sha256:e8d32e99fce4825fc4fa6badbf7514943681c2fc1bcdab3f10c0b3047541122b 0.1s
=> => extracting sha256:1a8818020ea28916649820eb93d259bce8ee9113797d5d2dbbbc6378612fabe1 0.0s
=> [payload 2/5] WORKDIR /app 0.1s
=> [payload 3/5] COPY package-lock.json . 0.0s
=> [payload 4/5] COPY package.json . 0.0s
=> [payload 5/5] RUN npm ci 31.0s
=> [payload] exporting to image 2.7s
=> => exporting layers 2.7s
=> => writing image sha256:7e939011152059397661a7baef53465e78b9132d2f0593b2852cecbee51da2cf 0.0s
=> => naming to docker.io/library/mcve-rancher-npm-hang-payload 0.0s
[+] Running 2/2
✔ Network mcve-rancher-npm-hang_default Created 0.0s
✔ Container mcve-rancher-npm-hang-payload-1 Created 0.0s
Attaching to payload-1
^CGracefully stopping... (press Ctrl+C again to force)
I did one more test with VZ emulation, but still using reverse-sshfs
for the volume sharing. It worked the same way as using virtiofs
, so the problem seems to be due to QEMU.
In summary there are 2 issues:
- the docker socket forwarding gets disconnected with QEMU (breaking)
- buildkit is not configured correctly to be used by default (slowdown)
The workaround for now is to use VZ emulation and configure buildkit explicitly.
Let's use #1541 to deal with the buildkit configuration and keep this issue for dealing with the broken docker.sock
when using QEMU.
It looks like
buildkit
is not properly enabled by default in Rancher Desktop;
This turned out to be wrong. The reason that buildkit was not being used was because I still had a custom builder installed from investigating a different issue, and that builder was also made the default:
$ docker buildx ls
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
amd64builder * docker-container
amd64builder0 rancher-desktop running v0.12.5 linux/arm64, linux/amd64, linux/amd64/v2
default docker
default default running v0.10.6+d52b2d5 linux/arm64, linux/amd64, linux/amd64/v2
rancher-desktop docker
rancher-desktop rancher-desktop running v0.10.6+d52b2d5 linux/arm64, linux/amd64, linux/amd64/v2
The builder config is not deleted by a factory reset (I've created #6408 to change this), so it affected my testing for this issue as well by running the build inside a container.
After deleting the custom builder, the docker compose up --build
worked normally and compled the npm ci
command in under 40s.
Deleting the builder did not help with the QEMU issue though; it would still lose the socket connection and eventually time out.
I am having the same issue. Using QEMU the my Docker crashes. Using VZ at least install finishes but it is taking a very long time. Running with npm install --loglevel verbose
i can see that the HTTP GET calls are taking progresivelly longer ang longer (eventually each GET is more than 1 minute).
I'm having this issue using Mac Os M1 Sonoma 14.1.1. Rancher Desktop using QEMU virtualization used to work 😢
I'm having the same issue on Mac OS 14.4.1 with intel core. Earlier docker build with npm install used to work.
Is there any update here? We too are seeing users running into this issue where with Docker Desktop the builds would be 1/10 the time.
Is there any update here?
No update, unfortunately. Nobody has had time to work on it yet.
@jandubois Downgrading Rancher Desktop will not solve the problem. A factory reset led me to the same problem and critical blocked. The Rancher desktop is not linked to the strict tested version of Qemu, but on factory reset it downloads the latest?
Intel Mac i7, MacOs Monterey 12.7.4