
[buildx/arm64] yarn and npm not able to download packages from internet

Open voydz opened this issue 3 years ago • 35 comments

Hi there,

thanks for your awesome work with this project. I am experiencing issues using buildx for the arm64 arch. Everything runs smoothly until yarn or npm tries to download packages over the network.

This is a snippet from the build log of a run on GitHub Actions using yarn:

#22 [linux/arm64 build-deps 5/6] RUN yarn --production
#22 176.3 info There appears to be trouble with your network connection. Retrying...
#22 195.3 error An unexpected error occurred: "https://registry.yarnpkg.com/date-fns/-/date-fns-2.16.1.tgz: ESOCKETTIMEDOUT".

Running the same buildx build locally the output is a bit more concise:

 => [linux/arm64 build-deps 5/6] RUN yarn --production
 => => # Unknown QEMU_IFLA_INFO_KIND ipip                                                                                                
 => => # Unknown QEMU_IFLA_INFO_KIND ip6tnl                                                                                              
 => => # yarn install v1.22.5                                                                                                            
 => => # Unknown QEMU_IFLA_INFO_KIND ipip                                                                                                
 => => # Unknown QEMU_IFLA_INFO_KIND ip6tnl                                                                                              
 => => # [1/4] Resolving packages...                                                                                   
 => => # [2/4] Fetching packages...                                                                                                      
 => => # Unknown QEMU_IFLA_INFO_KIND ipip                                                                                                
 => => # Unknown QEMU_IFLA_INFO_KIND ip6tnl                                                                                              
 => => # info There appears to be trouble with your network connection. Retrying...  

The command used to start the build: docker buildx build --platform linux/amd64,linux/arm64 -t myrepo/myproject:latest --push client/

My Dockerfile looks like this: (Note that I do not copy node_modules into build context. The files to copy are managed using a .dockerignore file.)

# Stage 1 - the build process
FROM node:latest as build-deps
WORKDIR /usr/src/app

ARG REACT_APP_GRAPH_URL
ENV REACT_APP_GRAPH_URL=$REACT_APP_GRAPH_URL

COPY . ./
COPY .env.example .env

RUN yarn --production
RUN yarn build

# Stage 2 - the deployment
FROM nginx:latest
COPY --from=build-deps /usr/src/app/build /usr/share/nginx/html

COPY docker/default /etc/nginx/conf.d/default.conf

Workaround

As a workaround, I am currently using a .yarnrc to create an offline cache. I copy it into the build context and install from it with the yarn --offline option. That works for now, but I do not like it very much. https://classic.yarnpkg.com/blog/2016/11/24/offline-mirror/
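
For anyone wanting to try the same workaround, a rough sketch of the relevant Dockerfile stage follows. It assumes the mirror directory is called npm-packages-offline-cache (the name used in the linked Yarn blog post), that it was populated on the host before the build, and that it is not excluded by .dockerignore. The file layout is an assumption for illustration, not the exact project setup.

```dockerfile
# Sketch only: paths and the mirror directory name are assumptions.
FROM node:latest AS build-deps
WORKDIR /usr/src/app

# .yarnrc is assumed to contain the line:
#   yarn-offline-mirror "./npm-packages-offline-cache"
COPY package.json yarn.lock .yarnrc ./
COPY npm-packages-offline-cache ./npm-packages-offline-cache

# --offline tells yarn to install from the mirror without using the network
RUN yarn install --offline --production
```

The trade-off is that the tarballs have to be kept in sync with yarn.lock outside the image build, which is why this is only a stopgap.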

thank you again and best regards

voydz avatar Sep 23 '20 09:09 voydz

Are you sure your Docker is correctly configured to allow internet access for your containers?

LaurentGoderre avatar Sep 25 '20 20:09 LaurentGoderre

Hi @LaurentGoderre ,

As I wrote, the same holds true for running in GitHub Action environments. I already had this idea, so I verified the issue by running the build on other platforms. Also, in every environment amd64 builds just fine.

What else can I do to help with this issue?

Thanks and best regards

voydz avatar Sep 25 '20 20:09 voydz

@PeterDaveHello I noticed you added the yarn label to this issue. I also checked this using npm with the exact same behavior as yarn.

Currently I am trying to run some network diagnostic inside the container. If I stumble upon something useful, I will let you know.

best regards

voydz avatar Oct 21 '20 09:10 voydz

I think this is fairly reproducible. On my ubuntu machines, any Dockerfile that includes RUN npm .... will fail in this same way when either targeting an arm64 image, or running docker buildx build when the targets include an arm64 target.

However, I am not able to reproduce it on my MacBooks.

goshlanguage avatar Jan 09 '21 21:01 goshlanguage

Let me guess – you're not building on an actual arm64 runner node which means the arm64 build is generated using QEMU. Emulation is slow, therefore there's much higher risk of running into timeouts.

It may still be possible that there's an additional problem with the network connection which may be worth debugging (I do see ESOCKETTIMEDOUT in the logs), but I'm not sure if this is related to docker-node in any way.

JanJakes avatar Mar 15 '21 13:03 JanJakes

While this seems plausible, I think it's the common case that projects will build against an emulated environment. Would it not be an improvement to docker-node to perhaps update documentation in some way to call this out?

goshlanguage avatar Mar 15 '21 15:03 goshlanguage

I'm having a similar problem. I have also started an SO thread.

Sp4Rx avatar Apr 13 '21 12:04 Sp4Rx

Got the same issue when building an amd64 image with emulation on an arm instance.

vl-shopback avatar Sep 20 '22 04:09 vl-shopback

I had the same problem, but then I found this

docker run --privileged --rm tonistiigi/binfmt --install all

in the official documentation, which solved my problem. Further explanation can be found in the docs themselves.

MunsMan avatar Feb 13 '23 08:02 MunsMan

@MunsMan I have a couple of questions; it's no problem if you can't get back to me.

  1. In what way did it solve your problem? No more npm install or similar timeouts?
  2. How did your GHA yaml file look in terms of order of operations?
    • Is the docker/setup-qemu-action@v2 action redundant?

LiterallyDad avatar Feb 23 '23 20:02 LiterallyDad

@LiterallyWar Maybe I mixed up the issue, but I had a pretty similar problem. My cross-platform builds for arm always failed while running the npm install or yarn command. Currently, I'm unable to recreate the error. If I recall correctly, I didn't have any timeout-related output. It felt more like a platform emulation issue.

Regarding your second point, what do you mean by GHA? I should add that I haven't looked into it further, but I thought it might help someone.

MunsMan avatar Feb 24 '23 15:02 MunsMan

@MunsMan I was curious how your .yml action file was organized, i.e. is it qemu -> setup buildx -> build, and did you need the docker/setup-qemu-action@v2 action?

We got things to work but holy cow it is too slow - $$$

LiterallyDad avatar Feb 24 '23 15:02 LiterallyDad

@LiterallyWar I was just able to reproduce my Error:

 => [internal] load build context                                                                                                                                         0.0s
 => => transferring context: 779B                                                                                                                                         0.0s
 => [linux/amd64 2/7] WORKDIR /app                                                                                                                                        0.3s
 => [linux/amd64 3/7] COPY [package.json, package-lock.json, tsconfig.json, ./]                                                                                           0.0s
 => CANCELED [linux/amd64 4/7] RUN yarn                                                                                                                                   6.1s
 => [linux/arm64 2/7] WORKDIR /app                                                                                                                                        0.3s
 => [linux/arm64 3/7] COPY [package.json, package-lock.json, tsconfig.json, ./]                                                                                           0.0s
 => CANCELED [linux/arm64 4/7] RUN yarn                                                                                                                                   3.2s
 => [linux/arm/v7 2/7] WORKDIR /app                                                                                                                                       0.3s
 => [linux/arm/v7 3/7] COPY [package.json, package-lock.json, tsconfig.json, ./]                                                                                          0.0s
 => ERROR [linux/arm/v7 4/7] RUN yarn                                                                                                                                     0.1s
------
 > [linux/arm/v7 4/7] RUN yarn:
#0 0.128 exec /bin/sh: exec format error
------
Dockerfile:9
--------------------
   7 |     COPY ["package.json", "package-lock.json", "tsconfig.json", "./"]
   8 |
   9 | >>> RUN yarn
  10 |
  11 |     COPY ["./src", "./src"]
--------------------
ERROR: failed to solve: process "/bin/sh -c yarn" did not complete successfully: exit code: 1

After running docker run --privileged --rm tonistiigi/binfmt --install all it runs perfectly.

I'm using docker buildx for building the images, and it takes around 2 minutes including publishing the images. I don't use docker/setup-qemu-action@v2, as I don't use GitHub Actions.

Furthermore, I run it locally, and I'm using Colima as the daemon.

MunsMan avatar Feb 24 '23 17:02 MunsMan

awesome thank you so much for your reply and the information - it's the GHA runners, they can't emulate for crap with that action I posted or with your example straight from the docker docs

I really appreciate the help getting to the bottom of this

LiterallyDad avatar Feb 24 '23 20:02 LiterallyDad

@LiterallyWar This error can be reproduced with these docker files & GH workflows as well.

https://github.com/frappe/frappe_docker https://github.com/frappe/frappe_docker/actions/runs/4401435320/jobs/7707642818

ChillarAnand avatar Mar 13 '23 06:03 ChillarAnand

@Munsman I tried it, but it did not resolve the problem for me. Did you need to restart docker.service?

HendrixNguyen avatar Apr 11 '23 11:04 HendrixNguyen

@HendrixNguyen I'm sorry, sadly I'm not an expert on this topic, I just posted my finding while having the problem. If I recall correctly, I didn't restart the docker.service. I ran the command and reran the build command.

MunsMan avatar Apr 14 '23 11:04 MunsMan

I had the same problem, but then I found this

docker run --privileged --rm tonistiigi/binfmt --install all

in the official documentation, which solved my problem. Further explanation can be found in the docs themselves.

  • tried without cache
  • tried without qemu
  • tried with the docker run command mentioned above
  • using yarn install

not possible to run on github workflows

edit: Without anything else, this seems to fix it on our side https://github.com/JamesonRGrieve/Agent-LLM-Frontend/pull/14

localagi avatar May 17 '23 14:05 localagi

I hit the same issue on GitHub Actions CI:

name: Build and Push Image to Docker hub

on:
  push:
    branches: ['main']
  workflow_dispatch:

jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Set up QEMU
        uses: docker/setup-qemu-action@v2
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: usr/app:latest

Building and pushing the linux/amd64 image takes about 2-3 minutes, but the linux/arm64 build takes more than half an hour and is always stuck in the yarn install step.

sonofmagic avatar Jun 27 '23 13:06 sonofmagic

Any workaround? I'm stuck here too 🥹🥲

Robokishan avatar Jul 06 '23 19:07 Robokishan

i'm stuck at yarn install too..

yongsk0066 avatar Jul 25 '23 05:07 yongsk0066

same issue only on arm64


#40 48.59 [1/4] Resolving packages...
#40 92.80 [2/4] Fetching packages...
#40 129.2 info There appears to be trouble with your network connection. Retrying...
#40 162.7 info There appears to be trouble with your network connection. Retrying...
#40 174.4 info There appears to be trouble with your network connection. Retrying...

jun0tpyrc avatar Jul 27 '23 06:07 jun0tpyrc

We are also seeing this issue. Pretty annoying, does anyone have any workarounds for github actions or do I just need to find a different CI provider to run this build?

manziman avatar Aug 23 '23 15:08 manziman

We are also seeing this issue. Pretty annoying, does anyone have any workarounds for github actions or do I just need to find a different CI provider to run this build?

As per the linked PRs, increase your network timeout and be extremely patient if running under buildx/QEMU via GitHub Actions. :-)

Alternatively, I guess you could attach your own native arm64 runner to GitHub Actions, or use a third-party commercial service that offers arm64 runners and an easy way to bootstrap them into GitHub Actions (BuildJet, Depot, etc.). Or wait for https://github.com/actions/runner-images/issues/5631 :-)
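
To make the "increase your network timeout" option concrete, here is a minimal sketch of what it could look like in a Dockerfile. The timeout and retry values are illustrative rather than recommendations, the file names are assumed, and the npm variant assumes a reasonably recent npm that supports the fetch-retries and fetch-timeout config keys.

```dockerfile
# Sketch only: timeout and retry values are illustrative.
FROM node:18-alpine
WORKDIR /app
COPY package.json yarn.lock ./

# Yarn 1.x: allow up to 5 minutes (in milliseconds) per network request
RUN yarn install --frozen-lockfile --network-timeout 300000

# npm equivalent (assumes package-lock.json is copied instead of yarn.lock):
# RUN npm config set fetch-retries 5 \
#  && npm config set fetch-timeout 600000 \
#  && npm ci
```

This only stops the install from giving up early under QEMU; it does not make the emulated build any faster.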

chadlwilson avatar Aug 23 '23 15:08 chadlwilson

I tried all approaches, even building on an arm64 machine without buildx or QEMU. However, I kept receiving a timeout error.

SarveshMishra avatar Sep 16 '23 07:09 SarveshMishra

Config that worked for me to build the arm64 image with yarn on GitHub Actions:

Add this to your Dockerfile before RUN yarn install:

RUN yarn config set network-timeout 300000
RUN apk add g++ make py3-pip
RUN yarn global add node-gyp

RUN yarn install

  1. Increase the network timeout: yarn config set network-timeout 300000
  2. Add the additional build dependencies (g++, make, py3-pip)
  3. Install node-gyp

andrii33 avatar Oct 02 '23 23:10 andrii33

It also gets stuck at #14 [dependencies 2/5] RUN yarn config set network-timeout 300000 for node:20-alpine and linux/arm/v7. Super weird!

mohsenasm avatar Oct 26 '23 12:10 mohsenasm

Same here. Less than a minute locally vs. 8 minutes on average running in a Docker build step on GitHub Actions. I confirm we are building an arm64 image with the official Docker buildx action.

Also, no way to get it working without using --network-timeout 100000.

39otrebla avatar Nov 06 '23 23:11 39otrebla

@mohsenasm I agree with you. Looks like this affects node:20 (LTS). I have downgraded my Dockerfile to node:18-alpine3.19 and yarn install now works again.

clowa avatar Jan 06 '24 21:01 clowa

As per the linked PRs, increase your network timeout and be extremely patient if running under buildx/QEMU via GitHub Actions. :-)

I would have NEVER imagined such a difference in build time on a Xeon V4 server:

amd64

  • npm install --> added 1285 packages in 2m
  • npm build --> built in 29.61s

arm64 w/ QEMU

  • npm install --> added 1285 packages in 28m
  • npm build --> built in 21m 59s

This is (or should be) unacceptable by any standard.

One way to approach this is by splitting the Dockerfile into build stages, where the npm install and build are done only on the native arch and only the final image is created per target with multi-arch --platform.
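
A rough sketch of that split, adapted from the Dockerfile in the original post, might look like the following. It relies on BuildKit's automatic BUILDPLATFORM argument and assumes the build output is platform-independent (static files served by nginx). If the final image needs native Node modules at runtime, this approach would not apply as-is. The node:18 tag and the file names are assumptions.

```dockerfile
# Stage 1: pinned to the builder's native platform, so yarn never runs under QEMU.
# BUILDPLATFORM is provided automatically by BuildKit/buildx.
FROM --platform=$BUILDPLATFORM node:18 AS build-deps
WORKDIR /usr/src/app
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile
COPY . ./
RUN yarn build

# Stage 2: built once per --platform target; it only copies files,
# so no emulated code is executed here.
FROM nginx:latest
COPY --from=build-deps /usr/src/app/build /usr/share/nginx/html
COPY docker/default /etc/nginx/conf.d/default.conf
```

With docker buildx build --platform linux/amd64,linux/arm64, the first stage should then run once on the native architecture and be shared by both target images.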

razvanphp avatar Mar 01 '24 08:03 razvanphp