
ci: add multiarch testing

Open · ttshivers opened this issue 3 years ago · 11 comments

Adds multiarch testing via GitHub Actions.

Some tests do fail because they are also failing in production, like the alpine3.12 builds and the Node.js 12-14 i386 builds. I know i386 isn't officially supported anymore in those versions, so perhaps allowing those failures is in order?

In writing the GitHub Actions workflows, I realized that all the versions and variants could be written into one big testing matrix (since they can be dynamic) in a single file, if that's desirable, rather than having almost-duplicated files for each version x variant. I could also write up a proof-of-concept PR to show what that would look like.

ttshivers avatar Sep 23 '20 03:09 ttshivers

There seem to be some spurious build failures with buster/buster-slim, only on arm32v7, with SSL certificate errors like:

#7 99.25 + curl -fsSLO --compressed https://nodejs.org/dist/v14.11.0/node-v14.11.0-linux-armv7l.tar.xz
#7 99.49 curl: (60) SSL certificate problem: unable to get local issuer certificate
#7 99.50 More details here: https://curl.haxx.se/docs/sslcerts.html
#7 99.50 
#7 99.50 curl failed to verify the legitimacy of the server and therefore could not
#7 99.50 establish a secure connection to it. To learn more about this situation and
#7 99.50 how to fix it, please visit the web page mentioned above.

I am unsure why they are failing here but not in production; I will investigate.

ttshivers avatar Sep 23 '20 04:09 ttshivers

Thinking about how to make this even cleaner:

One possibility is to change the update.sh to generate a single action file that has one big matrix like:

{
  "include": [
    { "version": "10",  "variant": "alpine3.11", "arch": "amd64" },
    { "version": "10",  "variant": "alpine3.11", "arch": "i386" },
  ]
}

It's also possible to replace all the generated test actions with a single file that automatically detects the supported versions, variants, and arches and generates a dynamic matrix at runtime, so nothing has to be hardcoded and updated manually when any of those change.
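For illustration, a generation script along those lines might look something like this Node.js sketch (the directory layout and the per-variant arch list are assumptions here, not the repo's actual metadata):

// generate-matrix.js -- illustrative sketch only.
'use strict';
const fs = require('fs');
const path = require('path');

// Assumption: the repo root has one directory per Node.js version
// (e.g. "10", "12", "14"), each containing one subdirectory per variant
// (e.g. "alpine3.11", "buster-slim") with a Dockerfile inside.
const root = __dirname;

// Assumed arch list; the real per-variant arches would come from the repo's metadata.
const defaultArches = ['amd64', 'arm32v6', 'arm32v7', 'arm64v8', 'i386'];

const include = [];
for (const version of fs.readdirSync(root)) {
  if (!/^\d+$/.test(version)) continue; // skip non-version directories
  for (const variant of fs.readdirSync(path.join(root, version))) {
    if (!fs.existsSync(path.join(root, version, variant, 'Dockerfile'))) continue;
    for (const arch of defaultArches) {
      include.push({ version, variant, arch });
    }
  }
}

// A workflow job can capture this output and feed it to a matrix strategy via fromJson().
console.log(JSON.stringify({ include }));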

ttshivers avatar Sep 24 '20 02:09 ttshivers

Just restarting the failing jobs now. I think the approach is interesting, but I do have a concern about resource exhaustion with the 100+ jobs. I'm not sure if the 20-job limit is per repo or per org.

I am also unsure how those limits apply. I can't easily find any documentation that clarifies this, so I'll reach out to GitHub and ask.

Edit: Found a forum post stating "You can execute up to 20 workflows concurrently per repository"

ttshivers avatar Sep 24 '20 02:09 ttshivers

One possibility is to change the update.sh to generate a single action file that has one big matrix like:

I originally split them into separate files so that jobs only get triggered when a particular job has changed. If you think that could be dynamic, that might be interesting though.

nschonni avatar Sep 24 '20 02:09 nschonni

One possibility is to change the update.sh to generate a single action file that has one big matrix like:

I originally split them into separate files so that jobs only get triggered when a particular job has changed. If you think that could be dynamic, that might be interesting though.

I think I could make it work that way by examining the list of changed files in the metadata and building the matrix dynamically based on which images changed. If no relevant files were changed, it would just leave the matrix empty. I'd likely write the dynamic runtime matrix-generation logic in a Node.js script, since that's easy to run and integrate with GitHub Actions, and bash is quite annoying. This option seems promising and not too difficult, especially since part of the logic is already done in the existing bash scripts.
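As a rough sketch of that idea, the script could take the changed file list (say, from git diff --name-only) on stdin and emit matrix entries only for the affected images; the path pattern and arch list here are assumptions, not the repo's real metadata:

// changed-matrix.js -- illustrative sketch only.
'use strict';
const fs = require('fs');

// Assumption: changed file paths arrive one per line on stdin,
// e.g. piped from `git diff --name-only origin/main...HEAD`.
const changed = fs.readFileSync(0, 'utf8').split('\n').filter(Boolean);

const seen = new Set();
const include = [];
for (const file of changed) {
  // Assumed layout: paths like "14/buster-slim/Dockerfile" map to (version, variant).
  const match = file.match(/^(\d+)\/([^/]+)\//);
  if (!match) continue; // unrelated file, ignore
  const [, version, variant] = match;
  const key = `${version}/${variant}`;
  if (seen.has(key)) continue;
  seen.add(key);
  for (const arch of ['amd64', 'arm32v7', 'arm64v8', 'i386']) { // assumed arch list
    include.push({ version, variant, arch });
  }
}

// An empty include list means nothing image-related changed, so no test jobs run.
console.log(JSON.stringify({ include }));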

Another option, or a combination, is to take advantage of GitHub Actions caching and Docker layer caching, which should reuse layers from previous runs that haven't changed.

ttshivers avatar Sep 24 '20 03:09 ttshivers

You could also consider using Travis just for multi-arch testing (https://docs.travis-ci.com/user/multi-cpu-architectures/), which would allow the builds to be native instead of emulated.

tianon avatar Sep 24 '20 16:09 tianon

You could also consider using Travis just for multi-arch testing (https://docs.travis-ci.com/user/multi-cpu-architectures/), which would allow the builds to be native instead of emulated.

Building natively would certainly be much faster. Travis supports a few architectures but is missing arm32v6, arm32v7, and i386.

ttshivers avatar Sep 25 '20 01:09 ttshivers

Both those arm32 arches can build on the Graviton instances, and i386 can build on any amd64.

tianon avatar Sep 25 '20 01:09 tianon

Both those arm32 arches can build on the Graviton instances, and i386 can build on any amd64.

Ah ok. That's certainly a valid option then. One note is that testing was moved away from Travis CI in #1194.

ttshivers avatar Sep 25 '20 01:09 ttshivers

Now that the generated build matrix landed, does this get easier?

nschonni avatar Oct 16 '20 03:10 nschonni

Now that the generated build matrix landed, does this get easier?

Yeah, I think it will be significantly simplified. The logic I've done in https://github.com/nodejs/docker-node/pull/1341 will also help with architecture parsing.

ttshivers avatar Oct 16 '20 03:10 ttshivers