node-feature-discovery icon indicating copy to clipboard operation
node-feature-discovery copied to clipboard

Add support for arm64, armV7, armV8

Open utegental opened this issue 4 years ago • 43 comments

What would you like to be added: There is no arm support. It would be good to see support this for those architectures.

Why is this needed: Getting "standard_init_linux.go:219: exec user process caused: exec format error" for pods scheduled on arm nodes. Usually it's easy to do - just making image multi-architecture =)

utegental avatar Jan 21 '21 09:01 utegental

I agree. There have been attempts in the past e.g. #203 and #327 at least. Now that we have K8s test-infra (gcb as the image builder) as the CI pipeline this might be doable

marquiz avatar Jan 21 '21 14:01 marquiz

I would be interested in this as well. Can you elaborate on what kind of work is required to make this happen? From the previous PRs, it doesn't seem that a great deal of code is required.

Diaoul avatar Feb 27 '21 15:02 Diaoul

One is currently being maintained in parity with upstream docker.io/raspbernetes/node-feature-discovery which is built for armv7 & arm64

xunholy avatar Mar 14 '21 11:03 xunholy

Great! Maybe this can be merged upstream?

Diaoul avatar Mar 14 '21 11:03 Diaoul

@marquiz We can easily do this with github actions

jobs:
  build_job:
    runs-on: ubuntu-latest
    name: Build on ${{ matrix.arch }}

    strategy:
      matrix:
        include:
          - arch: armv7
            distro: ubuntu20.04
          - arch: aarch64
            distro: ubuntu20.04
          - arch: s390x
            distro: ubuntu20.04
          - arch: ppc64le
            distro: ubuntu20.04
    steps:

zvonkok avatar Mar 16 '21 09:03 zvonkok

The only question is does gcr.io support manifest lists?

zvonkok avatar Mar 16 '21 09:03 zvonkok

@marquiz We can easily do this with github actions

Yeah, sure. But we use prow for building images

marquiz avatar Mar 16 '21 10:03 marquiz

Any updates on this?

anthr76 avatar Jun 13 '21 13:06 anthr76

@marquiz Are there any plans to progress with this issue? It looks like the main problem is not in updating the Makefile. This change is quite simple if buildx can be used. But, before creating a PR the CI pipelines need to be updated to support buildx. Do you know if it's possible and what help is required?

anta5010 avatar Jun 30 '21 12:06 anta5010

@marquiz Are there any plans to progress with this issue? It looks like the main problem is not in updating the Makefile. This change is quite simple if buildx can be used. But, before creating a PR the CI pipelines need to be updated to support buildx. Do you know if it's possible and what help is required?

Hmm, kubernetes test-infra might actually support docker buildx nowadays so we could do this. What needs to be done is

  • update Makefile
  • (probably) update scripts/test-infra/* to build images on all architectures
  • update docs (docs/get-started/deployment-and-usage.md and docs/advanced/developer-guide.md quickly come to mind)

Patches are welcome 😄

marquiz avatar Jul 07 '21 11:07 marquiz

Best example I could find is https://github.com/kubernetes-sigs/service-catalog/blob/master/Makefile#L343 but I am not big fan of cross compilation tho, so the question is, do we want to do it like that?

ArangoGutierrez avatar Aug 05 '21 01:08 ArangoGutierrez

https://kubernetes.slack.com/archives/C01A82FKSQK/p1628088075000500

ArangoGutierrez avatar Aug 05 '21 01:08 ArangoGutierrez

I am not big fan of cross compilation tho, so the question is, do we want to do it like that?

With docker buildx you shouldn't need that. Just something like docker buildx build --platform linux/arm64 . (haven't experimented this myself, though). Thus, I think we should be ok with relatively simple modifications to the Makefile

marquiz avatar Aug 09 '21 11:08 marquiz

https://github.com/kubernetes/test-infra/pull/22977 enables docker buildx on prow

ArangoGutierrez avatar Aug 09 '21 21:08 ArangoGutierrez

We need to watch the output of

  • https://github.com/kubernetes/test-infra/issues/16588 And wait for PR
  • https://github.com/kubernetes/test-infra/pull/23044 to be merged, once docker buildx support is fully supported, we can make the needed modifications to our Makefile

ArangoGutierrez avatar Aug 09 '21 22:08 ArangoGutierrez

https://hub.docker.com/r/raspbernetes/node-feature-discovery - these guys have done multi-arch support several months ago. I don't know how exactly that was done, but this container works without issues.

utegental avatar Aug 10 '21 05:08 utegental

I don't know how exactly that was done, but this container works without issues.

I think they use docker buildx: https://github.com/raspbernetes/multi-arch-images

marquiz avatar Aug 10 '21 06:08 marquiz

https://hub.docker.com/r/raspbernetes/node-feature-discovery is not a kubernetes-sigs repo, using prow, I think the main idea is to use prow as our official image builder right?

ArangoGutierrez avatar Aug 10 '21 13:08 ArangoGutierrez

If prow isn't sufficient to emulate other arch's and there isn't infra to natively build on other arch's why use it? Is there a requirement? I know of other sigs building multi-arch.

Other arch's exist and this helps adoption greatly <3

anthr76 avatar Aug 10 '21 21:08 anthr76

I think the main idea is to use prow as our official image builder right?

Yes. We use k8s test infra (prow) for CI and we're staying there as that tightly coupled with k8s container image hosting (k8s.gcr.io). Moreover, we want to serve all the container images from a single registry.

If prow isn't sufficient to emulate other arch's and there isn't infra to natively build on other arch's why use it? Is there a requirement? I know of other sigs building multi-arch.

I'm a bit lost here 🧐 As I commented earler, afaiu prow nowadays supports multiarch builds via docker buildx. And afaiu, Raspbernetes uses the same build method.

We just need a PR in NFD for enabling multiarch builds with docker buildx

marquiz avatar Aug 11 '21 06:08 marquiz

+1

This would be really handy to discover features on Jetson Xaviers

kmhaeren avatar Aug 17 '21 14:08 kmhaeren

FWIW, I build (local) multiarch (arm64/amd64) docker images of NFS using this commandline:

$ git checkout v0.9.0
$ make IMAGE_BUILD_CMD="docker buildx build --push --platform linux/amd64,linux/arm64" IMAGE_REGISTRY="docker.io/jonkerj"

Should not be too difficult to patch up the makefile to do this by default. Maybe it needs some var containing the supported platforms, it's a matter of taste.

jonkerj avatar Aug 23 '21 11:08 jonkerj

@jonkerj running your command returns

 Done setting up docker in docker.
+ WRAPPED_COMMAND_PID=174
+ wait 174
+ scripts/test-infra/build-image.sh
namespace: node-feature-discovery
image: k8s.gcr.io/nfd/node-feature-discovery:v0.10.0-devel-2-g58e147d
docker buildx build --platform linux/amd64,linux/arm64 --build-arg VERSION=v0.10.0-devel-2-g58e147d \
    --target full \
    --build-arg HOSTMOUNT_PREFIX=/host- \
    --build-arg BASE_IMAGE_FULL=debian:buster-slim \
    --build-arg BASE_IMAGE_MINIMAL=gcr.io/distroless/base \
    -t k8s.gcr.io/nfd/node-feature-discovery:v0.10.0-devel-2-g58e147d \
     \
     ./
error: multiple platforms feature is currently not supported for docker driver. Please switch to a different driver (eg. "docker buildx create --use")
make: *** [Makefile:70: image] Error 1 

ArangoGutierrez avatar Aug 23 '21 12:08 ArangoGutierrez

After splitting out the build by platform, is seems to work ok -> https://github.com/kubernetes-sigs/node-feature-discovery/pull/582/commits/cbd42af2b31c08545180dc456c8d256dba404972

ArangoGutierrez avatar Aug 23 '21 12:08 ArangoGutierrez

It could be a feature/limitation of your docker environment or buildx initialization, above invocation works for me (docker 20.10.8 / buildx v0.6.1 / moby/buildkit:buildx-stable-1)

jonkerj avatar Aug 23 '21 12:08 jonkerj

The Raspbernetes project automatically builds multi-arch images of several projects, including NFD. You could take a look at the workflow file, which essentially wraps docker buildx build

jonkerj avatar Aug 23 '21 13:08 jonkerj

Is not "my" environment, is kubernetes/test-infra prow environment, the one we use for all NFD CI and image building

ArangoGutierrez avatar Aug 23 '21 18:08 ArangoGutierrez

@ArangoGutierrez thanks for working on this! Where did you see that error? I'm looking at #582 and only see a mock-related error that should be easy to fix

marquiz avatar Aug 23 '21 18:08 marquiz

The first run failed, then I sort of fixed it and amended the commit

ArangoGutierrez avatar Aug 23 '21 18:08 ArangoGutierrez

Has there been any progress on this?

gclawes avatar Oct 22 '21 00:10 gclawes