Apple Silicon Support for local development

Open kevin-shelaga opened this issue 4 years ago • 17 comments

Is your feature request related to a problem? Please describe. Apple Silicon Support for local development

Currently the gateway proxy fails to start and crashloops

[2021-10-08 23:04:50.803][10][critical][assert] [external/envoy/source/common/signal/signal_action.cc:62] assert failure: sigaltstack(&stack, &previous_altstack_) == 0.

kevin-shelaga avatar Oct 13 '21 21:10 kevin-shelaga

K3d does not work on M1 chips

chrisgaun avatar Oct 15 '21 15:10 chrisgaun

Docker desktop for apple silicon 4.3.0 seems to fix this. Closing.

kevin-shelaga avatar Dec 03 '21 20:12 kevin-shelaga

Reopening, as local devs who are onboarding continue to hit issues here (e.g. building our test assets and running kind load).

the mesh team has already solved a lot of this pain and we can copy a lot of their solution

kdorosh avatar Mar 28 '22 18:03 kdorosh

Do you think publishing arm builds of glooctl would be an example that falls under this umbrella, or is the definition of done here scoped to having workarounds for any/every common dev task?

nfuden avatar Mar 28 '22 18:03 nfuden

The immediate concern is to unblock new hires onboarding to the team, but I wouldn't consider this issue done until we handle both cases

kdorosh avatar Mar 28 '22 20:03 kdorosh

Unfortunately, the issue is a bit different from mesh because of how gloo edge builds its proxy container. In mesh we were lucky enough not to have to build off of any x86 images. Therefore I think we should probably open an issue in envoy-gloo to also build/release arm builds from there.

EItanya avatar Mar 28 '22 22:03 EItanya

I'm curious what issues people are running into with M1 macs. I've been building/running x86 and arm based containers without issue on my M1 Max.

kevin-shelaga avatar Mar 29 '22 00:03 kevin-shelaga

I too would appreciate extra clarity here, but I believe it mostly pertains to building release assets locally and using them in local testing. Some extra thoughts can be found here https://soloio.slab.com/posts/m-1-local-development-intro-il9oevhq

kdorosh avatar Mar 29 '22 02:03 kdorosh

When building the docker images in solo-projects for M1 chips please reference changes made in the following solo-projects branch M1-fix-for-docker-image.

  • Added `--platform=linux/amd64` to FROM statements in go
    • Solved error for -m64 not defined.
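
As a rough illustration of the change described above, here is a minimal Go sketch that adds a `--platform` flag to every `FROM` line of a Dockerfile that does not already have one. The helper name `pinPlatform` and the sample Dockerfile contents are hypothetical and are not taken from the solo-projects branch; the branch itself may have made the edits by hand.

```go
package main

import (
	"fmt"
	"strings"
)

// pinPlatform returns the Dockerfile text with "--platform=<platform>" added
// to each FROM instruction that does not already specify a platform.
// Hypothetical helper for illustration only.
func pinPlatform(dockerfile, platform string) string {
	lines := strings.Split(dockerfile, "\n")
	for i, line := range lines {
		fields := strings.Fields(line)
		// Skip anything that is not a FROM instruction with an image argument.
		if len(fields) < 2 || !strings.EqualFold(fields[0], "FROM") {
			continue
		}
		// Leave FROM lines that already pin a platform untouched.
		if strings.HasPrefix(fields[1], "--platform=") {
			continue
		}
		lines[i] = "FROM --platform=" + platform + " " + strings.Join(fields[1:], " ")
	}
	return strings.Join(lines, "\n")
}

func main() {
	// Sample Dockerfile; the image names are placeholders.
	in := "FROM golang:1.18 AS build\nRUN go build ./...\nFROM alpine\n"
	fmt.Print(pinPlatform(in, "linux/amd64"))
}
```

Pinning `linux/amd64` means the image runs under emulation on an M1 machine, which trades build speed for compatibility with x86-only toolchains.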

jackstine avatar Apr 12 '22 19:04 jackstine

  1. envoy-gloo-ee
    1. Update the image in cloudbuilders
    2. Update the build image envoyproxy/envoy-build-ubuntu so that it is built for arm64. This image will be used here
    3. We will need an arm64 build of the image gcr.io/$PROJECT_ID/envoy-build-ubuntu
    4. The build path ./linux/amd64/build_envoy_release_stripped/envoy needs to depend on the arch; for arm64 it will be an arm64 path, so make this dynamic
    5. We need an arm64 version of frolvlad/alpine-glibc. Currently none exists, so we will have to build one here. This will apply to the following file
      1. This does throw errors when building, but finishes
    6. Update the ENVOY-IMAGE tags in s-p to allow for both arm64 and amd64
  2. envoy-gloo
    1. Update the build process, similar to the cloudbuild.yaml
    2. Update the image frolvlad/alpine-glibc
    3. Make the build path dynamic, similar to envoy-gloo-ee
jackstine avatar Apr 28 '22 20:04 jackstine

Where are we actually going to build the binaries? That will determine the majority of the work. Does GCP support ARM now?

EItanya avatar Apr 28 '22 20:04 EItanya

No, it does not look like it. None of the compute options listed here are ARM, and I haven't found any from GCP. They might announce it soon; Google I/O is in May?

Also, building the base image frolvlad/alpine-glibc causes problems. Aaron and I have found a workaround using ubuntu as the base image for now.

jackstine avatar Apr 28 '22 23:04 jackstine

Here are a few things to do for this epic

  • [ ] Find suitable image replacement for envoy-gloo base image frolvlad/alpine-glibc to support ARM
  • [ ] Replace all images in CI build workflow to support ARM; this can be done by simply building them on an ARM machine
  • [ ] CI build workflow
  • [x] Fix tests in s-p with supported ARM images.
  • [x] Some images used in testing are not supported on ARM, e.g. http-echo
  • [x] Fix tests in gloo with supported ARM images.
  • [ ] Add support for Gloo-Fed
  • [ ] Add support for FIPS
  • [ ] Add support for Ext-Auth Plugin image
  • [ ] Fix e2e and regression tests
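
For the checklist items about finding images that support ARM, one possible check is to look at an image's manifest list for a `linux/arm64` entry. The Go sketch below assumes the JSON comes from something like `docker manifest inspect <image>` and follows the standard manifest-list layout (`manifests[].platform`); the function name `supportsPlatform` and the sample JSON are made up for illustration. A single-architecture image has no `manifests` array, so this check reports false for it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// manifestList models only the manifest-list fields needed for this check.
type manifestList struct {
	Manifests []struct {
		Platform struct {
			Architecture string `json:"architecture"`
			OS           string `json:"os"`
		} `json:"platform"`
	} `json:"manifests"`
}

// supportsPlatform reports whether the manifest list contains an entry
// for the given OS and architecture. Hypothetical helper.
func supportsPlatform(manifestJSON []byte, goos, arch string) (bool, error) {
	var ml manifestList
	if err := json.Unmarshal(manifestJSON, &ml); err != nil {
		return false, err
	}
	for _, m := range ml.Manifests {
		if m.Platform.OS == goos && m.Platform.Architecture == arch {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Made-up sample of a manifest list with only an amd64 entry.
	sample := []byte(`{"manifests":[{"platform":{"architecture":"amd64","os":"linux"}}]}`)
	ok, err := supportsPlatform(sample, "linux", "arm64")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("linux/arm64 supported:", ok)
}
```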

jackstine avatar May 03 '22 15:05 jackstine

Here is a list of outstanding failures that occur in s-p

  • KUBE2E_TESTS=gateway make run-ci-regression-tests
    1. [Fail] Installing gloo in gateway mode [It] can route request to upstream
    2. dlp tests xslt transformer [It] will transform xml -> json

 Message: "admission webhook \"gloo.gloo-system.svc\" denied the request: resource incompatible with current Gloo snapshot: [Validating v1.VirtualService failed: validating *v1.VirtualService name:\"vs\" namespace:\"gloo-system\": failed to validate Proxy with Gloo validation server: VirtualHost Error: ProcessingError. Reason: invalid virtual host [gloo-system_vs]: envoy validation mode output: Caught Segmentation fault, suspect faulting address 0xd0\nBacktrace (use tools/stack_decode.py to get line numbers):\nEnvoy version: e81851c7ba191e99ad4a9e13dfea1f7af42b7323/1.21.1/Distribution/RELEASE/BoringSSL\n#0: [0x4005f6c930]\n, error: signal: segmentation fault]",
  • almost all e2e tests

resource for BoringSSL

jackstine avatar May 05 '22 17:05 jackstine

Here is a list of outstanding failures that occur in gloo.

  • KUBE2E_TESTS=glooctl make run-ci-regression-tests
"[_output/glooctl-darwin-amd64 istio uninject --namespace gloo-system --include-upstreams true] failed: Error: istio uninject can only be run when both the sds and istio-proxy sidecars are present on the gateway-proxy pod\n",
  • KUBE2E_TESTS=helm make run-ci-regression-tests
E0505 13:45:50.362259   96093 portforward.go:406] an error occurred forwarding 60906 -> 9091: error forwarding port 9091 to pod 0ee8e81cc58c36b97883e10adb77cb18ab102a48c326134ee01eb0e57d69a50b, uid : failed to execute portforward in network namespace "/var/run/netns/cni-1a6947dd-c169-dc99-5e6c-29460769424f": failed to connect to localhost:9091 inside namespace "0ee8e81cc58c36b97883e10adb77cb18ab102a48c326134ee01eb0e57d69a50b", IPv4: dial tcp4 127.0.0.1:9091: connect: connection refused IPv6 dial tcp6 [::1]:9091: connect: connection refused 
E0505 13:45:50.362646   96093 portforward.go:234] lost connection to pod

jackstine avatar May 05 '22 18:05 jackstine

  1. envoy-gloo-ee
    1. Update the image in cloudbuilders
    2. Update the build image envoyproxy/envoy-build-ubuntu so that it is built for arm64. This image will be used here
    3. We will need an arm64 build of the image gcr.io/$PROJECT_ID/envoy-build-ubuntu
    4. The build path ./linux/amd64/build_envoy_release_stripped/envoy needs to depend on the arch; for arm64 it will be an arm64 path, so make this dynamic
    5. We need an arm64 version of frolvlad/alpine-glibc. Currently none exists, so we will have to build one here. This will apply to the following file
      1. This does throw errors when building, but finishes
    6. Update the ENVOY-IMAGE tags in s-p to allow for both arm64 and amd64
  2. envoy-gloo
    1. Update the build process, similar to the cloudbuild.yaml
    2. Update the image frolvlad/alpine-glibc
    3. Make the build path dynamic, similar to envoy-gloo-ee

your links are all broken..

fabioaraujopt avatar Jul 11 '22 15:07 fabioaraujopt

Adding outstanding work on this. This has popped up in recent weeks.

jackstine avatar Aug 01 '22 17:08 jackstine

Team meeting to summarize current issues - https://docs.google.com/document/d/16HZRkq-y3sq7olz8WCoYc2phdUMjyTraTh2R5fBq2Bo/edit?usp=sharing

ianmacclancy avatar Aug 17 '22 18:08 ianmacclancy

The linked PRs have merged into the main branches on Gloo OSS and Gloo Enterprise. Now developers should be able to follow the same local development process, regardless of their machine. I'm moving this back to backlog, as the additional Apple Silicon support for all images will require further effort but is not something that is needed right now.

sam-heilbron avatar Apr 19 '23 14:04 sam-heilbron