
Native executable quarkus-amazon-lambda images built on Mac M1 Pro not working on AWS

Open bredlej opened this issue 1 year ago • 4 comments

Describe the bug

I have been struggling since yesterday to invoke an AWS Lambda using a native image built on a Mac laptop with the M1 Pro processor. I tried different Quarkus extensions: quarkus-amazon-lambda, quarkus-amazon-lambda-http and quarkus-funqy-amazon-lambda.

This bug description will use quarkus-amazon-lambda as an example of the behavior.

Expected behavior

I expect to be able to invoke the Lambda either from the AWS Console or with local tools such as ./target/manage.sh native invoke or sam local invoke --template target/sam.native.yaml --event payload.json

Actual behavior

The image was built with the command quarkus build --native --no-tests -Dquarkus.native.container-build=true and the Lambda was created with ./target/manage.sh native create. Invoking the Lambda, either from the AWS Console or with ./target/manage.sh native invoke, results in the following error:

Invoking function
++ aws lambda invoke response.txt --cli-binary-format raw-in-base64-out --function-name AwsLambdaTestNative --payload file://payload.json --log-type Tail --query LogResult --output text
++ base64 --decode
INIT_REPORT Init Duration: 4.33 ms	Phase: init	Status: error	Error Type: Runtime.InvalidEntrypoint
INIT_REPORT Init Duration: 2.56 ms	Phase: invoke	Status: error	Error Type: Runtime.InvalidEntrypoint
START RequestId: 3fe9e6da-5fd9-4b8a-8273-ff0a0288b3fe Version: $LATEST
RequestId: 3fe9e6da-5fd9-4b8a-8273-ff0a0288b3fe Error: fork/exec /var/task/bootstrap: exec format error
Runtime.InvalidEntrypoint
END RequestId: 3fe9e6da-5fd9-4b8a-8273-ff0a0288b3fe
REPORT RequestId: 3fe9e6da-5fd9-4b8a-8273-ff0a0288b3fe	Duration: 14.15 ms	Billed Duration: 15 ms	Memory Size: 256 MB	Max Memory Used: 3 MB
{"errorType":"Runtime.InvalidEntrypoint","errorMessage":"RequestId: 3fe9e6da-5fd9-4b8a-8273-ff0a0288b3fe Error: fork/exec /var/task/bootstrap: exec format error"}%

Running sam local invoke --template target/sam.native.yaml --event payload.json results in:

Invoking not.used.in.provided.runtime (provided)
Decompressing /Users/bredlej/Coding/java/aws/aws-lambda-test/target/function.zip
Local image is up-to-date
Using local image: public.ecr.aws/lambda/provided:alami-rapid-x86_64.

Mounting /private/var/folders/l4/ljjl5xms541dgwsynf8ykt300000gn/T/tmpm9quhj8k as /var/task:ro,delegated, inside runtime container
START RequestId: 7a3b9c76-fdcd-47c5-8d59-d6e6d95e4244 Version: $LATEST
09 Feb 2024 15:50:08,774 [ERROR] (rapid) Init failed error=fork/exec /var/task/bootstrap: no such file or directory InvokeID=
09 Feb 2024 15:50:08,783 [ERROR] (rapid) Invoke failed error=fork/exec /var/task/bootstrap: no such file or directory InvokeID=709a18a7-1132-4ba0-8347-a53377c33edb
09 Feb 2024 15:50:08,787 [ERROR] (rapid) Invoke DONE failed: Runtime.InvalidEntrypoint

Error: 'content-type'
Traceback:
  File "click/core.py", line 1078, in main
  File "click/core.py", line 1688, in invoke
  File "click/core.py", line 1688, in invoke
  File "click/core.py", line 1434, in invoke
  File "click/core.py", line 783, in invoke
  File "samcli/cli/cli_config_file.py", line 347, in wrapper
  File "click/decorators.py", line 92, in new_func
  File "click/core.py", line 783, in invoke
  File "samcli/lib/telemetry/metric.py", line 185, in wrapped
  File "samcli/lib/telemetry/metric.py", line 150, in wrapped
  File "samcli/lib/utils/version_checker.py", line 43, in wrapped
  File "samcli/cli/main.py", line 95, in wrapper
  File "samcli/commands/local/invoke/cli.py", line 104, in cli
  File "samcli/commands/local/invoke/cli.py", line 205, in do_cli
  File "samcli/commands/local/lib/local_lambda.py", line 147, in invoke
  File "samcli/lib/telemetry/metric.py", line 325, in wrapped_func
  File "samcli/local/lambdafn/runtime.py", line 213, in invoke
  File "samcli/local/docker/container.py", line 433, in wait_for_result
  File "samcli/lib/utils/retry.py", line 31, in wrapper
  File "samcli/local/docker/container.py", line 405, in wait_for_http_response
  File "requests/structures.py", line 52, in __getitem__

An unexpected error was encountered while executing "sam local invoke".
Search for an existing issue:
https://github.com/aws/aws-sam-cli/issues?q=is%3Aissue+is%3Aopen+Bug%3A%20sam%20local%20invoke%20-%20KeyError
Or create a bug report:
https://github.com/aws/aws-sam-cli/issues/new?template=Bug_report.md&title=Bug%3A%20sam%20local%20invoke%20-%20KeyError
Function 'AwsLambdaTestNative' timed out after 15 seconds

How to Reproduce?

  1. Create a Quarkus project at code.quarkus.io with only the quarkus-amazon-lambda extension selected.
  2. In the terminal, go into the downloaded Quarkus project's folder
  3. Run quarkus build --native --no-tests -Dquarkus.native.container-build=true
  4. Edit ./target/manage.sh and change the RUNTIME environment value to java17 (or java21)
  5. Run ./target/manage.sh native create
  6. Run ./target/manage.sh native invoke or
  7. Run sam local invoke --template target/sam.native.yaml --event payload.json or
  8. Log in to AWS Console/Lambda/Functions, select the newly created function and run a test with the following JSON:
{
  "name" : "Bred"
}

Output of uname -a or ver

Darwin MacBook-Pro-2 23.2.0 Darwin Kernel Version 23.2.0: Wed Nov 15 21:55:06 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6020 arm64

Output of java -version

openjdk version "21" 2023-09-19 OpenJDK Runtime Environment (build 21+35-2513) OpenJDK 64-Bit Server VM (build 21+35-2513, mixed mode, sharing)

Mandrel or GraalVM version (if different from Java)

MANDREL 23.1.2.0

Quarkus version or git rev

3.7.2

Build tool (ie. output of mvnw --version or gradlew --version)

Apache Maven 3.9.6 (bc0240f3c744dd6b6ec2920b3cd08dcc295161ae) Maven home: /Users/bredlej/.m2/wrapper/dists/apache-maven-3.9.6-bin/3311e1d4/apache-maven-3.9.6 Java version: 21, vendor: Oracle Corporation, runtime: /Users/bredlej/Library/Java/JavaVirtualMachines/openjdk-21/Contents/Home Default locale: pl_PL, platform encoding: UTF-8 OS name: "mac os x", version: "14.2.1", arch: "aarch64", family: "mac"

Additional information

Note that quarkus build generates a manage.sh file with RUNTIME=java11, which causes problems with class loading, so you have to manually set it to java17 or java21.

When doing a JVM build of the same project with

  1. quarkus build
  2. ./target/manage.sh create
  3. ./target/manage.sh invoke

The lambda works fine:

Invoking function
++ aws lambda invoke response.txt --cli-binary-format raw-in-base64-out --function-name AwsLambdaTest --payload file://payload.json --log-type Tail --query LogResult --output text
++ base64 --decode
__  ____  __  _____   ___  __ ____  ______
--/ __ \/ / / / _ | / _ \/ //_/ / / / __/
-/ /_/ / /_/ / __ |/ , _/ ,< / /_/ /\ \
--\___\_\____/_/ |_/_/|_/_/|_|\____/___/
2024-02-09 15:36:36,768 INFO  [io.quarkus] (main) aws-lambda-test 1.0.0-SNAPSHOT on JVM (powered by Quarkus 3.7.2) started in 1.068s.
2024-02-09 15:36:36,770 INFO  [io.quarkus] (main) Profile prod activated.
2024-02-09 15:36:36,771 INFO  [io.quarkus] (main) Installed features: [amazon-lambda, cdi]
START RequestId: 127d995b-b4bc-4e65-b9b0-9704d528eec1 Version: $LATEST
END RequestId: 127d995b-b4bc-4e65-b9b0-9704d528eec1
REPORT RequestId: 127d995b-b4bc-4e65-b9b0-9704d528eec1	Duration: 142.29 ms	Billed Duration: 143 ms	Memory Size: 256 MB	Max Memory Used: 129 MB	Init Duration: 1390.00 ms
"Hello Bill"%

Additionally here's my sam --info result:

{
  "version": "1.109.0",
  "system": {
    "python": "3.8.13",
    "os": "macOS-14.2.1-arm64-arm-64bit"
  },
  "additional_dependencies": {
    "docker_engine": "20.10.22",
    "aws_cdk": "Not available",
    "terraform": "Not available"
  },
  "available_beta_feature_env_vars": [
    "SAM_CLI_BETA_FEATURES",
    "SAM_CLI_BETA_BUILD_PERFORMANCE",
    "SAM_CLI_BETA_TERRAFORM_SUPPORT",
    "SAM_CLI_BETA_RUST_CARGO_LAMBDA"
  ]
}

It's possible I don't understand something with the native builds and creating AWS Lambda functions - if yes, kindly explain what I'm doing wrong.

bredlej avatar Feb 09 '24 16:02 bredlej

/cc @Karm (mandrel), @galderz (mandrel), @gastaldi (m1), @matejvasek (amazon-lambda,funqy), @patriot1burke (amazon-lambda,funqy), @zakkak (mandrel,native-image)

quarkus-bot[bot] avatar Feb 09 '24 16:02 quarkus-bot[bot]

@bredlej judging by the error message:

Error: fork/exec /var/task/bootstrap: exec format error

I suspect that what's happening is that you are building a native image targeting your Mac's AArch64 architecture and then you try to run it on AWS on an x86 machine.

Can you please run the following on your Mac?:

file <path-to-the-generated-binary>

And also report what AWS instance you are using for running the lambdas?

zakkak avatar Feb 12 '24 10:02 zakkak

@zakkak Thanks for responding!

You're right, the image is the following: target/aws-lambda-test-1.0.0-SNAPSHOT-runner: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 3.7.0, BuildID[sha1]=d6277eaab2d4c70f1e5b1cec62615cb5bf169538, not stripped
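The same check can be done without the file utility by reading the ELF header directly. Below is a minimal Python sketch, not part of Quarkus or the AWS tooling; the function name elf_machine is hypothetical. It assumes the standard ELF layout, where byte 5 of the header gives the byte order and the 16-bit e_machine field sits at offset 18 (0x3E for x86-64, 0xB7 for AArch64).

```python
import struct

# e_machine values defined by the ELF specification
ELF_MACHINES = {0x3E: "x86_64", 0xB7: "aarch64"}


def elf_machine(path):
    """Return the CPU architecture an ELF binary was built for."""
    with open(path, "rb") as f:
        header = f.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError(f"{path} is not an ELF file")
    # Byte 5 (EI_DATA): 1 = little-endian, 2 = big-endian
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown (0x{machine:x})")
```

Calling elf_machine("target/aws-lambda-test-1.0.0-SNAPSHOT-runner") on the binary described above should report aarch64, matching the file output.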

Over the weekend I read about multistage Docker builds, and I managed to work around the problem by doing the following:

  1. Create a Dockerfile.multistage file as described at https://quarkus.io/guides/building-native-image#multistage-docker
  2. Run docker -D build -f src/main/docker/Dockerfile.multistage -t aws/aws-lambda-test-multistage .
  3. Create a repository on AWS: aws ecr create-repository --repository-name <repository-name>
  4. Tag the docker build: docker tag <docker build sha> <aws-account-nr>.dkr.ecr.<aws-region>.amazonaws.com/<repository-name>:v1
  5. Log in to AWS: aws ecr get-login-password --region <aws-region> | docker login --username AWS --password-stdin <aws-account-nr>.dkr.ecr.<aws-region>.amazonaws.com
  6. docker push <aws-account-nr>.dkr.ecr.<aws-region>.amazonaws.com/<repository-name>:v1
  7. Go to AWS Management Console / Lambda / Functions -> Create function...
  8. Choose to create from Container Image and select the arm64 architecture

Today I'm trying to get it working from the CLI by adjusting the ./target/manage.sh script, but I haven't gotten anywhere yet. I built a native image with quarkus build --native --no-tests -Dquarkus.native.container-build=true and edited ./target/manage.sh to add --architectures arm64 to the aws lambda create-function call. I had to comment out RUNTIME=provided on line 73 because it was reported as not compatible with arm64 for some reason.

But still no success:

> ./target/manage.sh native invoke

Invoking function
++ aws lambda invoke response.txt --cli-binary-format raw-in-base64-out --function-name AwsLambdaTestNative --payload file://payload.json --log-type Tail --query LogResult --output text
++ base64 --decode
START RequestId: f13655f3-45c8-423f-9e55-99012cf5a129 Version: $LATEST
Class not found: io.quarkus.amazon.lambda.runtime.QuarkusStreamHandler: java.lang.ClassNotFoundException
java.lang.ClassNotFoundException: io.quarkus.amazon.lambda.runtime.QuarkusStreamHandler
	at java.base/java.net.URLClassLoader.findClass(Unknown Source)
	at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
	at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
	at java.base/java.lang.Class.forName0(Native Method)
	at java.base/java.lang.Class.forName(Unknown Source)
	at java.base/java.lang.Class.forName(Unknown Source)

END RequestId: f13655f3-45c8-423f-9e55-99012cf5a129
REPORT RequestId: f13655f3-45c8-423f-9e55-99012cf5a129	Duration: 583.06 ms	Billed Duration: 584 ms	Memory Size: 256 MB	Max Memory Used: 95 MB	Init Duration: 264.63 ms
{"errorMessage":"Class not found: io.quarkus.amazon.lambda.runtime.QuarkusStreamHandler","errorType":"java.lang.ClassNotFoundException"}%

So while the multistage Docker build approach works, I wonder whether it's still possible to get it done via the manage.sh script.

Do you have perhaps any ideas what to try out?

EDIT:

To add to this - the same build works if not done natively:

  1. quarkus build
  2. ./target/manage.sh create
  3. ./target/manage.sh invoke

Result:

Invoking function
++ aws lambda invoke response.txt --cli-binary-format raw-in-base64-out --function-name AwsLambdaTest --payload file://payload.json --log-type Tail --query LogResult --output text
++ base64 --decode
__  ____  __  _____   ___  __ ____  ______
--/ __ \/ / / / _ | / _ \/ //_/ / / / __/
-/ /_/ / /_/ / __ |/ , _/ ,< / /_/ /\ \
--\___\_\____/_/ |_/_/|_/_/|_|\____/___/
2024-02-12 12:45:48,977 INFO  [io.quarkus] (main) aws-lambda-test 1.0.0-SNAPSHOT on JVM (powered by Quarkus 3.7.2) started in 1.105s.
2024-02-12 12:45:48,978 INFO  [io.quarkus] (main) Profile prod activated.
2024-02-12 12:45:48,979 INFO  [io.quarkus] (main) Installed features: [amazon-lambda, cdi]
START RequestId: 0ac72d5f-ba0a-4604-868e-0f4179eae053 Version: $LATEST
END RequestId: 0ac72d5f-ba0a-4604-868e-0f4179eae053
REPORT RequestId: 0ac72d5f-ba0a-4604-868e-0f4179eae053	Duration: 129.53 ms	Billed Duration: 130 ms	Memory Size: 256 MB	Max Memory Used: 129 MB	Init Duration: 1460.61 ms
"Hello Bill"%

So there's something going on when doing a native build.

bredlej avatar Feb 12 '24 12:02 bredlej

> Do you have perhaps any ideas what to try out?

Unfortunately I am not familiar with AWS lambdas and the referenced scripts, so someone else will need to step in. Perhaps @matejvasek or @patriot1burke

zakkak avatar Feb 13 '24 18:02 zakkak

Did you try to add

        Architectures:
          - arm64

to sam.native.yaml (see https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-resource-function.html#sam-resource-function-properties for more details)?

java17 has been made default in the meantime so this is no longer an issue...
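For the manage.sh / CLI route tried earlier in the thread, the counterpart of the Architectures property is the --architectures flag of aws lambda create-function. The earlier ClassNotFoundException appears consistent with the function having been created on a Java runtime: with RUNTIME=provided commented out, the script presumably kept its JVM runtime and handler, which cannot load a class from a zip that only holds a native bootstrap executable. The legacy provided runtime is x86_64-only, while provided.al2 and provided.al2023 accept arm64. The Python sketch below assembles such a call under those assumptions; the helper name and the role ARN placeholder are hypothetical, and this is not what manage.sh generates.

```python
import shlex


def create_function_args(function_name, role_arn, zip_path,
                         architecture="arm64", runtime="provided.al2023"):
    """Build the argv list for `aws lambda create-function` for a native zip."""
    if runtime == "provided" and architecture != "x86_64":
        # The legacy `provided` runtime only runs on x86_64
        raise ValueError("runtime 'provided' does not support " + architecture)
    return [
        "aws", "lambda", "create-function",
        "--function-name", function_name,
        "--zip-file", "fileb://" + zip_path,
        # Handler value as shown in the sam local invoke log above;
        # zip packages need some value even on a custom runtime
        "--handler", "not.used.in.provided.runtime",
        "--runtime", runtime,
        "--architectures", architecture,
        "--role", role_arn,
    ]


args = create_function_args(
    "AwsLambdaTestNative",
    "arn:aws:iam::<aws-account-nr>:role/<role-name>",  # placeholder role ARN
    "target/function.zip",
)
print(shlex.join(args))
```

Which of the two newer runtimes suits a given binary also depends on the glibc version it was linked against, so the default of provided.al2023 here is only a starting point.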

deki avatar Jun 03 '24 07:06 deki

Hello, @bredlej, There is very little magic going on. Quarkus just creates a zip file with your executable and configuration; that zip file is then uploaded to AWS, and AWS just runs the executable on a Linux system. That's it. So it matters which Linux system the executable is built for, in terms of both CPU architecture and glibc version. I encountered this kind of issue when I was testing https://github.com/quarkusio/quarkus/issues/23998

You need to configure your AWS Lambda to use, e.g., Linux 1 or Linux 2, depending on what you target. You also need to configure it correctly for either amd64 (Intel) or aarch64 (ARM).

Your local macOS container build most likely uses a Linux aarch64 builder image, so your binary would be a Linux aarch64 one.
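As a rough illustration of that point, the Lambda architecture to select can be derived from the machine type of the host doing the build. This Python sketch assumes the container build produces a binary for the host's CPU architecture unless a different platform is forced; the function name lambda_architecture is hypothetical.

```python
import platform

# Map common machine names to the architecture names AWS Lambda uses
LAMBDA_ARCH = {
    "arm64": "arm64", "aarch64": "arm64",
    "x86_64": "x86_64", "amd64": "x86_64",
}


def lambda_architecture(machine=None):
    """Return the Lambda architecture matching a host machine name.

    With no argument, the current host's machine type is used.
    """
    name = (machine or platform.machine()).lower()
    try:
        return LAMBDA_ARCH[name]
    except KeyError:
        raise ValueError(f"unrecognised machine type: {name}") from None


# An Apple Silicon Mac reports "arm64", as in the uname output above
print(lambda_architecture("arm64"))
```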

I am closing this as a non-Quarkus issue.

Karm avatar Jun 03 '24 08:06 Karm

It would be nice to have a short note about this ARM quirk somewhere in the Quarkus documentation.

Perhaps here: https://quarkus.io/guides/aws-lambda-http#deploying-a-native-executable

Or here: https://quarkus.io/guides/aws-lambda-http#build-and-deploy

Where it's already telling you that you need to run a different command on Mac.

Thanks to your help I was able to successfully deploy a native executable to Lambda, so we know it works. All I had to do was switch the architecture to ARM in my CDK deployment config.

rrowlands avatar Jun 26 '24 14:06 rrowlands

@rrowlands You're probably the best person to add such a note. Fancy sending us a PR?

galderz avatar Jul 08 '24 10:07 galderz