aws-cdk

(lambda-python): arm64 architecture is not respected

Open kornicameister opened this issue 3 years ago • 20 comments

What is the problem?

Hey,

I'm not quite sure whether this is a problem with lambda, with codepipelines, or with cdk at all. I've been trying to understand it but I just couldn't. Here's what is happening.

I have a pipelines.CodePipeline that is configured with code_build_defaults stating that I wish to run builds with LinuxBuildImage.AMAZON_LINUX_2_ARM_2. Later on, my lambda functions are configured as follows:

PythonFunction(..., runtime=Runtime.PYTHON_3_9, architecture=Architecture.ARM_64)

That being said, I expect that when CodeBuild runs my synth step it will run on arm64 and the lambda will be built using the same architecture. However, in the logs of the job I see this:

Status: Downloaded newer image for public.ecr.aws/sam/build-python3.9:latest
---> 7925e2bf1015
Step 3/8 : ARG PIP_INDEX_URL
---> [Warning] The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
---> Running in 47078dfc29f5
Removing intermediate container 47078dfc29f5
---> b95a315c94b2
Step 4/8 : ARG PIP_EXTRA_INDEX_URL
---> [Warning] The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
---> Running in 684bb68c4f18
Removing intermediate container 684bb68c4f18
---> 175da3dd341e
Step 5/8 : ARG HTTPS_PROXY
---> [Warning] The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
---> Running in 15976d15e0ae
Removing intermediate container 15976d15e0ae
---> 28715ae8a9d6
Step 6/8 : RUN pip install --upgrade pip
---> [Warning] The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
---> Running in 515be8ae3321
standard_init_linux.go:228: exec user process caused: exec format error
The command '/bin/sh -c pip install --upgrade pip' returned a non-zero code: 1
jsii.errors.JavaScriptError:

This seems to contradict my desired arm64 setup. I expected public.ecr.aws/sam/build-python3.9:latest-arm64 to be downloaded instead of public.ecr.aws/sam/build-python3.9:latest (which lacks the arm64 suffix).
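
A minimal sketch of the tag selection expected here, in plain Python. The helper name sam_build_image is hypothetical and this is not the CDK's implementation; it only encodes the two image names quoted in this issue (the unsuffixed latest tag, which the log shows resolving to linux/amd64, and the latest-arm64 tag).

```python
# Hypothetical helper, not part of aws-cdk: picks the SAM build image tag
# that matches the Lambda architecture, using only the tags quoted above.
SAM_BUILD_REPO = "public.ecr.aws/sam/build-{runtime}"


def sam_build_image(runtime: str, architecture: str) -> str:
    """Return the bundling image expected for a runtime/architecture pair."""
    repo = SAM_BUILD_REPO.format(runtime=runtime)
    if architecture == "arm64":
        return f"{repo}:latest-arm64"
    if architecture == "x86_64":
        # The log above shows the unsuffixed tag resolving to linux/amd64.
        return f"{repo}:latest"
    raise ValueError(f"unsupported architecture: {architecture!r}")


if __name__ == "__main__":
    print(sam_build_image("python3.9", "arm64"))
```

With architecture set to arm64 and runtime python3.9, this yields the latest-arm64 image named above rather than the one CodeBuild actually pulled.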

This happens with cdk 2.8.0. The last successful build I managed to run was with cdk 1.134.0.

PS. My stack also defines an AwsCustomResource, which is another lambda deployment. Might it be that arm64 and the custom resource lambda conflict with each other?

Reproduction Steps

import typing as t

import aws_cdk as cdk
from aws_cdk import (
    aws_codebuild as cb,
    aws_codepipeline as cp,
    aws_iam as iam,
    aws_lambda as fn,
    aws_lambda_python_alpha as fn_python,  # CDK v2 ships this as the separate aws-cdk.aws-lambda-python-alpha package
    pipelines,
)
from constructs import Construct

class Pipeline(cdk.Stack):
    def __init__(
        self,
        scope: Construct,
        construct_id: str,
        **kwargs: t.Any,
    ) -> None:
        super().__init__(
            scope,
            construct_id,
            **kwargs,
        )

        connection_arn =  '' # some connection arn here
        repo_name = '' # some repo name here

        pipeline = pipelines.CodePipeline(
            self,
            'Pipeline',
            pipeline_name='IAMSetupPipeline',
            synth=pipelines.ShellStep(
                'Synth',
                input=pipelines.CodePipelineSource.connection(
                    repo_name,
                    'master',
                    connection_arn=connection_arn,
                ),
                install_commands=[
                    'npm install -g aws-cdk@latest',
                    'pip install -r requirements.txt',
                ],
                commands=['cdk synth pipeline'],
            ),
            code_build_defaults=pipelines.CodeBuildOptions(
                build_environment=cb.BuildEnvironment(
                    build_image=cb.LinuxBuildImage.AMAZON_LINUX_2_ARM_2,
                ),
            ),
            self_mutation=True,
            docker_enabled_for_synth=True,
        )

        # stages
        pipeline.add_stage(
            AppStage(
                self,
                'bug',
                env=cdk.Environment(
                    account='000000000000',
                    region='eu-central-1',
                ),
            ),
        )

        pipeline.build_pipeline()


class AppStage(cdk.Stage):
    def __init__(
        self,
        scope: Construct,
        stage_id: str,
        **kwargs: t.Any,
    ) -> None:
        super().__init__(scope, stage_id, **kwargs)
        LambdaStack(self, 'stack')

class LambdaStack(cdk.Stack):
    def __init__(
        self,
        scope: Construct,
        construct_id: str,
        **kwargs: t.Any,
    ) -> None:
        super().__init__(
            scope,
            construct_id,
            **kwargs,
        )
 
        fn_python.PythonFunction(
            self,
            'ARM',
            entry='/your/entry',
            runtime=fn.Runtime.PYTHON_3_9,
            architecture=fn.Architecture.ARM_64,
        )

app = cdk.App()
Pipeline(
    app,
    'Pipeline',
    env=cdk.Environment(
        account='000000000000',
        region='eu-central-1',
    ),
)
app.synth()

This code is not meant to be run as-is. It lacks the Python lambda code (this is what I have been using) and a proper account and region.

What did you expect to happen?

The stack is built correctly inside the deployed pipeline and ARM64 is used.

What actually happened?

The stack fails to build with the error shown in the first section.

CDK CLI Version

2.8.0

Framework Version

No response

Node.js Version

14.17.5

OS

macOS Big Sur

Language

Python

Language Version

3.10.1

Other information

If I messed something up, I am sorry :(

kornicameister avatar Jan 27 '22 18:01 kornicameister

I managed to successfully execute the CodeBuild step with pipelines deployed with cdk 2.2.0. In that case, something must have changed between 2.2.0 and 2.8.0.

kornicameister avatar Jan 28 '22 10:01 kornicameister

Although in that case... the lambda bundle is empty.

kornicameister avatar Jan 28 '22 15:01 kornicameister

If the Docker image is built on an M1 chip and uploaded to be deployed by Fargate or another similar AWS service then you’ll notice this container error:

standard_init_linux.go:228: exec user process caused: exec format error

There are a couple of ways to work around this. You can either:

  • Build your docker image using:
docker buildx build --platform=linux/amd64 -t image-name:version .
  • Update your Dockerfile’s FROM statements with
FROM --platform=linux/amd64 BASE_IMAGE:VERSION

ryparker avatar Jan 31 '22 21:01 ryparker

Yeah, but the point here is that I am not using Fargate. The problem is all about cdk running within CodeBuild and building my lambda functions, and that process failing because I want my lambda to be arm64 (not amd64).

kornicameister avatar Jan 31 '22 22:01 kornicameister

Are you building with an M1 chip?

ryparker avatar Jan 31 '22 22:01 ryparker

How can I know that? I just set up pipelines.CodePipeline to use codebuild.LinuxBuildImage.AMAZON_LINUX_2_ARM_2 to build a lambda.Function defined with architecture=lambda.Architecture.ARM_64. I don't see how this question is relevant.

kornicameister avatar Jan 31 '22 22:01 kornicameister

The image is built locally on the computer you're running cdk deploy from. We've noticed this type of error when using an M1 chip to build the image. You typically won't see the error during deploy, but you will notice it in AWS when the container is run.

ryparker avatar Jan 31 '22 22:01 ryparker

Locally I am building the lambda using amd64. I see it was not clear in the original message (sorry), but I wrote:

However in the logs of job I see this:

That was my vague attempt to point out that the logs and the failure occur within CodeBuild.

kornicameister avatar Jan 31 '22 22:01 kornicameister

In other words, I am not deploying my stacks (and thus the lambda) from my local machine (my laptop). I am using pipelines, and therefore a combination of AWS Code Tools services such as AWS CodePipeline, AWS CodeBuild, and others.

kornicameister avatar Jan 31 '22 22:01 kornicameister

Oh gotcha. Let me try to reproduce this on my end. I'll get back to you soon.

ryparker avatar Jan 31 '22 22:01 ryparker

Ok I was able to reproduce your issue and found a solution:

Try updating the PythonFunction's runtime to use ARM 64 Python.

e.g.

fn_python.PythonFunction(
    self,
    'ARM',
    entry='lambda',
    runtime=fn.Runtime('python3.9:latest-arm64', fn.RuntimeFamily.PYTHON),  # <-- specify custom runtime
    architecture=fn.Architecture.ARM_64,
)

ryparker avatar Feb 01 '22 21:02 ryparker

Is it a solution or a workaround? That used to work without that piece of code in cdk@1 (1.134.0 to be exact). I have another stack I haven't migrated to cdk@2 because of this issue.

kornicameister avatar Feb 01 '22 22:02 kornicameister

Not to mention that a "change" like this kind of defeats the purpose of the architecture property, because I have to specify the architecture again within the runtime. What if I specify different values in the two places?

kornicameister avatar Feb 01 '22 22:02 kornicameister

Is it a solution or a workaround? That used to work without that piece of code in cdk@1 (1.134.0 to be exact). I have another stack I haven't migrated to cdk@2 because of this issue. Not to mention that a "change" like this kind of defeats the purpose of the architecture property, because I have to specify the architecture again within the runtime. What if I specify different values in the two places?

That's a good point. If this worked before then we will leave this as a bug. The fix should set the runtime according to the provided architecture.

ryparker avatar Feb 01 '22 22:02 ryparker

@ryparker after some further testing I found that the stacks get deployed with 1.137.0. If I go just one version up, the stacks fail to deploy with the original error in the CodeBuild environment.

kornicameister avatar Feb 02 '22 05:02 kornicameister

@ryparker shouldn't this actually be p1? I'm not familiar with the assessment process, but based on other issues I get the impression that the cdk team assigns p1 when something that used to work stops working, which means higher priority and someone from the core team looking at it.

kornicameister avatar Feb 07 '22 10:02 kornicameister

@ryparker

Ok I was able to reproduce your issue and found a solution:

Try updating the PythonFunction's runtime to use ARM 64 Python.

e.g.

fn_python.PythonFunction(
    self,
    'ARM',
    entry='lambda',
    runtime=fn.Runtime('python3.9:latest-arm64', fn.RuntimeFamily.PYTHON),  # <-- specify custom runtime
    architecture=fn.Architecture.ARM_64,
)

@ryparker Could you help me with this problem? The following error occurs:

Resource handler returned message: "Value python3.9:latest-arm64 at 'runtime' failed to satisfy constraint: Member must satisfy enum value set: [nodejs12.x, python3.6, provided, nodejs14.x, ruby2.7, java11, dotnet6, go1.x, provided.al2, java8, java8.al2, dotnetcore3.1, python3.7, python3.8, python3.9] or be a valid ARN (Service: Lambda, Status Code: 400, Request ID: 3d9ba725-5e91-4195-b289-5c44e4f7d0c2, Extended Request ID: null)" (RequestToken: 2571186d-8ab7-c79a-8b05-4870981b5b74, HandlerErrorCode: InvalidRequest)

fn := awslambdapython.NewPythonFunction(stack, jsii.String("ddb-to-opensearch"), &awslambdapython.PythonFunctionProps{
	Architecture: awslambda.Architecture_ARM_64(),
	Runtime:      awslambda.NewRuntime(jsii.String("python3.9:latest-arm64"), awslambda.RuntimeFamily_PYTHON, nil),
	Entry:        jsii.String(codeDir),
	LogRetention: awslogs.RetentionDays_ONE_MONTH,
})

hxy1991 avatar Mar 08 '22 11:03 hxy1991

@hxy1991 This is a different error from the one OP posted and possibly only affects Go. Would you mind opening this in a new issue?

ryparker avatar Mar 09 '22 21:03 ryparker

fn := awslambdapython.NewPythonFunction(stack, jsii.String("ddb-to-opensearch"), &awslambdapython.PythonFunctionProps{
	Architecture: awslambda.Architecture_ARM_64(),
	Runtime:      awslambda.NewRuntime(jsii.String("python3.9:latest-arm64"), awslambda.RuntimeFamily_PYTHON, nil),
	Entry:        jsii.String(codeDir),
	LogRetention: awslogs.RetentionDays_ONE_MONTH,
})

We solved this problem by setting "Bundling".

fn := awslambdapython.NewPythonFunction(stack, jsii.String("ddb-to-opensearch"), &awslambdapython.PythonFunctionProps{
	Architecture: awslambda.Architecture_ARM_64(),
	Runtime:      awslambda.Runtime_PYTHON_3_9(),
	Entry:        jsii.String(codeDir),
	LogRetention: awslogs.RetentionDays_ONE_MONTH,

	// We solved this problem by setting "Bundling"
	Bundling: &awslambdapython.BundlingOptions{
		Image: awscdk.DockerImage_FromRegistry(jsii.String("public.ecr.aws/sam/build-python3.9:latest-arm64")),
	},

	……
})

hxy1991 avatar Mar 28 '22 09:03 hxy1991

@hxy1991 but that is a lot of redundant information in just 8 lines. You've repeated the Python version twice and the architecture twice as well. But it's a workaround of sorts.

kornicameister avatar Mar 28 '22 09:03 kornicameister

This also happens when building lambda layers, and unfortunately the workaround above does not work there.

ddhanak avatar Aug 14 '23 15:08 ddhanak

@kornicameister is it possible to create a repo with a project that reproduces this error? I wasn't immediately able to reproduce with the code in the original description. Thanks!

mikewrighton avatar Oct 12 '23 13:10 mikewrighton

@mikewrighton sorry but no. I am booked all day long and although still using cdk I have my hands full

kornicameister avatar Oct 12 '23 15:10 kornicameister

I believe I had a similar issue recently. Using TypeScript, I have a CDK Pipeline with a CodeBuild synth step running aws/codebuild/amazonlinux2-aarch64-standard:3.0 to bundle a PythonLambda with a target ARM64 architecture from @aws-cdk/aws-lambda-python-alpha.

The docs (https://docs.aws.amazon.com/cdk/api/v2/docs/aws-lambda-python-alpha-readme.html) say:

...and with the Docker platform based on the target architecture of the Lambda function.

Judging by the code and by my experiments, this doesn't seem to be the case. While bundling, the CodeBuild image outputs:

[Warning] The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested

This warning appears when executing the ARG and ENV commands from this Dockerfile; the RUN command then fails (as expected if the platform is wrong):

exec /bin/sh: exec format error
The command '/bin/sh -c python -m venv /usr/app/venv &&     mkdir /tmp/pip-cache &&     chmod -R 777 /tmp/pip-cache &&     pip install --upgrade pip &&     mkdir /tmp/poetry-cache &&     chmod -R 777 /tmp/poetry-cache &&     pip install pipenv==2022.4.8 poetry==$POETRY_VERSION &&     rm -rf /tmp/pip-cache/* /tmp/poetry-cache/*' returned a non-zero code: 1

And I can actually see the docker build command was executed with the wrong platform:

Command: docker build -t cdk-redacted-id --platform "linux/amd64" --build-arg "IMAGE=public.ecr.aws/sam/build-python3.11" "/codebuild/output..."
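
A check like the one done by eye above can be sketched in plain Python: pull the --platform value out of a logged docker command and compare it with the platform expected for the target architecture. The helper names and the architecture-to-platform mapping are assumptions for illustration, not CDK code.

```python
import shlex
from typing import Optional

# Assumed mapping from Lambda architecture name to Docker platform string.
EXPECTED_PLATFORM = {"arm64": "linux/arm64", "x86_64": "linux/amd64"}


def platform_from_command(command: str) -> Optional[str]:
    """Extract the --platform value from a logged docker command, if any."""
    args = shlex.split(command)
    for i, arg in enumerate(args):
        if arg == "--platform" and i + 1 < len(args):
            return args[i + 1]
        if arg.startswith("--platform="):
            return arg.split("=", 1)[1]
    return None


def platform_matches(command: str, architecture: str) -> bool:
    """True when the logged build platform matches the target architecture."""
    return platform_from_command(command) == EXPECTED_PLATFORM[architecture]


if __name__ == "__main__":
    logged = (
        'docker build -t cdk-redacted-id --platform "linux/amd64" '
        '--build-arg "IMAGE=public.ecr.aws/sam/build-python3.11" "/codebuild/output..."'
    )
    print(platform_from_command(logged))
    print(platform_matches(logged, "arm64"))
```

For the command logged above, the extracted platform is linux/amd64, which does not match an arm64 target.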

The contract of PythonFunction exposes BundlingOptions; however, the bundling itself uses BundlingProps, which accepts an architecture attribute.

Specifying the platform in the bundling options when creating the Lambda is ignored completely; the output is as above.

super(scope, id, {
      runtime: Runtime.PYTHON_3_11,
      bundling: {
        platform: 'linux/arm64',
      },
      architecture: Architecture.ARM_64,
      ...

But if I force the architecture (note the use of as any, since I can't technically pass architecture through BundlingOptions, only through BundlingProps):

super(scope, id, {
      runtime: Runtime.PYTHON_3_11,
      bundling: {
          architecture: Architecture.ARM_64,
      } as any,
      architecture: Architecture.ARM_64,
      ...

This works because the platform is chosen by the line platform: architecture.dockerPlatform, and architecture is defined just above it with the default architecture = Architecture.X86_64.

I think the contract should change: architecture should be an attribute of BundlingOptions, since in some situations you may want to target one architecture while building on another. Regardless, the current behaviour is not what the docs describe, so I would call it a bug.

To me it seems that a small change here would fix the bug; a separate feature request could then move architecture to BundlingOptions. Here is the change:

code: Bundling.bundle({
        entry,
        runtime,
        skip: !Stack.of(scope).bundlingRequired,
        architecture: props.architecture,  // define architecture based on the target architecture of the function
        ...props.bundling,
      }),
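
The precedence implied by that spread order can be modelled in plain Python. This is only a sketch under stated assumptions: resolve_bundling and the architecture-to-platform mapping are hypothetical stand-ins, not the CDK's TypeScript. The function's architecture replaces the X86_64 default, any key in the bundling options overrides it because it is applied later, and the platform is always derived from the resulting architecture.

```python
from typing import Any, Dict, Optional

# Assumed mapping from architecture name to Docker platform string.
DOCKER_PLATFORM = {"arm64": "linux/arm64", "x86_64": "linux/amd64"}


def resolve_bundling(
    function_architecture: Optional[str],
    bundling: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge bundling settings the way the proposed spread order would."""
    # Default used by the bundling code when no architecture is passed.
    merged: Dict[str, Any] = {"architecture": "x86_64"}
    if function_architecture is not None:
        # Models: architecture: props.architecture
        merged["architecture"] = function_architecture
    # Models: ...props.bundling (keys applied later win).
    merged.update(bundling or {})
    # The platform is derived from the architecture, so a platform key
    # supplied in the bundling options is overwritten here.
    merged["platform"] = DOCKER_PLATFORM[merged["architecture"]]
    return merged


if __name__ == "__main__":
    print(resolve_bundling("arm64"))
    print(resolve_bundling("arm64", {"architecture": "x86_64"}))
```

Under this model an ARM_64 function bundles for linux/arm64 with no extra options, while an explicit architecture in the bundling options still allows building for a different target.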

lucacucchetti avatar Nov 07 '23 10:11 lucacucchetti

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.

github-actions[bot] avatar Dec 21 '23 21:12 github-actions[bot]