
Python lambda is 2.6 MB when there are 0 dependencies

Open michaelbrewer opened this issue 3 years ago • 19 comments

Describe the bug: When creating a function with a very simple Python lambda, the generated lambda is over 2 MB

Amplify CLI Version

4.30.0

To Reproduce

Take a trivial lambda like this and build it:

def handler(event, context):
    print("event:", event)

Expected behavior: The deployed lambda should be only a few hundred bytes, not 2.6 MB
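To put a rough number on the expected size, here is a minimal sketch (Python standard library only) that zips just the handler in memory and reports the archive size. The file name `index.py` and the helper `zipped_size` are assumptions for illustration; this is not how Amplify builds its package.

```python
import io
import zipfile

# The handler from the reproduction steps above.
HANDLER_SOURCE = '''def handler(event, context):
    print("event:", event)
'''


def zipped_size(files: dict) -> int:
    """Return the size in bytes of an in-memory zip holding the given files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    # The archive is finalized when the with-block exits; the buffer stays open.
    return len(buffer.getvalue())


# "index.py" is an assumed file name for the handler module.
size = zipped_size({"index.py": HANDLER_SOURCE})
print(f"handler-only package: {size} bytes")
```

Anything far beyond this in the deployed artifact comes from files other than the handler itself.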

Screenshots

image

Desktop (please complete the following information):

  • OS: Mac
  • Node version: v14.13.1


michaelbrewer avatar Oct 13 '20 04:10 michaelbrewer

Amplify CLI uses pipenv to bundle the resources for the cloud. Pipenv brings in a virtualenv, which adds size to the bundle.

yuth avatar Oct 13 '20 20:10 yuth

@yuth
What is the benefit of this? Why not just use AWS SAM to handle the bundling of the lambda? Even CDK does not inflate the lambda by 2.6 MB with zero benefits.

And FYI, Amplify does not even support Python 3.8

michaelbrewer avatar Oct 14 '20 05:10 michaelbrewer

@michaelbrewer You're right that pipenv is overkill for small functions that only have a couple of dependencies, but it is good for managing functions with lots of dependencies. It is also a widely adopted package management tool in the Python ecosystem: https://packaging.python.org/tutorials/managing-dependencies/. We don't use AWS SAM because many of our customers don't use / have SAM installed.

We do support Python 3.8, so if you are having trouble there please open an issue with details of the problem.

edwardfoyle avatar Oct 14 '20 18:10 edwardfoyle

@edwardfoyle -

Amplify Console does not support Python 3.8: https://github.com/aws-amplify/amplify-console/issues/595

Most lambdas don't need many dependencies (and some have none at all outside of the AWS SDK).

So in most cases, just having a requirements.txt should be all you need.

Larger projects can use poetry.

michaelbrewer avatar Oct 14 '20 20:10 michaelbrewer

@edwardfoyle
I don't quite understand why the actual deployed lambda needs to be 2.6 MB. Aren't these build-time dependencies rather than runtime dependencies?

Here is an example Python lambda for a Cognito auth challenge:

def handler(event: dict, _):
    print("event", event)

    if len(event["request"]["session"]) == 0:
        event["request"]["challengeMetadata"] = "COOKIE_CHALLENGE"
        event["request"]["publicChallengeParameters"] = {}
        event["request"]["publicChallengeParameters"]["cookieName"] = "source"
        event["request"]["privateChallengeParameters"] = {}

    return event

I would expect the deployed lambda to be tiny

michaelbrewer avatar Oct 14 '20 21:10 michaelbrewer

Note how AWS CDK does this without bloating the lambda with unused libraries: https://github.com/aws/aws-cdk/tree/master/packages/%40aws-cdk/aws-lambda-python

As a developer you just need to have Docker installed.

michaelbrewer avatar Oct 20 '20 07:10 michaelbrewer

@edwardfoyle @yuth - will anyone look into fixing this?

I can see that AWS does care about the cost of cold starts (https://aws.amazon.com/blogs/developer/modular-aws-sdk-for-javascript-release-candidate/), and this seems to be inconsistent.

michaelbrewer avatar Nov 10 '20 05:11 michaelbrewer

@michaelbrewer We'll look into this. Sorry for the late response.

kaustavghosh06 avatar Dec 24 '20 19:12 kaustavghosh06

Maybe @heitorlessa has some input on how to support Python lambdas in a clean way. It would be nice to support SAM or do what CDK does when building Lambda functions (while keeping the size down)

michaelbrewer avatar Jan 21 '21 02:01 michaelbrewer

Hopefully the tools improve soon, as we are planning to build all of our lambdas in Python.

michaelbrewer avatar Jan 29 '21 19:01 michaelbrewer

@kaustavghosh06 while I don't have the bandwidth to contribute code, I'm happy to review or chat.

You can keep pipenv but only extract the dependencies themselves instead of bringing the whole virtualenv; it's not necessary.

pipenv here will also break when customers bring dependencies that rely on C extensions, since those need to be built within a Linux env, hence the Docker suggestion by Michael (a flag would do).
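As a rough illustration of "only extract the dependencies", the sketch below zips the function source plus the contents of a site-packages directory while leaving out the packaging tools a virtualenv ships. The names `build_package`, `should_skip`, and the `SKIP_TOP_LEVEL` list are hypothetical, and this is not what the Amplify CLI does today; it only shows the idea under those assumptions.

```python
import zipfile
from pathlib import Path

# Hypothetical skip list: tooling a virtualenv ships that a Lambda does not need at runtime.
SKIP_TOP_LEVEL = {"pip", "setuptools", "wheel", "pkg_resources", "_distutils_hack"}


def should_skip(relative: Path) -> bool:
    """Skip bytecode caches and the packaging tools (including their .dist-info folders)."""
    if "__pycache__" in relative.parts:
        return True
    top = relative.parts[0]
    base = top.split("-")[0]  # "pip-20.2.dist-info" -> "pip"
    return top in SKIP_TOP_LEVEL or (top.endswith(".dist-info") and base in SKIP_TOP_LEVEL)


def build_package(src_dir: Path, site_packages: Path, output_zip: Path) -> list:
    """Zip the source files and installed dependencies at the archive root.

    output_zip should live outside both input directories so it is not zipped into itself.
    Returns the archive member names that were written.
    """
    written = []
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for root in (src_dir, site_packages):
            for path in sorted(root.rglob("*")):
                relative = path.relative_to(root)
                if path.is_file() and not should_skip(relative):
                    archive.write(path, relative.as_posix())
                    written.append(relative.as_posix())
    return written
```

Note that this does nothing for the C-extension problem: compiled dependencies would still need to be installed inside a Linux environment before being zipped.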

heitorlessa avatar Jan 29 '21 20:01 heitorlessa

Thanks @heitorlessa for offering an ear for this.

I do like how SAM can bootstrap a lambda with the build tools and sample code:

sam init --location https://github.com/aws-samples/cookiecutter-aws-sam-python

Maybe this can be an option, @kaustavghosh06. When can we expect to at least not bring the whole virtualenv with the lambda?

michaelbrewer avatar Feb 03 '21 18:02 michaelbrewer

@kaustavghosh06 what is the timeline on this? Otherwise amplify function build does not seem to work, it just hangs without completing.

michaelbrewer avatar Feb 13 '21 12:02 michaelbrewer

RE: amplify function build failing

Oh, I see that this is a separate issue (which is NOT being fixed):

  • https://github.com/aws-amplify/amplify-cli/issues/6159

michaelbrewer avatar Feb 13 '21 12:02 michaelbrewer

@eddiekeller @kaustavghosh06 - is anyone looking into this? How easy would it be to fix this ourselves in the CLI?

michaelbrewer avatar Jul 09 '21 20:07 michaelbrewer

How about doing something along the lines of:

# Create a requirements.txt in src/ (which can be gitignored)
pipenv lock -r > src/requirements.txt

Then build/download the dependencies using docker image lambci/lambda:build-python3.8

# Download the vendored deps
pip install -r requirements.txt -t /vendored && cp -au . /vendored
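The two steps above could be wired together roughly as follows. This is only a sketch: the function name `vendoring_command`, the host paths `src` and `build/vendored`, and the mount points `/var/task` and `/vendored` are assumptions, and the script only assembles and prints the `docker run` command rather than executing it.

```python
import shlex
from pathlib import Path

# Build image named in the comment above.
BUILD_IMAGE = "lambci/lambda:build-python3.8"


def vendoring_command(src_dir: Path, out_dir: Path) -> list:
    """Assemble a docker command that installs requirements into a mounted output directory.

    Assumed layout: src_dir (containing requirements.txt) is mounted at /var/task,
    and out_dir is mounted at /vendored to receive the dependencies and the source.
    """
    inner = "pip install -r requirements.txt -t /vendored && cp -au . /vendored"
    return [
        "docker", "run", "--rm",
        "-v", f"{src_dir.resolve()}:/var/task",
        "-v", f"{out_dir.resolve()}:/vendored",
        "-w", "/var/task",
        BUILD_IMAGE,
        "sh", "-c", inner,
    ]


command = vendoring_command(Path("src"), Path("build/vendored"))
print(shlex.join(command))
# To actually run it (requires Docker): subprocess.run(command, check=True)
```

Running the install inside the Linux build image is what would let dependencies with compiled extensions match the Lambda runtime.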

michaelbrewer avatar Jul 10 '21 05:07 michaelbrewer

Found a hack to make Amplify stop building all these unneeded dependencies into my Python functions. This will only work for functions whose dependencies are all already provided by Lambda layers, as it completely stops the function build process. There is an amplify.state file inside each function. If you don't want Amplify to build your function, modify the amplify.state file to make Amplify "think" it's a Node.js function.

amplify.state (original, will build virtualenv into the deployed package)

{
  "pluginId": "amplify-python-function-runtime-provider",
  "functionRuntime": "python",
  "useLegacyBuild": false,
  "defaultEditorFile": "src/index.py"
}

amplify.state (new - will NOT build any python packages but otherwise the Lambda function will work the same in AWS)

{
  "pluginId": "amplify-nodejs-function-runtime-provider",
  "functionRuntime": "nodejs",
  "useLegacyBuild": false,
  "defaultEditorFile": "src/index.py"
}
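For anyone applying this workaround to several functions, the edit can be scripted. The sketch below is an assumption-laden illustration, not an official Amplify tool: the helper name `disable_python_build` is hypothetical, and it only rewrites the two fields shown above in whichever amplify.state file you point it at.

```python
import json
from pathlib import Path

# Plugin id taken from the modified amplify.state shown above.
NODE_PLUGIN = "amplify-nodejs-function-runtime-provider"


def disable_python_build(state_file: Path) -> dict:
    """Rewrite an amplify.state file so the function is treated as a Node.js function.

    Only pluginId and functionRuntime are changed; every other key is preserved.
    """
    state = json.loads(state_file.read_text())
    state["pluginId"] = NODE_PLUGIN
    state["functionRuntime"] = "nodejs"
    state_file.write_text(json.dumps(state, indent=2) + "\n")
    return state
```

The same caveat applies as for the manual edit: with the build step skipped, any dependency not supplied by a layer will be missing at runtime.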

samjett247 avatar Oct 11 '21 01:10 samjett247

Same Issue Here

tmirun avatar Dec 31 '21 17:12 tmirun

The comment linked below describes an alternative way to run pipenv that installs the dependencies without copying the virtualenv. Ideally the Amplify project would use this method instead:

https://github.com/pypa/pipenv/issues/746#issuecomment-416475131

joekiller avatar Jul 21 '22 17:07 joekiller

Amplify is still including the pipenv venv bundle in function deployments? Is this a joke?

speedhawk21 avatar Apr 27 '23 22:04 speedhawk21