serverless-python-requirements
Can't noDeploy numpy when pandas in requirements.txt
Apologies if this is a duplicate; I tried to search for existing issues. Also, I really appreciate the great work that went into this plugin. I used serverless 1-2 years ago, before layers, when using libraries like pandas on AWS was a significant pain.
I am using Python and trying to use pandas on AWS Lambda. I have managed to make everything work with a minimal requirements.txt:
Bottleneck==1.2.1
certifi==2019.9.11
numexpr==2.7.0
numpy==1.17.2
pandas==0.25.1
python-dateutil==2.8.0
pytz==2019.3
six==1.12.0
However, even if I specify numpy in noDeploy option, it still seems to be appearing in .requirements.zip

Here's my serverless.yml file section on serverless-python-requirements
custom:
  pythonRequirements:
    fileName: requirements.txt
    dockerizePip: true
    useStaticCache: false
    useDownloadCache: false
    zip: true # Compresses the libraries in additional file and adds unzip_requirements.py in the final bundle.
    slim: true # Removes unneeded files and directories such as *.so, *.pyc, dist-info, etc.
    noDeploy: # Omits certain packages from deployment.
      - boto3
      - botocore
      - docutils
      - jmespath
      - pip
      - python-dateutil
      - s3transfer
      - setuptools
      - six
      - numpy
    layer: true
My goal is to deploy this layer without numpy to save space, and then use the built-in AWS SciPy NumPy layer. I am not a requirements.txt expert, as I tend to use conda rather than pip.
Am I doing something wrong?
Thanks so much for all the work on this, Aaron
did you solve it?
I see what's going on. The current noDeploy implementation only excludes the package from the requirements.txt that the plugin generates. Since pandas requires numpy, pip adds numpy back during installation, so noDeploy has no effect.
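To illustrate the behaviour described above, here is a minimal sketch in Python. It is not the plugin's code: the `DEPENDS_ON` map and both helper functions are hypothetical and only simulate what an installer does when it resolves transitive dependencies.

```python
# Hypothetical dependency map, for illustration only.
# A real installer reads this information from package metadata.
DEPENDS_ON = {
    "pandas": ["numpy", "python-dateutil", "pytz"],
    "python-dateutil": ["six"],
    "numpy": [],
    "pytz": [],
    "six": [],
}


def filter_requirements(requirements, no_deploy):
    """Drop noDeploy names from the top-level requirements list only."""
    return [name for name in requirements if name not in no_deploy]


def resolve(requirements):
    """Return everything an installer would pull in, transitive deps included."""
    installed = set()
    stack = list(requirements)
    while stack:
        name = stack.pop()
        if name not in installed:
            installed.add(name)
            stack.extend(DEPENDS_ON.get(name, []))
    return installed


requirements = ["numpy", "pandas", "python-dateutil", "pytz", "six"]
no_deploy = {"numpy", "python-dateutil", "six"}

filtered = filter_requirements(requirements, no_deploy)
installed = resolve(filtered)

print(filtered)           # ['pandas', 'pytz']
print(sorted(installed))  # ['numpy', 'pandas', 'python-dateutil', 'pytz', 'six']
```

Filtering the requirements list removes numpy as a direct requirement, but resolving pandas brings it straight back.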
A fix is not so simple, because we use pip install -t and there is no corresponding uninstall command for that. Maybe we should add a config option to explicitly exclude files from the resulting zip?
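One way such an explicit exclusion could look is a pruning step that runs after the install and before zipping. This is only a sketch under assumptions: `prune_packages` is a hypothetical helper, not something the plugin provides, and it assumes a package `foo` appears in the target directory as `foo/`, `foo.py`, or `foo-<version>.dist-info/`. Real distributions do not always follow that rule (python-dateutil installs a `dateutil/` directory, for example).

```python
import shutil
import tempfile
from pathlib import Path


def prune_packages(target_dir, no_deploy):
    """Delete top-level entries that appear to belong to a noDeploy package.

    Heuristic only: matches `<pkg>/`, `<pkg>.py` and `<pkg>-*.dist-info/`.
    Returns the names of the entries that were removed.
    """
    removed = []
    for entry in sorted(Path(target_dir).iterdir()):
        for pkg in no_deploy:
            is_match = (
                entry.name == pkg
                or entry.name == pkg + ".py"
                or (entry.name.startswith(pkg + "-")
                    and entry.name.endswith(".dist-info"))
            )
            if is_match:
                if entry.is_dir():
                    shutil.rmtree(entry)
                else:
                    entry.unlink()
                removed.append(entry.name)
                break
    return removed


# Demo on a throwaway directory that mimics the layout after `pip install -t`.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for dirname in ("numpy", "numpy-1.17.2.dist-info", "pandas"):
        (root / dirname).mkdir()
    (root / "six.py").write_text("# stub\n")

    removed = prune_packages(root, {"numpy", "six"})
    remaining = sorted(p.name for p in root.iterdir())

print(removed)    # ['numpy', 'numpy-1.17.2.dist-info', 'six.py']
print(remaining)  # ['pandas']
```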
(I had to exclude typing because of a similar issue.)
I found a very dirty workaround for that:
slim: true # otherwise slimPatterns will not work
strip: false # avoid some ELF alignment issues
slimPatternsAppendDefaults: false
slimPatterns:
  # noDeploy alone won't work here, since
  # dependencies pull numpy back in
  - numpy/**
  # Excluding **/*.dist-info* may cause trouble
  - '**/*.py[c|o]'
  - '**/__pycache__*'
YMMV.
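To check whether a workaround like this actually kept a package out, you can list the top-level entries of the resulting archive. The sketch below uses only the standard library. The helper names are hypothetical, and the demo builds a small in-memory zip instead of reading a real build artifact; in practice you would pass the path to the zip produced by your own build.

```python
import io
import zipfile


def top_level_entries(zip_file):
    """Return the sorted top-level names inside a zip archive."""
    with zipfile.ZipFile(zip_file) as archive:
        return sorted({name.split("/", 1)[0] for name in archive.namelist()})


def leaked_packages(zip_file, unwanted):
    """Return the unwanted top-level names that are still in the archive."""
    return [name for name in top_level_entries(zip_file) if name in unwanted]


# Demo: an in-memory archive standing in for the packaged requirements.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("numpy/__init__.py", "")
    archive.writestr("pandas/__init__.py", "")
buffer.seek(0)

found = leaked_packages(buffer, {"numpy", "boto3"})
print(found)  # ['numpy']
```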
I'm removing the "bug" classification from this - since it seems like the issue was with transitive dependencies of pandas.
Have you tried packaging pandas along with numpy in your layer and excluding it from the requirements.txt, to see if that works out?
In my case, I'm using a package called sqlalchemy-aurora-data-api which depends on boto3. Unfortunately, this means that my Lambdas will deploy with boto3 even though the library is already included in the Lambda environment. While this may not be a bug, I do think it's a bit unintuitive that noDeploy can't handle transitive dependencies. I'm currently using the workaround @littlebtc provided.
@miketheman I respectfully disagree, this is a bug. The attribute noDeploy of pythonRequirements is currently documented in the README as:
You can omit a package from deployment with the noDeploy option. Note that dependencies of omitted packages must explicitly be omitted too.
As the original comment indicates, this is not the case when another package has numpy (or other omitted packages) as its dependency. We have packaged pandas, numpy, and scipy in our layer, but we cannot use any library that depends on them (e.g., statsmodels), because then we'd run into this issue.
While @littlebtc 's solution works, it would be nice if noDeploy prevented packages from being deployed.