Option to not install function node_modules on push

Open javamonn opened this issue 4 years ago • 30 comments

Describe the bug Amplify appears to be automatically installing function node_modules on push.

I have several Node.js lambda functions. I package the lambda entry point with required run-time dependencies as a pre-push step in order to reduce the published zip size. All of these dependencies are included under devDependencies instead of dependencies as they are not technically needed at run-time - they've been packaged up.

When I execute amplify push or amplify function push, I see all dependencies installed under each function shortly after executing the command. I imagine this was a decision made to ease deployment, but it creates issues in my situation where I consciously do not want the extra dependencies included in the Lambda zip, as I run into a Lambda size error (Unzipped size must be smaller than 262144000 bytes). If the extra install step is deliberate, is there a reason why it isn't a production install, i.e. npm install --production, so that devDependencies are not installed?

Amplify CLI Version 4.23.0

To Reproduce

  1. Create a Lambda function with a devDependency
  2. Execute amplify push
  3. Observe that the devDependency has been installed within the function node_modules.

Expected behavior Amplify either does not automatically install node_modules, or makes the behavior configurable via a CLI argument.

Desktop (please complete the following information):

  • OS: Debian 10
  • Node Version: 12.13.1

Additional context I see there is an amplify function build that automatically installs function node_modules. Is this command getting run on push?

javamonn avatar Jul 14 '20 13:07 javamonn

@javamonn - I might need more context here, but have you considered moving your dependencies into a Lambda layer? https://docs.amplify.aws/cli/function/layers - It would speed up your deployments and you wouldn't need to hassle with any custom pre-deployment steps. The total unzipped size of your functions + layers still cannot exceed 250MB though, so I'm not sure if this is applicable to your use case. cc @jhockett @UnleashedMind

renebrandel avatar Jul 14 '20 15:07 renebrandel

@renebrandel Thanks for the suggestion. I'm not sure Layers solve my issue, unless the build process with respect to dependencies is a little different than vanilla lambda functions.

My issue distills down to the fact that my development-time dependencies are larger than 250MB, but my expected set of run-time dependencies falls under the limit. An example package.json that encounters this problem can be seen here - it's only using devDependencies, not dependencies.

My build process for functions like this right now is roughly:

  1. Execute npm install for each function - i.e. install everything.
  2. Run each function through Webpack, building a single bundle.js file with all run-time dependencies inline, with exception of things that don't bundle well.
  3. Remove node_modules (as nearly everything has been bundled) and re-install the select set of production dependencies still needed at runtime (i.e. only non devDependencies) with npm install --production
  4. Run amplify push to deploy the functions.

At this point, each function comes in under the size limit. It seems like Amplify is running a separate npm install as part of the push process though for each function, which re-installs everything. I'd expect for that install to be configurable, or be a --production install by default, as it doesn't make sense to deploy development dependencies to a production environment.
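
For readers who want to script the pre-push steps listed above, here is a minimal Node.js sketch. It is not part of Amplify: the function directory, the `npx webpack --mode production` invocation and the helper names `buildSteps` and `runSteps` are all assumptions made for illustration. It only covers steps 1 to 3, and it does not stop `amplify push` from running its own install afterwards.

```javascript
// Hypothetical pre-push helper. Paths, helper names and the webpack
// invocation are illustrative assumptions, not Amplify behaviour.
const { execSync } = require("child_process");
const path = require("path");

// Returns the shell commands for one function directory, in order.
function buildSteps(functionSrcDir) {
  const cwd = path.resolve(functionSrcDir);
  return [
    "npm install",                   // 1. install everything, devDependencies included
    "npx webpack --mode production", // 2. bundle (assumes a webpack config in cwd)
    "rm -rf node_modules",           // 3a. drop the build-time modules (POSIX shell)
    "npm install --production",      // 3b. reinstall runtime dependencies only
  ].map((command) => ({ cwd, command }));
}

// Prints each step and executes it unless dryRun is true.
function runSteps(steps, { dryRun = true } = {}) {
  for (const { cwd, command } of steps) {
    console.log(`[${cwd}] ${command}`);
    if (!dryRun) execSync(command, { cwd, stdio: "inherit" });
  }
}

// Dry run against a hypothetical function directory; `amplify push`
// (step 4) would be run manually afterwards.
runSteps(buildSteps("amplify/backend/function/Test/src"));
```

Calling `runSteps(steps, { dryRun: false })` would actually execute the commands, so it is worth checking the printed plan first.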

javamonn avatar Jul 14 '20 16:07 javamonn

I agree, this needs some attention. Ideally I'd like to be able to specify not to run the npm/yarn install at all, but at minimum we should be able to specify that the install run with the production flag set. I agree with @javamonn, I don't know why that isn't the default.

We have a pre build script that runs a somewhat similar process to what @javamonn is describing. For us it's install, typescript compile out to a different folder, then install again to that folder with the production flag set, i.e. only dependencies. So we really don't need the install step as part of the amplifyPush at all, but we definitely don't want it to go ahead and dump all our devDependencies back in that we carefully pruned out.

djsmedes avatar Jul 24 '20 18:07 djsmedes

I noticed that our lambda sizes were bigger than expected (8mb vs 10kb) and I assume this is the issue? e.g. the aws sdk being installed and deployed?

Zanndorin avatar Jul 27 '20 09:07 Zanndorin

Seems likely. You can check by going to the lambda in the web console, actions -> export function, "download deployment package". Then inspect the zip file in a file browser or however you prefer.

djsmedes avatar Jul 27 '20 14:07 djsmedes

Yeah I have been down this path before :) Too lazy atm, I'll just assume that it is this likely problem :D

Zanndorin avatar Jul 27 '20 14:07 Zanndorin

I ended up working around this by moving the primary package.json up a directory in the path hierarchy, i.e. to the same level as the CloudFormation template for the function, and leaving the original package.json to capture only the required run-time dependencies. This gives me more control over the code that gets published to Lambda, while still working fine, as node's module resolution algorithm will traverse up the path hierarchy looking for node_modules.

The one odd thing about this is that I still need to prune the build-time node_modules in a post-build step, as otherwise amplify includes them in the overall environment state zip, which makes further deploys and environment pulls a lot slower.
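
The workaround above relies on Node looking for node_modules in every parent directory. As a rough illustration, the sketch below lists the directories Node would try, nearest first. The helper name `nodeModulesCandidates` and the example path are made up for this sketch, and it only mirrors the lookup order rather than calling Node's own resolver.

```javascript
const path = require("path");

// Lists the node_modules directories Node would try, nearest first,
// when resolving a bare import from a file inside startDir.
function nodeModulesCandidates(startDir, pathLib = path) {
  const candidates = [];
  let dir = pathLib.resolve(startDir);
  while (true) {
    // Node skips directories that are themselves named node_modules.
    if (pathLib.basename(dir) !== "node_modules") {
      candidates.push(pathLib.join(dir, "node_modules"));
    }
    const parent = pathLib.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return candidates;
}

// Hypothetical project layout; path.posix keeps the output stable on any OS.
console.log(
  nodeModulesCandidates("/project/amplify/backend/function/Test/src", path.posix)
);
```

With a package.json one level above src, the build-time modules end up in the second candidate (`.../Test/node_modules`), which is still found at run time locally.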

javamonn avatar Jul 30 '20 17:07 javamonn

We've worked around this by putting all our source files in a separate, sibling directory to src. We've named it pkg. Then adjusted our tsconfig.json settings such that the built js files get built into src. Then we copy over package.json, stripping out devDependencies with jq (which you have to install in your build script higher up with yum install jq). We also copy over yarn.lock for good measure. So we end up not needing to maintain multiple package.json files.
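
The devDependencies pruning described here is done with jq; the same idea could be sketched in Node.js as below. The helper names `stripDevDependencies` and `copyPrunedManifest` and the file locations are assumptions for illustration, not the commenter's actual script.

```javascript
const fs = require("fs");

// Returns a copy of a parsed package.json without its devDependencies.
function stripDevDependencies(pkg) {
  const { devDependencies, ...rest } = pkg;
  return rest;
}

// Reads one manifest and writes the pruned copy elsewhere
// (for example from pkg/package.json to src/package.json).
function copyPrunedManifest(fromFile, toFile) {
  const pkg = JSON.parse(fs.readFileSync(fromFile, "utf8"));
  const pruned = stripDevDependencies(pkg);
  fs.writeFileSync(toFile, JSON.stringify(pruned, null, 2) + "\n");
}
```

Because the pruned copy is generated on every build, only one package.json has to be maintained by hand.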

The final piece of the puzzle for us is a script in our package.json called "install for debug" that does the following:

yarn install --production --check-files --modules-folder=../src/node_modules && npm i --no-save --prefix ../ aws-sdk

This is run from pkg. It installs only the runtime dependencies into its sibling src/node_modules, then installs aws-sdk a level up, taking advantage of the node module resolution algorithm similar to your solution @javamonn. All this does is allow us to simulate the actual conditions of what packages will be available in the cloud when we're running the lambda locally, so it's not actually essential.


Despite @javamonn and I both finding functional workarounds, I don't think this issue should be closed. I get why there's an install step built into amplify push - you need your dependencies to be around so they can be zipped up into the deployment package - but it's baffling that it doesn't run in production mode and can't even be configured to do so.

Given how often people customize their build settings, it would be nice to have the option to just turn off the install part of amplify push entirely. Include a console warning that says "You better know what you're doing" if you use such an option if that makes it more likely to be implemented 🤷

djsmedes avatar Jul 30 '20 22:07 djsmedes

In my case, I run npm install in a Docker image to ensure that the linux (not darwin) build of Sharp (image processing) gets installed. When I push the function to AWS, it ends up with the darwin build. I can only imagine it's running a local npm install again before the push, rather than in Docker where I can ensure the correct build gets installed.

kevcam4891 avatar Aug 06 '20 02:08 kevcam4891

My main concern is preventing npm install from running (which appears to be called by installDependencies() in ./packages/amplify-nodejs-function-runtime-provider/src/utils/legacyBuild.ts) because as I mentioned above, I've already run npm install in a custom docker container lambci so that I can install the linux distro of sharp on my mac, rather than the darwin version.

I found that by manually updating the lastBuildTimeStamp property in amplify-meta.json for the function I built myself to any date AFTER I actually ran npm install, I can prevent amplify function push from running installDependencies(). I might also have to be careful to update the timestamp again if I modify anything else in the directory, like my own source code, because that might cause npm install to run again.

This might be a suitable workaround for others.
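
A minimal sketch of how that timestamp bump could be scripted follows. It assumes amplify-meta.json has the shape `{ "function": { "<name>": { "lastBuildTimeStamp": ... } } }` and that an ISO 8601 string is an acceptable value; both are assumptions, and the helper `markFunctionAsBuilt` is hypothetical rather than an Amplify API.

```javascript
// Returns a copy of the parsed amplify-meta.json with the function's
// lastBuildTimeStamp moved to builtAt. The file shape is an assumption.
function markFunctionAsBuilt(meta, functionName, builtAt = new Date()) {
  const entry = meta.function && meta.function[functionName];
  if (!entry) {
    throw new Error(`No function named ${functionName} in amplify-meta.json`);
  }
  return {
    ...meta,
    function: {
      ...meta.function,
      [functionName]: { ...entry, lastBuildTimeStamp: builtAt.toISOString() },
    },
  };
}
```

The result would still have to be written back to amplify-meta.json (for example with `fs.writeFileSync`) before running the push, and the edit may be overwritten by later CLI operations.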

I don't recall this being the behavior in past versions. I recall running npm install and it never getting called again during a push, and I haven't been able to spend time diffing the source code to see where the behavior changed.

kevcam4891 avatar Aug 06 '20 14:08 kevcam4891

After digging through cli versions, it appears the behavior changed between 4.16 and 4.17. The logic that determines whether installDependencies() gets called now checks whether ANY file in the lambda directory has changed (stat.mtime > lastBuildTimeStamp). Before, it would just check the mtime of package.json.
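
To make the difference concrete, here is a rough sketch of the two checks as described above. It is not the Amplify CLI source: the function names `packageJsonChanged` and `anyFileChanged` are invented for illustration, and the real implementation may skip some directories or differ in other details.

```javascript
const fs = require("fs");
const path = require("path");

// Older check as described: only package.json's mtime is compared.
function packageJsonChanged(dir, lastBuildTimeStamp) {
  return fs.statSync(path.join(dir, "package.json")).mtime > lastBuildTimeStamp;
}

// Newer check as described: any file newer than the last build counts.
function anyFileChanged(dir, lastBuildTimeStamp) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      if (anyFileChanged(full, lastBuildTimeStamp)) return true;
    } else if (fs.statSync(full).mtime > lastBuildTimeStamp) {
      return true;
    }
  }
  return false;
}
```

Under the second check, editing index.js alone is enough to mark the function as stale, which matches the re-install seen after source changes.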

I'm not sure what's better: being able to keep a node_modules folder static between multiple pushes (nice during dev), or forcing an install at any possible sign of the directory being "stale" (better for prod). But I'd like to have the ability to not run installDependencies when pushing.

Anyone on the AWS side have any particular way to lean on a solution for this? Maybe amplify push --skip-install-pkg-deps?

kevcam4891 avatar Aug 06 '20 15:08 kevcam4891

Nothing new here? I just want to be able to edit my minimal functions in the console but they are too large because of the devDependencies :(

Zanndorin avatar Aug 18 '20 14:08 Zanndorin

@Zanndorin Use Lambda Layers. 100%.

I was slow to the party as well, as I've just started implementing them this week in amplify. Read the docs for info, but layers should be used to store all your commonly used dependencies. Let your actual Lambda (not layer) serve as the bare bones business logic. Your function can reference modules/code in these layers. Your resulting Lambda is super small, which again exposes the editor in the Lambda screen.

So this comment actually removes any sort of issue I had with this functionality - and a reason I agreed this ticket was a "bug". I make a "Sharp" layer which needed to be built one time on a linux architecture. Then I reference that in other Lambdas, but changing my business logic doesn't cause me to have to rebuild Sharp. So cool 😄 .

@javamonn I'd recommend taking a fresh look at Layers as well. It may seem like more of a hassle at first, but I think this is your solution. More durable in the long run as well.

kevcam4891 avatar Aug 22 '20 20:08 kevcam4891

@kevcam4891 I understand your solution. But I have no dependencies. I have 1 dev dependency (aws-sdk). The lambda is 9.1mb. I can remove the devDependency and it would still work (and be about 1kb) but I would not be able to test it locally anymore.

{
  "name": "testfunction",
  "version": "2.0.0",
  "description": "Lambda function generated by Amplify",
  "main": "index.js",
  "license": "Apache-2.0",
  "devDependencies": {
    "aws-sdk": "^2.719.0"
  }
}

Zanndorin avatar Aug 24 '20 09:08 Zanndorin

@Zanndorin Have you tried using the built-in aws-sdk that comes with all node Lambda runtimes? It's not guaranteed to be the most up-to-date, but I've not come across a situation where it's been a problem. You can just import AWS from "aws-sdk"; and it should be usable in Lambda without a special install. Can you try that out and let me know?

EDIT: (Sorry my coffee is just now catching up.) Let me run a couple checks and I'll reply back in a minute.

kevcam4891 avatar Aug 24 '20 12:08 kevcam4891

@kevcam4891 Yes, of course I can remove it, and I do that sometimes. But then it doesn't run locally, i.e. you have to add the line back to package.json each time, which gets weird.

Zanndorin avatar Aug 24 '20 12:08 Zanndorin

I created a Lambda locally:

Scanning for plugins...
Plugin scan successful
? Select which capability you want to add: Lambda function (serverless function)
? Provide a friendly name for your resource to be used as a label for this category in the project: Test
? Provide the AWS Lambda function name: Test
? Choose the runtime that you want to use: NodeJS
? Choose the function template that you want to use: Hello World
? Do you want to access other resources in this project from your Lambda function? No
? Do you want to invoke this function on a recurring schedule? No
? Do you want to configure Lambda layers for this function? No
? Do you want to edit the local lambda function now? Yes
Please edit the file in your editor: /Users/[redacted]/amplify/backend/function/Test/src/index.js
? Press enter to continue 
Successfully added resource Test locally.

Here's my index.js:

const AWS = require("aws-sdk");

exports.handler = async (event) => {
  // test just to see if we have aws
  const sts = new AWS.STS();
  console.log(sts.getCallerIdentity());

  // TODO implement
  const response = {
    statusCode: 200,
    body: JSON.stringify("Hello from Lambda!"),
  };
  return response;
};

Then I call amplify mock function Test:

$ amplify mock function Test
? Provide the path to the event JSON object relative to /Users/[redacted]/amplify/backend/function/Test src/event.json
Starting execution...

I get proof that AWS works as expected.

Are you invoking it in some other way that doesn't use the Lambda runtime? Doing something like just a plain node index.js won't give you a Lambda runtime, so you might want to try using the Amplify way.

There's also another way to invoke Lambdas using "lambci/docker", which I've switched to. I won't go into it in this reply so as not to complicate things, but I want to at least make sure that at this juncture you get AWS tools the way I'm suggesting.

kevcam4891 avatar Aug 24 '20 12:08 kevcam4891

I'll check if the problem is between the monitor and the chair... 😊

Zanndorin avatar Aug 24 '20 12:08 Zanndorin

I should have mentioned, I do NOT add any dependencies or devDependencies in my process. Yes, good luck! Let me know.

kevcam4891 avatar Aug 24 '20 12:08 kevcam4891

Seems to work. It's a bit weird having those "notfound" errors in your IDE locally, and weird that I have to remove devDependencies, especially things Amplify adds itself, for example:

"devDependencies": {
    "aws-lambda-build": "^1.0.8",
    "aws-sdk": "^2.724.0"
  }

This should still be fixed imo...

Zanndorin avatar Aug 24 '20 12:08 Zanndorin

I hadn't tried building it, and yes, I'm experiencing the same. Not sure why aws-sdk is being built into dist/latest-build.zip when it is specified under devDependencies. Can someone from the dev team explain what's happening here? Seems like everything in node_modules is being included in the zip file.

kevcam4891 avatar Aug 24 '20 12:08 kevcam4891

@Zanndorin for now you might want to try including aws-sdk higher up in your project as a devDependency. My IDE (vscode) doesn't seem to throw not-found errors because it's seeing it higher up. Autocomplete works, etc. Then you can remove it completely from your Lambda folder.

kevcam4891 avatar Aug 24 '20 12:08 kevcam4891

Same problem. It's kind of frustrating since lambda functions have trouble dealing with native code dependencies. It's also frustrating since we have to comply with the 250MB limit, but we can't even specify the --production flag. Being able to configure their installation during amplify push would make a lot of sense.

I agree with @djsmedes and @kevcam4891. Specifying a no-dependency-install flag in one of the many config files (amplify-meta.json, amplify.state, cloudformation-template.json, etc) would give us the liberty to handle the installation ourselves.

It doesn't have to be a permanent solution. It can help us get unblocked while amplify comes up with a more complete alternative.

donalatorre avatar Jun 21 '21 15:06 donalatorre

So I am utilising ncc, which creates a single js file with all node_modules dependencies inlined. I don't need the node_modules at all.

ctrlplusb avatar Jul 14 '21 13:07 ctrlplusb

Looks like the meat of this issue was basically replicated in #7696 and fixed in #7812, which is live in CLI versions >= 5.2.1.

I say "the meat of" because the actual title requests, and some commenters in here (including me, many moons ago) requested the option to not install node_modules at all as part of a push. This is something that only an advanced user would have reason to do, if you somehow knew for sure that some other part of your custom build process would guarantee that the right dependency code was in place. It actually doesn't even apply to me anymore.

So this is now an ask for an advanced-user-only option that has the potential to be a trap for newbies - the actual runtime dependencies have to be there in node_modules in order to create a zip file that will actually result in a functioning lambda when you upload it.

djsmedes avatar Aug 03 '21 01:08 djsmedes

I'm having the same problem too. Can't we just have an option flag to skip yarn --production? So we can handle our custom install flow by using hooks? My problem is that yarn --production is not enough. I need to run yarn --production --frozen-lockfile --non-interactive --ignore-scripts after yarn install in order to just remove my devDependencies.

romeubertho avatar Sep 14 '21 22:09 romeubertho

Also struggling with this currently. My setup is using Turborepo and local dependencies. Now I'm trying to amplify push but it will throw an error as it can't find the local dependency in the NPM registry (obviously)...

flogy avatar Jul 02 '22 15:07 flogy

Since #10293 the yarn install command executed as part of amplify build function / amplify push is called with the --no-bin-links param.

If the function is part of a yarn workspace, this breaks executables across all packages in the workspace and, amongst other things, stops yarn run commands from working in the repo root, which then the amplify cli itself would rely on later during amplify publish.

My current workaround is to run a custom post-install build function to undo the damage done by the default install and do my own build instead.

This is another use case where an option to not yarn install during function builds (or the ability to customize it) would make life easier.

martoncsikos avatar Dec 09 '22 12:12 martoncsikos

Still struggling with this. It would be nice to be able to skip the build step for custom resources, so they can be built manually (e.g. with a turborepo setup) prior to pushing.

flogy avatar Jan 25 '23 09:01 flogy

Whoever is still struggling with this, it looks like the ability to override the build script executed during push has been silently released in v12.2.0.

https://github.com/aws-amplify/amplify-cli/issues/13107#issuecomment-1677711283

martoncsikos avatar Dec 18 '23 15:12 martoncsikos