serverless-ruby-layer

Proposal to speed up repeated Docker based deploys

Open jagthedrummer opened this issue 3 years ago • 1 comment

The Problem

When deploying with Docker we end up installing all the gems from scratch each time, even if the Gemfile hasn't changed. On my smallish demo project the bundle install step inside the running container takes about 90 seconds.

Proposed solution

If we moved the bundle install step (and the bundle config set ... steps) into the Dockerfile then we could take advantage of Docker layer caching and we'd only have to install gems when the Gemfile changes.

I've tested a very crude version of this change, and in my demo project a second full deploy (without changes to the Gemfile) takes about 2 minutes 30 seconds, compared to about 4 minutes with the currently implemented approach.

Complicating factors

Since we currently allow people to choose between using their own Dockerfile or an auto-generated one, we'd need to handle both cases, and we'd have to require people to handle the bundle config and bundle install steps in their own Dockerfile. That would be a non-backwards-compatible breaking change, so we'd probably want to consider bumping the MAJOR portion of the version number (bump to 2.0.0).

Example Dockerfile

Here's an example Dockerfile that is currently working with my demo project (and my hacked up modification to serverless-ruby-layer). The bits about creating and moving into /var/gem_build are there to accommodate other parts of the existing implementation, but since the resulting image won't be distributed for production it may not be necessary to do things in that directory, and we might be able to use the default working directory of /var/task just to keep things simple.

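# Lambda build image for Ruby 2.7.
FROM lambci/lambda:build-ruby2.7 AS base

# Project-specific system dependency (PostgreSQL development headers).
RUN yum install -y postgresql-devel

RUN gem update bundler

# Build directory expected by the existing plugin implementation.
RUN mkdir /var/gem_build

WORKDIR /var/gem_build

RUN bundle config set --local path build
RUN bundle config set --local without test development

# Copy only Gemfile and Gemfile.lock so the layers below are rebuilt only when they change.
COPY Gemfile* .

RUN bundle install

CMD "/bin/bash"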

And here's the output of a first build taking almost 2 minutes:

$ time docker build .
[+] Building 113.2s (15/15) FINISHED
 => [internal] load build definition from Dockerfile                                                            0.0s
 => => transferring dockerfile: 363B                                                                            0.0s
 => [internal] load .dockerignore                                                                               0.0s
 => => transferring context: 2B                                                                                 0.0s
 => [internal] load metadata for docker.io/lambci/lambda:build-ruby2.7                                          0.0s
 => CACHED [ 1/10] FROM docker.io/lambci/lambda:build-ruby2.7                                                   0.0s
 => [internal] load build context                                                                               0.0s
 => => transferring context: 61B                                                                                0.0s
 => [ 2/10] RUN echo "yo!"                                                                                      0.4s
 => [ 3/10] RUN yum install -y postgresql-devel                                                                13.5s
 => [ 4/10] RUN gem update bundler                                                                             12.2s
 => [ 5/10] RUN mkdir /var/gem_build                                                                            0.4s
 => [ 6/10] WORKDIR /var/gem_build                                                                              0.0s
 => [ 7/10] RUN bundle config set --local path build                                                            0.5s
 => [ 8/10] RUN bundle config set --local without test development                                              0.4s
 => [ 9/10] COPY Gemfile* .                                                                                     0.0s
 => [10/10] RUN bundle install                                                                                 84.0s
 => exporting to image                                                                                          1.5s
 => => exporting layers                                                                                         1.5s
 => => writing image sha256:b3e4ef459cc93a9a4d754447d6225b581a01a349f6507cc620fac2994aa62fe0                    0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them

real    1m54.092s
user    0m0.449s
sys     0m0.394s

And the output of a second build taking less than 1 second:

$ time docker build .
[+] Building 0.1s (15/15) FINISHED
 => [internal] load build definition from Dockerfile                                                            0.0s
 => => transferring dockerfile: 37B                                                                             0.0s
 => [internal] load .dockerignore                                                                               0.0s
 => => transferring context: 2B                                                                                 0.0s
 => [internal] load metadata for docker.io/lambci/lambda:build-ruby2.7                                          0.0s
 => [ 1/10] FROM docker.io/lambci/lambda:build-ruby2.7                                                          0.0s
 => [internal] load build context                                                                               0.0s
 => => transferring context: 61B                                                                                0.0s
 => CACHED [ 2/10] RUN echo "yo!"                                                                               0.0s
 => CACHED [ 3/10] RUN yum install -y postgresql-devel                                                          0.0s
 => CACHED [ 4/10] RUN gem update bundler                                                                       0.0s
 => CACHED [ 5/10] RUN mkdir /var/gem_build                                                                     0.0s
 => CACHED [ 6/10] WORKDIR /var/gem_build                                                                       0.0s
 => CACHED [ 7/10] RUN bundle config set --local path build                                                     0.0s
 => CACHED [ 8/10] RUN bundle config set --local without test development                                       0.0s
 => CACHED [ 9/10] COPY Gemfile* .                                                                              0.0s
 => CACHED [10/10] RUN bundle install                                                                           0.0s
 => exporting to image                                                                                          0.0s
 => => exporting layers                                                                                         0.0s
 => => writing image sha256:b3e4ef459cc93a9a4d754447d6225b581a01a349f6507cc620fac2994aa62fe0                    0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them

real    0m0.661s
user    0m0.193s
sys     0m0.139s

Lingering questions

Is there some reason I'm not aware of that doing bundle install as a Dockerfile layer won't work or would be problematic? Are there advantages to doing it after the container is built instead of as part of the container build process?

@navarasu If this sounds like an approach that you're open to investigating I'd be happy to submit a PR.

jagthedrummer Jul 22 '21 15:07

@jagthedrummer

Regarding improving the performance of repeated work, I have other ideas, such as:

  • Skip repeated deployment: It is not only the build time; deploying the layer is also slow. So I plan to add an option to skip deploying the layer.
  • Deploy only on change: Deploy only if the Gemfile has changed.
  • Alternatively, we can cache the gem layer zip itself, as other serverless plugins do.
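
The "deploy only on change" idea could be sketched roughly as follows. This is a minimal illustration in Python, not the plugin's implementation: the function names and the `.gem_layer_fingerprint` stamp file are hypothetical, and it assumes that hashing Gemfile and Gemfile.lock is enough to detect a dependency change.

```python
import hashlib
from pathlib import Path


def gemfile_fingerprint(project_dir, names=("Gemfile", "Gemfile.lock")):
    """Return a SHA-256 hex digest over the dependency files that exist."""
    digest = hashlib.sha256()
    for name in names:
        path = Path(project_dir) / name
        if path.is_file():
            # Mix in the file name so content moving between files changes the hash.
            digest.update(name.encode("utf-8"))
            digest.update(b"\0")
            digest.update(path.read_bytes())
            digest.update(b"\0")
    return digest.hexdigest()


def layer_needs_deploy(project_dir, stamp_file=".gem_layer_fingerprint"):
    """True when no fingerprint is recorded or the dependency files changed."""
    stamp = Path(project_dir) / stamp_file  # hypothetical stamp file name
    if not stamp.is_file():
        return True
    return stamp.read_text().strip() != gemfile_fingerprint(project_dir)


def record_deploy(project_dir, stamp_file=".gem_layer_fingerprint"):
    """Store the current fingerprint; call this only after a successful deploy."""
    stamp = Path(project_dir) / stamp_file
    stamp.write_text(gemfile_fingerprint(project_dir) + "\n")
```

Recording the fingerprint in a separate step means a failed deploy is retried on the next run rather than silently skipped.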

I like the Docker caching idea, but it does not support non-Docker use cases.

Assuming that most users use the Docker option, we can include this. But we need to see how to implement it without breaking existing use cases.

After releasing the current version, I will come up with a plan to add this without breaking existing use cases, or implement the no-deploy option directly, or offer both options with some config control.

navarasu Jul 30 '21 22:07