serverless-ruby-layer
Proposal to speed up repeated Docker based deploys
The Problem
When deploying with Docker we end up installing all the gems from scratch each time, even if the Gemfile hasn't changed. On my smallish demo project the bundle install step inside the running container takes about 90 seconds.
Proposed solution
If we moved the bundle install step (and the bundle config set ... steps) into the Dockerfile, then we could take advantage of Docker layer caching and we'd only have to install gems when the Gemfile changes.
I've tested a very crude version of this change, and in my demo project a second full deploy (without changes to the Gemfile) takes about 2 minutes 30 seconds, compared to about 4 minutes with the currently implemented approach.
Complicating factors
Since we currently allow people to choose between using their own Dockerfile or an auto-generated one, we'd need to handle both cases, and we'd have to require people to handle the bundle config and bundle install steps in their own Dockerfile. That would be a backwards-incompatible change, so we'd probably want to consider bumping the MAJOR portion of the version number (bump to 2.0.0).
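To make the two cases concrete, here is a minimal sketch in Python (used only for illustration, not the plugin's actual code). The names `render_dockerfile`, `missing_bundle_steps`, and `REQUIRED_STEPS` are hypothetical. The first function builds an auto-generated Dockerfile that mirrors the example in this proposal, and the second does a naive check that a user-supplied Dockerfile contains the bundle steps. It assumes each step sits on its own RUN line, so it would miss steps chained on continuation lines.

```python
# Hypothetical helper names; this is a sketch, not the plugin's implementation.
REQUIRED_STEPS = ("bundle config", "bundle install")


def render_dockerfile(base_image, system_packages=()):
    """Return Dockerfile text that installs gems as an image layer."""
    lines = [f"FROM {base_image}"]
    if system_packages:
        lines.append("RUN yum install -y " + " ".join(system_packages))
    lines += [
        "RUN mkdir /var/gem_build",
        "WORKDIR /var/gem_build",
        "RUN bundle config set --local path build",
        "RUN bundle config set --local without test development",
        # Copy only the Gemfile files so later layers are rebuilt
        # only when those files change.
        "COPY Gemfile* .",
        "RUN bundle install",
    ]
    return "\n".join(lines) + "\n"


def missing_bundle_steps(dockerfile_text):
    """List the required steps that no RUN line in the text mentions."""
    run_lines = [
        line.strip()
        for line in dockerfile_text.splitlines()
        if line.strip().upper().startswith("RUN ")
    ]
    return [
        step for step in REQUIRED_STEPS
        if not any(step in line for line in run_lines)
    ]


generated = render_dockerfile("lambci/lambda:build-ruby2.7", ["postgresql-devel"])
print(missing_bundle_steps(generated))
print(missing_bundle_steps("FROM lambci/lambda:build-ruby2.7\nRUN yum install -y postgresql-devel\n"))
```

A check like this could let the plugin warn people whose custom Dockerfile predates the change, instead of silently producing an empty layer.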
Example Dockerfile
Here's an example Dockerfile that is currently working with my demo project (and my hacked-up modification to serverless-ruby-layer). The bits about creating and moving into /var/gem_build are there to accommodate other parts of the existing implementation, but since the resulting image won't be distributed for production it may not be necessary to do things in that directory, and we might be able to use the default working directory of /var/task just to keep things simple.
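FROM lambci/lambda:build-ruby2.7 AS base
# System library needed to compile gems that link against PostgreSQL
RUN yum install -y postgresql-devel
RUN gem update bundler
# Build directory expected by other parts of the existing implementation
RUN mkdir /var/gem_build
WORKDIR /var/gem_build
# Install gems into ./build and skip the test and development groups
RUN bundle config set --local path build
RUN bundle config set --local without test development
# Copy only the files matching Gemfile* so this layer and the bundle install
# layer after it are rebuilt only when those files change
COPY Gemfile* .
RUN bundle install
CMD "/bin/bash"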
And here's the output of a first build taking almost 2 minutes:
$ time docker build .
[+] Building 113.2s (15/15) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 363B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/lambci/lambda:build-ruby2.7 0.0s
=> CACHED [ 1/10] FROM docker.io/lambci/lambda:build-ruby2.7 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 61B 0.0s
=> [ 2/10] RUN echo "yo!" 0.4s
=> [ 3/10] RUN yum install -y postgresql-devel 13.5s
=> [ 4/10] RUN gem update bundler 12.2s
=> [ 5/10] RUN mkdir /var/gem_build 0.4s
=> [ 6/10] WORKDIR /var/gem_build 0.0s
=> [ 7/10] RUN bundle config set --local path build 0.5s
=> [ 8/10] RUN bundle config set --local without test development 0.4s
=> [ 9/10] COPY Gemfile* . 0.0s
=> [10/10] RUN bundle install 84.0s
=> exporting to image 1.5s
=> => exporting layers 1.5s
=> => writing image sha256:b3e4ef459cc93a9a4d754447d6225b581a01a349f6507cc620fac2994aa62fe0 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
real 1m54.092s
user 0m0.449s
sys 0m0.394s
And the output of a second build taking less than 1 second:
$ time docker build .
[+] Building 0.1s (15/15) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 37B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/lambci/lambda:build-ruby2.7 0.0s
=> [ 1/10] FROM docker.io/lambci/lambda:build-ruby2.7 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 61B 0.0s
=> CACHED [ 2/10] RUN echo "yo!" 0.0s
=> CACHED [ 3/10] RUN yum install -y postgresql-devel 0.0s
=> CACHED [ 4/10] RUN gem update bundler 0.0s
=> CACHED [ 5/10] RUN mkdir /var/gem_build 0.0s
=> CACHED [ 6/10] WORKDIR /var/gem_build 0.0s
=> CACHED [ 7/10] RUN bundle config set --local path build 0.0s
=> CACHED [ 8/10] RUN bundle config set --local without test development 0.0s
=> CACHED [ 9/10] COPY Gemfile* . 0.0s
=> CACHED [10/10] RUN bundle install 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:b3e4ef459cc93a9a4d754447d6225b581a01a349f6507cc620fac2994aa62fe0 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
real 0m0.661s
user 0m0.193s
sys 0m0.139s
Lingering questions
Maybe there's some reason that I'm not aware of that doing bundle install as a Dockerfile layer won't work, or is problematic? Are there advantages to doing it after the container is built instead of as part of the container build process?
@navarasu If this sounds like an approach that you're open to investigating I'd be happy to submit a PR.
@jagthedrummer
Regarding improving the performance of repeated work, I have other ideas, such as:
- Skip repeated deployment: It is not only the build time; deploying the layer is also slow. So I planned to add a skip deploy layer option, so that the layer is deployed only if the Gemfile has changed.
- Alternatively, we can also cache the gem layer zip itself, as other serverless plugins do.
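One way the "deploy only if the Gemfile has changed" idea could work is to record a digest of the Gemfile files after each deploy and compare it on the next run. The sketch below is in Python for illustration only. The function names and the `.layer_digest` state file are hypothetical, and it assumes Gemfile and Gemfile.lock are the only inputs that affect the layer.

```python
import hashlib
from pathlib import Path


def gemfile_digest(project_dir):
    """SHA-256 over the names and contents of Gemfile and Gemfile.lock, when present."""
    digest = hashlib.sha256()
    for name in ("Gemfile", "Gemfile.lock"):
        path = Path(project_dir) / name
        if path.is_file():
            digest.update(name.encode())
            digest.update(b"\0")
            digest.update(path.read_bytes())
            digest.update(b"\0")
    return digest.hexdigest()


def layer_needs_deploy(project_dir, state_file):
    """True when the current digest differs from the one recorded at the last deploy."""
    state_path = Path(state_file)
    previous = state_path.read_text().strip() if state_path.is_file() else None
    return previous != gemfile_digest(project_dir)


def record_deploy(project_dir, state_file):
    """Store the current digest; meant to be called after a successful deploy."""
    Path(state_file).write_text(gemfile_digest(project_dir) + "\n")
```

A deploy step would call `layer_needs_deploy(".", ".layer_digest")` first and call `record_deploy` only after the layer is published. The same digest could also serve as the file name of a cached layer zip, which would cover the second idea above.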
I like the Docker caching idea, but it does not support non-Docker use cases.
Assuming that most people use the Docker option, we can include this. But we need to see how to implement it without breaking existing use cases.
After releasing the current version, I will come up with a plan to add this without breaking existing use cases, or implement the no-deploy option directly, or offer both options with some config control.