
Refactor Dockerfile and speed up build times via buildkit caches and layers

Open · jippi opened this issue 1 year ago • 22 comments

👋 Hello!

Context

I noticed that in both mastodon/mastodon and the glitch-soc fork, builds can sometimes take hours, usually spending that time on work that could be effectively cached.

Since I work with CI/CD and have spent way too many cycles of my life on Dockerfile optimizations, I thought I would take a shot at improving it for Mastodon, since it would make my own Mastodon server's build times faster too 🎉 win/win.

I'm by no means a Ruby on Rails or Bundler expert, so I might not have aced the usage entirely (I did learn a bit from external resources on how to achieve the caching outcome, though).

Changes

I fully recognize (and apologize for) the diff being very hard to read, given the amount of changes, so will do my best to outline changes at a high level here.

Caching

  • Utilize the RUN --mount=type=cache to cache apt-get update and apt-get install operations
  • Utilize the RUN --mount=type=cache to cache bundle install
  • Utilize the RUN --mount=type=cache to cache yarn install
  • Ensure apt caches aren't automatically cleaned by removing the apt clean-up script from the base layer (see the sketch after this list)
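
Roughly, the cache mounts look like this (the mount targets and the package name below are common defaults/placeholders, not necessarily the exact lines from the PR):

# syntax=docker/dockerfile:1

# apt: keep downloaded packages and lists between builds; the docker-clean
# config that normally wipes them is removed first
RUN rm -f /etc/apt/apt.conf.d/docker-clean
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends build-essential

# bundler: cache downloaded gem archives (default gem home in the official ruby image)
RUN --mount=type=cache,target=/usr/local/bundle/cache \
    bundle install

# yarn: point yarn's cache folder at a cache mount so packages survive rebuilds
RUN --mount=type=cache,target=/tmp/yarn-cache \
    yarn install --frozen-lockfile --cache-folder /tmp/yarn-cache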

Other

  • Configure the timezone via TZ (can have a huge positive impact on performance!)
  • Sorted packages alphabetically so they are easier to scan
  • Slightly reformatted some Dockerfile statements to make them easier to scan
  • Exposed some additional ENV values as ARGs so they can be changed on demand (useful during testing)
  • Restructured the layers into multiple specialized ones (e.g. bundle and yarn); the layering allows them to run in parallel rather than sequentially (see the sketch after this list)
  • Added comments to most statements in the Dockerfile, with external references (docs) and the reason for the statement to be there (hopefully useful for future maintainers)
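
As a rough illustration of the layering idea (base images, versions and paths here are placeholders, not the actual PR):

FROM ruby:3.2-slim AS bundler
WORKDIR /opt/mastodon
COPY Gemfile Gemfile.lock ./
RUN bundle install

FROM node:20-slim AS yarn
WORKDIR /opt/mastodon
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile

# BuildKit can build the two stages above concurrently; the final stage only
# copies in their outputs
FROM ruby:3.2-slim AS final
WORKDIR /opt/mastodon
COPY --from=bundler /usr/local/bundle /usr/local/bundle
COPY --from=yarn /opt/mastodon/node_modules ./node_modules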

Docker image sizes

These are the resulting image sizes from (1) building this PR, (2) building main, and (3) pulling the currently published image:

jippi/mastodon              latest    b41bea14ec78   32 seconds ago      1.18GB
jippi/mastodon-main         latest    edfb04bd73e9   34 minutes ago      1.22GB
ghcr.io/mastodon/mastodon   latest    97e47539355f   28 hours ago        1.42GB

Docker build times

Measured on an M1 Pro w/ 16GB RAM

this branch with no caches: 158.6s
this branch with no bundle caches, warm yarn: 104.2s
this branch with hot caches: 2.9s

jippi avatar Jul 07 '23 19:07 jippi

Intruding here a bit since I also spent quite some time yesterday trying to optimize build time. How will --mount=type=cache improve build times on CI/CD systems, where each run occurs on a fresh instance with no traces from previous runs?

nadiamoe avatar Jul 07 '23 19:07 nadiamoe

@jippi Thanks for your PR. We intend to re-work our Dockerfile soon, and most probably rewrite it pretty much entirely after 4.2.0 is released, as the build chain will change a lot (no need to have Node in the final image for example).

I will try to have a look at your PR in the coming days (weeks?), but you are not the first one to have attempted this, and improvements to the build process have been hard to merge until now.

Intruding here a bit since I also spent quite some time yesterday trying to optimize build time. How will --mount=type=cache improve build times on CI/CD systems, where each run occurs on a fresh instance with no traces from previous runs?

If we use this in the Github action:

cache-from: type=gha
cache-to: type=gha,mode=max

Then Docker caches can be reused between actions.
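
In a workflow step using docker/build-push-action, that looks roughly like this (a sketch, not our actual workflow file):

- uses: docker/setup-buildx-action@v2
- uses: docker/build-push-action@v4
  with:
    context: .
    push: true
    tags: ghcr.io/mastodon/mastodon:latest
    cache-from: type=gha
    cache-to: type=gha,mode=max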

renchap avatar Jul 07 '23 19:07 renchap

@roobre as @renchap mentioned, https://github.com/moby/buildkit#github-actions-cache-experimental will ensure the layer caching is durable in CI :)

jippi avatar Jul 07 '23 19:07 jippi

@renchap Ah okay - my changes should make the rework you propose very easy to achieve, since everything Node-related is nicely isolated in an intermediate build target, so the final output layer can just use a different "source" than the one currently in place.

The most "noisy" part of the PR is actually splitting things up so each layer can be more "specialized" and run in parallel without stepping on the others. (Which is also why the image from my changes is ~200MB lighter than the one from the main Dockerfile - that was just a nice side effect of those changes.)

Without knowing the full details of your rewrite, I think my PR would make iterating on and modifying it like you mention significantly easier.

I would be happy to help adjust those if you want to collaborate.

jippi avatar Jul 07 '23 19:07 jippi

From what we are seeing in the builds, the most costly step (by far) is assets precompiling, and in particular running Webpack.

I am not sure why but it seems to be taking 2+ hours on CI.

I suspect that this is because we are building 2 archs on CI, and building arm64 on an amd64 worker is really not efficient.

If you want to have a further look at this feel free, you should be able to get all the logs from Github Actions.

renchap avatar Jul 07 '23 20:07 renchap

@renchap I need to figure out how to run the pipeline in my fork this weekend. I ended up caching some webpack and sprockets data as well, for the heck of it, which seemed to speed those steps up a bit when making trivially small changes - and make them a complete NOOP when no changes happened.

The CPU emulation is horribly slow, especially on the default small CPU allocations of the free tier. Stepping up one or two sizes has a significant positive impact on the emulated CPU performance.

I do wonder if it wouldn't be possible to build the assets only on amd64 and just copy them into the arm64 image - the assets should be entirely identical as far as the browser is concerned. Maybe something I can dig into optimizing, if you would want that :)

jippi avatar Jul 07 '23 20:07 jippi

I do wonder if it wouldn't be possible to build the assets only on amd64 and just copy them into the arm64 image - the assets should be entirely identical as far as the browser is concerned. Maybe something I can dig into optimizing, if you would want that :)

Yes, the assets should be the same. We indeed should be able to run COPY --from=… for the assets, but then it means we have a different Dockerfile for each architecture, which I don't really like.

We could also build the 2 archs in parallel, but then you need to pull both of them and push them together to the registry, as we want the same tag but with 2 different archs.

If it helps, we can also consider having an ARM runner to build those images, but then we need to coordinate building on multiple runners and then doing a single push…

renchap avatar Jul 07 '23 20:07 renchap

you shouldn't need two different dockerfiles, no? just potentially a multi-stage build with the webpack step occurring in the build platform arch, and the final copying happening in the target platform arch (might be misunderstanding something, though)

davidlougheed avatar Jul 07 '23 20:07 davidlougheed

you shouldn't need two different dockerfiles, no? just potentially a multi-stage build with the webpack step occurring in the build platform arch, and the final copying happening in the target platform arch (might be misunderstanding something, though)

Can you use multiple archs in the same Dockerfile? I did not know this!

renchap avatar Jul 07 '23 20:07 renchap

yep, this is exactly what I'm thinking - trying that out now :)

jippi avatar Jul 07 '23 20:07 jippi

@renchap here's an example I've worked on where the webpack build step happens on the native arch and the final stage happens on the target arch: https://github.com/bento-platform/bento_web/blob/master/Dockerfile
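
The core of the pattern is pinning the asset-build stage to the build platform and only copying its (architecture-independent) output into the target-platform stage - roughly like this (stage names, images and commands here are illustrative, not the linked file):

# runs on the builder's native architecture, regardless of the requested target platform
FROM --platform=$BUILDPLATFORM node:20-slim AS assets
WORKDIR /opt/mastodon
COPY . .
RUN yarn install --frozen-lockfile \
 && yarn build   # stand-in for the real asset precompile step

# built for the requested target platform; only the finished assets are copied in
FROM ruby:3.2-slim AS final
COPY --from=assets /opt/mastodon/public/packs /opt/mastodon/public/packs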

davidlougheed avatar Jul 07 '23 20:07 davidlougheed

At $DayJob we just configured remote buildx connections for each architecture, so buildx in the CI job would coordinate the build steps across two native CPU architectures remotely - but it doesn't seem like GitHub Actions makes that easy to do 😭

jippi avatar Jul 07 '23 20:07 jippi

I think that you can do it using https://github.com/docker/setup-buildx-action - they have a section about configuring external builders.

But if the Webpack build is done only on the native platform, I suspect the build times will be much much better.

renchap avatar Jul 07 '23 21:07 renchap

This pull request has merge conflicts that must be resolved before it can be merged.

github-actions[bot] avatar Jul 07 '23 21:07 github-actions[bot]

cache-from: type=gha
cache-to: type=gha,mode=max

Then Docker caches can be reused between actions.

My understanding is that this works well for caching layers across runs, but I am not 100% sure about cache mounts. Has this been verified?

But if the Webpack build is done only on the native platform, I suspect the build times will be much much better.

This is possible but would unfortunately require some copy-pasting within the Dockerfile. I was playing with exactly this yesterday, and was planning to open a PoC PR tomorrow. I'd happily do so if you think it would be useful before the Dockerfile refactor for 4.2.0 you mentioned :)

nadiamoe avatar Jul 07 '23 22:07 nadiamoe

Honestly, the best way to solve the slow speeds is to either pick a CI provider that provides native arm64/amd64, or increase the GitHub runner size to get more CPU cores (4 or 8) on the builders for this workload.

The hit of a cold cache will likely still run into "hours" of build time on the small 2-core CI runners GitHub provides for free. And that feels unacceptably slow to me, even if a warm cache hit might be "fast".

I tried, for the heck of it, to build the image on my Raspberry Pi and it wildly outperformed my Apple M1 Pro (emulating amd64, on 8 cores).

jippi avatar Jul 07 '23 23:07 jippi

or increase the GitHub runner size to get more CPU cores (4 or 8) on the builders for this workload.

I don't think this will help a lot. Asset compilation is (for reasons unknown to me, I'm not a ruby ecosystem connoisseur) single-threaded, so at most you may be parallelizing some other steps but the bottleneck won't go away.

pick a CI provider that provides native arm64/amd64

This would be the way, but will require additional effort and maintainer time.

I have a WIP patch that runs the asset build step on the native arch, at the cost of complicating the Dockerfile a bit. This brings the time down from ~1 hour to 11 minutes (both measured on my workstation), with fully clean buildx caches.

I'll keep testing this to see if it works, and if it does I'll open a PR with the patch. I think the mount cache changes would be very much additive to my changes, so there's more to gain :)

nadiamoe avatar Jul 08 '23 13:07 nadiamoe

More CPU will absolutely help to some degree: the two cores provided by the basic runners are pretty tight with webpack, sprockets, qemu and other stuff running, especially considering buildkit runs several of these in parallel. 4 cores would be great, and from there on, higher CPU frequency all the way - ideally :)

With that being said, the upper 90% of the build time is ARM asset compilation; any and all other modifications would pale in comparison.

Amazing work with the Dockerfile refactor - I dabbled a bit with an ARM native build last night, but it doesn't look like GitHub Actions provides any native arm64/aarch64 runners at all - or maybe I missed it? So that path would require an external vendor to be looped in (likely with an associated cost?)

If we have native runners, then it's pretty easy to get buildx to do all the hard work of building on the right runner arch out of the box, with little to no modifications to the Dockerfile itself.

jippi avatar Jul 08 '23 13:07 jippi

Asset compilation is (for reasons unknown to me, I'm not a ruby ecosystem connoisseur) single-threaded, so at most you may be parallelizing some other steps but the bottleneck won't go away.

Asset compilation (at least the webpack/webpacker part) is using Webpacker 4, and should scale with the number of cores. If this is not the case, then we may have a webpack configuration issue.

You can run this step specifically with NODE_ENV=production RAILS_ENV=production yarn exec ./bin/webpack

it doesn't look like GitHub Actions provides any native arm64/aarch64 runners at all - or maybe I missed it?

They don't.

If we have native runners, then it's pretty easy to get buildx to do all the hard work of building on the right runner arch out of the box, with little to no modifications to the Dockerfile itself

We should be able to run an ARM64 server on Hetzner with 4 CPUs/8 GB RAM, if this would help build those steps on a native-arch runner.

I never dabbled in this with buildx - how does it work? You configure the runner and specify the arch, and then it is able to automatically run the steps for this platform on this runner, copying everything that is needed seamlessly?

renchap avatar Jul 08 '23 13:07 renchap

Hetzner is a great choice for this - I didn't know they had ARM hosts now :)

re buildx: yes, basically that - there are some details on it here: https://github.com/docker/buildx#building-multi-platform-images

It's a slight learning curve, but fairly well documented and robust once it's working - and you can test all of it from your local device by running the commands in your local Docker context, so iteration speed is pretty good for early adoption.
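
Roughly, registering one native builder per architecture and then building across them looks like this (hostnames and the tag are placeholders):

docker buildx create --name native --platform linux/amd64 ssh://user@amd64-host
docker buildx create --name native --append --platform linux/arm64 ssh://user@arm64-host

docker buildx build --builder native \
  --platform linux/amd64,linux/arm64 \
  -t ghcr.io/example/mastodon:test --push .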

Having dedicated remote builders would do wonders for layer and mount caching too - it would be a significant speed increase, probably reducing the total build time by 85-95%, easily.

jippi avatar Jul 08 '23 13:07 jippi

The layer caching works reasonably well (without mount caching) - a NOOP commit ran end-to-end in 11s on GitHub runners https://github.com/jippi/mastodon/actions/runs/5494777769/jobs/10013761774

jippi avatar Jul 08 '23 13:07 jippi

Asset compilation (at least the webpack/webpacker part) is using Webpacker 4, and should scale with the number of cores. If this is not the case, then we may have a webpack configuration issue.

Running rails assets:precompile --verbose --trace yields the following, with Execute webpacker:compile taking >95% of the time - which does indeed seem to be webpack, so "more CPUs" would certainly help there. On my local test env I see 3-4 of the 8 cores pegged (native, no CPU emulation) for ~20-30s:

** Invoke assets:precompile (first_time)
** Invoke assets:environment (first_time)
** Execute assets:environment
** Invoke environment (first_time)
** Execute environment
** Invoke yarn:install (first_time)
** Execute yarn:install
yarn install v1.22.19
[1/6] Validating package.json...
[2/6] Resolving packages...
success Already up-to-date.
Done in 0.38s.
** Execute assets:precompile
** Invoke webpacker:compile (first_time)
** Invoke webpacker:verify_install (first_time)
** Invoke webpacker:check_node (first_time)
** Execute webpacker:check_node
** Invoke webpacker:check_yarn (first_time)
** Execute webpacker:check_yarn
** Invoke webpacker:check_binstubs (first_time)
** Execute webpacker:check_binstubs
** Execute webpacker:verify_install
** Invoke environment 
** Execute webpacker:compile
Compiling...
Compiled all packs in /opt/mastodon/public/packs
`isModuleDeclaration` has been deprecated, please migrate to `isImportOrExportDeclaration`
    at isModuleDeclaration (/opt/mastodon/node_modules/@babel/types/lib/validators/generated/index.js:2740:35)
    at PluginPass.Program (/opt/mastodon/node_modules/babel-plugin-lodash/lib/index.js:102:44)

** Invoke assets:generate_static_pages (first_time)
** Invoke assets:environment 
** Execute assets:generate_static_pages

jippi avatar Jul 08 '23 14:07 jippi

@jippi Thanks a lot for your work on this.

Do you think you can optimise the build further? If you think this is ready for review, could you clean up the PR so it is ready for merge?

renchap avatar Jul 16 '23 20:07 renchap

A few suggestions on image size.

Before:

950M /opt/mastodon/

  1. Asset source files can be removed after the build, as they are compiled into the public directory and not needed in a production deployment:
  • /opt/mastodon/node_modules: dependencies, only needed during the webpack build, ~425Mb
  • /opt/mastodon/app/javascript: source assets, only needed during the webpack build, ~18Mb
  • /opt/mastodon/tmp: assets build leftovers: 3Mb
  • /opt/mastodon/yarn*/: yarn configuration and log: 1Mb
  2. Bundled gems can be cleaned up:
  • find /opt/mastodon/vendor/bundle/ruby/*/gems/ -name "*.o": native extension object files, 128Mb
  • find /opt/mastodon/vendor/bundle/ruby/*/gems/ -name "*.c": native extension source files, 6Mb
  • find /opt/mastodon/vendor/bundle/ruby/*/cache/ -name "*.gem": gem source cache, 20Mb
  3. /opt/mastodon/specs: tests are not needed in production: 11Mb
  4. /tmp: general temporary files: 31Mb

After:

257M /opt/mastodon/

In total that's 693MB, which is a massive bandwidth saving.
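
In the image build, that cleanup could look roughly like this, run after the assets are compiled (a sketch using the paths listed above, not an actual patch):

# remove build-only sources and native-extension leftovers
RUN rm -rf /opt/mastodon/node_modules /opt/mastodon/app/javascript /opt/mastodon/tmp /tmp/* \
 && find /opt/mastodon/vendor/bundle/ruby/*/gems/ -name '*.o' -delete \
 && find /opt/mastodon/vendor/bundle/ruby/*/gems/ -name '*.c' -delete \
 && find /opt/mastodon/vendor/bundle/ruby/*/cache/ -name '*.gem' -delete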

y8 avatar Jul 22 '23 14:07 y8

Oh, I completely missed that streaming is a Node Express app, but its dependencies are part of the main package.json.

It makes sense for it to have its own streaming/package.json. I did some quick research by extracting its dependencies:

{
  "name": "@mastodon/streaming",
  "main": "index.js",
  "license": "AGPL-3.0-or-later",
  "scripts": {
    "start": "node index.js"
  },
  "engines": {
    "node": ">=16"
  },
  "dependencies": {
    "dotenv": "^16.0.3",
    "express": "^4.18.2",
    "jsdom": "^22.1.0",
    "npmlog": "^7.0.1",
    "pg": "^8.5.0",
    "pg-connection-string": "^2.6.0",
    "redis": "^4.6.5",
    "uuid": "^9.0.0",
    "ws": "^8.12.1"
  }
}

After npm install I got

20M streaming/node_modules/

So the streaming server dependencies are just 20Mb, compared to 425Mb for the frontend part. It seems like only a few changes are required:

  • move dependencies to streaming/package.json
  • remove dependencies from ./package.json
  • change start script definition to node ./streaming
  • invoke npm install during the build process

I'm not familiar with the Mastodon build chain, so I'm not sure where the last step should be added.
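
For illustration, that last step could look something like this in the image build (a sketch; the exact place in the Dockerfile is an assumption):

# drop the root package.json so npm resolves against streaming/package.json only,
# then install the streaming server's own (much smaller) dependency set
RUN rm /opt/mastodon/package.json \
 && cd /opt/mastodon/streaming \
 && npm install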

y8 avatar Jul 22 '23 16:07 y8

One more thought: asset precompilation is architecture-agnostic. Only the gems and the streaming server have native extensions that need to be built for a specific arch.

@jippi can this fact be leveraged somehow during the build process?

y8 avatar Jul 22 '23 21:07 y8

It makes sense for it to have its own streaming/package.json. I did some quick research by extracting its dependencies:

This is our plan, once 4.2.0 is released.

renchap avatar Jul 23 '23 15:07 renchap

I did some quick research based on this pull request: https://github.com/y8/mastodon/commit/024d596da3c97ae105e06b319fc45466335f32db

Raw image size: -687Mb (~39% smaller)

y8/mastodon              small       2d850894ace1   2 hours ago         1.01GB
jippi/mastodon           small       e52416b297b5   About an hour ago   1.68GB

Compressed image: -217Mb (~30%)

y8/mastodon:small       480.01 MB
jippi/mastodon:small    697.52 MB

Moving streaming dependencies was quite easy:

  1. Add dependencies to ./streaming/package.json

  2. Remove the dependencies from the main package.json using yarn remove

  3. Add an extra step to the image build process to install dependencies after the assets are built and node_modules is removed:

    • remove ./package.json, otherwise npm will try to use it to resolve dependencies
    • call npm install inside the ./streaming/ folder

I'm not sure about the last step; maybe there is a better way to handle dependencies for Node.js apps.

y8 avatar Jul 23 '23 16:07 y8

There is already a PR for splitting the streaming server: https://github.com/mastodon/mastodon/pull/24702

As said above, we will work on merging this once 4.2.0 is released (in a couple of weeks hopefully).

renchap avatar Jul 23 '23 16:07 renchap

Oh, sorry - the browser tab had been open since yesterday and I hadn't seen your comment about #24702. That's great news!

With that in mind, I've tried my best to follow @jippi's idea of layer organization, without affecting the streaming server, while cleaning gems and removing asset sources after precompilation. Here is a pull request: https://github.com/jippi/mastodon/pull/1

Steps taken:

  1. Removed an unnecessary caching step in bundle-layer
  2. Introduced a new intermediate cleanup-layer where artifacts from all previous layers are copied and then cleaned
  3. Introduced a layer for the future streaming server build process, which allows building it in parallel with the assets. It's a NOOP right now and doesn't affect the current build process
  4. node_modules is still copied to cleanup-layer, but now as a separate step that can easily be removed once #24702 is merged
  5. final-layer now copies the slimmed-down version of /opt/mastodon from cleanup-layer to ensure the final image is as small as possible

This way, minimal changes are required after #24702 is merged.
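
In Dockerfile terms, the cleanup-layer idea boils down to something like this (stage names here are illustrative, not the actual ones):

FROM build AS cleanup
# strip build-only leftovers from the tree assembled in the previous stages
RUN rm -rf /opt/mastodon/tmp /opt/mastodon/vendor/bundle/ruby/*/cache

FROM base AS final
# only the slimmed-down application tree ends up in the shipped image
COPY --from=cleanup /opt/mastodon /opt/mastodon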

Even with node_modules, it still saves ~470MB (~27%) on the uncompressed image:

y8/mastodon               without-gem-cache      8185a5799a55   8 minutes ago    1.21GB
jippi/mastodon            latest                 e52416b297b5   30 minutes ago   1.68GB

And ~180Mb (~25%) on the compressed image:

y8/mastodon           without-gem-cache          516.77 MB
jippi/mastodon        latest                     697.52 MB

After #24702 is merged this can be further reduced to ~990MB uncompressed and 470MB compressed.

y8 avatar Jul 23 '23 19:07 y8