feat: Improve controller Dockerfile caching

Open timraymond opened this issue 1 year ago • 2 comments

There's no reason to repeat steps like fetching eBPF compilation dependencies, rebuilding eBPF programs, and fetching Go dependencies on every build. Most of these inputs change infrequently, so they deserve to be aggressively cached in Docker layers. The parts that are more likely to change then get a much faster build.

This entailed consolidating some of the intermediate image sprawl into a single "bins" image so that cache layers could be reused.
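
As a rough illustration of the layering idea (not the actual Dockerfile from this change), the ordering might look like the sketch below. The base image tag, package names, and the ./cmd/controller path are assumptions made up for the example; only the "bins" stage name comes from the description above.

```dockerfile
# Hypothetical sketch: paths, packages, and image tags are illustrative,
# not taken from the Retina repository.
FROM golang:1.22 AS bins
WORKDIR /src

# 1. Rarely changes: toolchain needed to compile eBPF programs.
RUN apt-get update \
    && apt-get install -y --no-install-recommends clang llvm \
    && rm -rf /var/lib/apt/lists/*

# 2. Changes occasionally: copy only the module manifests, so the
#    download layer is reused until go.mod or go.sum changes.
COPY go.mod go.sum ./
RUN go mod download

# 3. Changes often: the source tree. After an ordinary code edit,
#    only the layers from this point down are rebuilt.
COPY . .
RUN CGO_ENABLED=0 go build -o /out/controller ./cmd/controller

# Final image copies the binary out of the single "bins" stage.
FROM scratch
COPY --from=bins /out/controller /controller
ENTRYPOINT ["/controller"]
```

Ordering instructions from least to most frequently changed inputs is what lets an edit to Go source skip the toolchain and module-download layers entirely.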

timraymond avatar Jul 18 '24 13:07 timraymond

@timraymond can you resolve the merge conflict so we can get this in?

nddq avatar Aug 08 '24 20:08 nddq

@nddq Thanks for the heads up. I wish the merge queue would ping the author when there are actionable problems to fix.

timraymond avatar Aug 09 '24 17:08 timraymond

@timraymond This would be very useful to everyone - can you resolve the conflict and merge?

rectified95 avatar Aug 21 '24 02:08 rectified95

Ack. Will prioritize

timraymond avatar Aug 21 '24 20:08 timraymond

github says these commits are from a timposter

rbtr avatar Aug 22 '24 23:08 rbtr

@rbtr argh, good catch. New environment and didn't have commit.gpgSign set in the fresh clone. Done.

timraymond avatar Aug 23 '24 15:08 timraymond

RCA of build failures found. Just need an equivalent for Windows.

timraymond avatar Aug 26 '24 15:08 timraymond

@timraymond can you attach the before and after build time?

nddq avatar Aug 26 '24 17:08 nddq

Before (lukewarm cache): make retina-image BUILDX_ACTION=--load 3.04s user 1.30s system 2% cpu 2:34.11 total

Before (hot cache): make retina-image BUILDX_ACTION=--load 3.05s user 1.44s system 3% cpu 2:12.67 total

After (lukewarm cache): make retina-image BUILDX_ACTION=--load 2.70s user 0.95s system 8% cpu 45.541 total

After (hot cache): make retina-image BUILDX_ACTION=--load 2.69s user 1.16s system 51% cpu 7.474 total

timraymond avatar Aug 26 '24 17:08 timraymond

Building the retina agent image takes 60s (another 60s for the operator). Is your build time 2.70s or 45s?

rectified95 avatar Aug 26 '24 17:08 rectified95

@rectified95 That's the output from time, so the total wall-clock time is the rightmost value. It's about 45s if you need to fetch Go modules. If I only change some log messages around (i.e. just the Go logic), the rebuild takes about 10s.
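
To make the comparison concrete, here is a small Go sketch that converts the "total" field from each run into seconds and computes the speedup. It assumes the field is formatted as [h:]m:s with colon separators, which is how the numbers above appear; the function names wallSeconds and speedup are made up for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// wallSeconds converts a wall-clock "total" field such as "2:34.11"
// or "45.541" into seconds. Fields are colon-separated, most
// significant first (assumed layout: [h:]m:s).
func wallSeconds(total string) (float64, error) {
	var secs float64
	for _, part := range strings.Split(total, ":") {
		v, err := strconv.ParseFloat(part, 64)
		if err != nil {
			return 0, fmt.Errorf("parse %q: %w", total, err)
		}
		secs = secs*60 + v
	}
	return secs, nil
}

// speedup reports how many times faster the "after" run is than the
// "before" run, based on wall-clock time only.
func speedup(before, after string) (float64, error) {
	b, err := wallSeconds(before)
	if err != nil {
		return 0, err
	}
	a, err := wallSeconds(after)
	if err != nil {
		return 0, err
	}
	if a == 0 {
		return 0, fmt.Errorf("after time is zero")
	}
	return b / a, nil
}

func main() {
	// Totals copied from the before/after measurements in this thread.
	runs := []struct{ name, before, after string }{
		{"lukewarm cache", "2:34.11", "45.541"},
		{"hot cache", "2:12.67", "7.474"},
	}
	for _, r := range runs {
		s, err := speedup(r.before, r.after)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: %.1fx faster\n", r.name, s)
	}
}
```

The user and system figures are CPU time spent by the make process itself, so they stay near 3s in every run and say little about how long the image build actually took.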

timraymond avatar Aug 26 '24 17:08 timraymond