Consolidate COPY operations into a single layer
What are you trying to accomplish?
Builds on work like https://github.com/dependabot/dependabot-core/pull/13638 and consolidates the copy operations behind a builder
How will you know you've accomplished your goal?
Existing tests pass, we create less number of layers for the image
Checklist
- [x] I have run the complete test suite to ensure all tests and linters pass.
- [x] I have thoroughly tested my code changes to ensure they work as expected, including adding additional tests for new functionality.
- [x] I have written clear and descriptive commit messages.
- [x] I have provided a detailed description of the changes in the pull request, including the problem it addresses, how it fixes the problem, and any relevant details about the implementation.
- [x] I have ensured that the code is well-documented and easy to understand.
Hey @jeffwidman
What do you think about this approach?
I understand we don’t want to consolidate just for the sake of it, but this feels like a neat trick.
The idea came to me while working on #13515. I believe the added complexity is justified, and the extra builder shouldn’t increase build time since we need the base image to be pulled for the final image anyway. 👀
If this looks good, we can apply a similar change to the other ecosystems (or ask Copilot to generate them).
You might want to also take a look at https://github.com/dependabot/dependabot-core/pull/13608 for inspiration... anything that improves the base image will pay dividends across all the ecosystem-level images.
To be clear though, our API call count has dropped drastically after:
- https://github.com/dependabot/dependabot-core/pull/13637
So at this point, I don't think dropping layers just for the sake of dropping layers/API calls is a big deal for us as much.
My personal priorities for how to structure these images in order from higher to lower:
- image size in bytes... fewer bytes is faster all the way around from network to
dockerto the work the Github actions VM has to do... - image performance in
docker pull... fewer layers is nice there as it's simpler fordockerto assemble the image... - image build times - this doesn't matter for the production jobs, but it does matter for all the dev CI pipelines
- image readability
- layer count resulting in fewer API calls to the registry... as long as we're not egregious, I see this as a tie with readability
- layer caching in development... we can always hack this in temporarily while debugging something if we need to keep rebuilding images.
Anyway, I def do think the "builder image" idea has legs... if done right, I'd expect a very slight increase in CI build times but a decrease in docker pull / using the image in production.
But I'd def try to error on benchmarking... no need to get super detailed on it, but just look at the timestamps at CI actions runs, or inspect the built image to see layer counts / byte sizes.
Anyway, just some thoughts...
You might want to also take a look at #13608 for inspiration... anything that improves the base image will pay dividends across all the ecosystem-level images.
To be clear though, our API call count has dropped drastically after:
* [Streamline 34 `COPY` commands into one #13637](https://github.com/dependabot/dependabot-core/pull/13637)So at this point, I don't think dropping layers just for the sake of dropping layers/API calls is a big deal for us as much.
My personal priorities for how to structure these images in order from higher to lower:
* image size in bytes... fewer bytes is faster all the way around from network to `docker` to the work the Github actions VM has to do... * image performance in `docker pull`... fewer layers is nice there as it's simpler for `docker` to assemble the image... * image build times - this doesn't matter for the production jobs, but it does matter for all the dev CI pipelines * image readability * layer count resulting in fewer API calls to the registry... as long as we're not egregious, I see this as a tie with readability * layer caching in development... we can always hack this in temporarily while debugging something if we need to keep rebuilding images.Anyway, I def do think the "builder image" idea has legs... if done right, I'd expect a very slight increase in CI build times but a decrease in
docker pull/ using the image in production.But I'd def try to error on benchmarking... no need to get super detailed on it, but just look at the timestamps at CI actions runs, or inspect the built image to see layer counts / byte sizes.
Anyway, just some thoughts...
Thanks for the feedback and explanation. A few notes:
* image size in bytes... fewer bytes is faster all the way around from network to `docker` to the work the Github actions VM has to do...
Good call, I checked this out too. For example, #13670 was a small change but still a good improvement in size. I’ll look for more chances to do the same. In Maven specifically, I have a few ideas.
* image performance in `docker pull`... fewer layers is nice there as it's simpler for `docker` to assemble the image...
For what it's worth, this pull request removes two layers, though it does introduce additional build complexity depending on how you measure that. Whether that trade-off is acceptable is ultimately up to us to decide.
* image build times - this doesn't matter for the production jobs, but it does matter for all the dev CI pipelines
As you noted, It’s difficult to measure this precisely without historical build metrics, but based on the available builds, this pull request does not appear to have increased build times. In fact, it may have reduced it, though this is hard to prove because Docker caching and other CI-related factors make timing hard. We would need to collect build metrics over a longer period of time to confirm
This pull request(28s): https://github.com/dependabot/dependabot-core/actions/runs/20105340914/job/57687508934 A previous pull request(1m 48s): https://github.com/dependabot/dependabot-core/actions/runs/20091067563/job/57638497981
@jeffwidman In general, what do you think of the merits of this change? Do you think that it should go in?
Or more formally, any chance you can review it and accept/reject it?
I think that finding a review for this will be hard if not you 😃