sbt-native-packager
Random LABEL snp-multi-stage-id invalidates docker cache
Expected behaviour
I'm currently writing a second post about Docker cache efficiency with sbt. My main focus is to use the Docker cache in CI environments as much as possible. I have already managed great improvements, but there is a problem with the randomly generated labels:
LABEL snp-multi-stage="intermediate"
LABEL snp-multi-stage-id="44857d33-aef2-4d80-b811-7d1ed9b1891d"
Executing the second command always invalidates the cache, which is not always expected, especially when dockerAutoremoveMultiStageIntermediateImages := false is used.
I suggest using something deterministic, like stage0-(packageName in Docker).value
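Until the plugin itself changes, the suggestion above could be approximated in a project's own build. The following is only a sketch, assuming the plugin emits the id as a single `Cmd("LABEL", ...)` argument that starts with `snp-multi-stage-id=`; the name `stableId` is made up for illustration. It rewrites only the generated Dockerfile, so any cleanup the plugin performs based on its own generated id would presumably no longer match. That makes it mainly relevant when dockerAutoremoveMultiStageIntermediateImages := false is set.

```scala
// build.sbt fragment; assumes sbt-native-packager's DockerPlugin is enabled
import com.typesafe.sbt.packager.docker.Cmd

dockerCommands := {
  // Hypothetical deterministic id, following the suggestion above.
  // Newer sbt versions spell the key as (Docker / packageName).
  val stableId = s"stage0-${(packageName in Docker).value}"

  dockerCommands.value.map {
    // Replace the random id label with the stable one (assumed argument shape)
    case Cmd("LABEL", args @ _*) if args.headOption.exists(_.startsWith("snp-multi-stage-id=")) =>
      Cmd("LABEL", s"""snp-multi-stage-id="$stableId"""")
    // Leave every other Dockerfile command untouched
    case other => other
  }
}
```

With a stable label value, the layer produced by that LABEL step is identical across builds, so Docker can reuse the cached layers that follow it.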
Actual behaviour
Step 1/20 : FROM repo.mycompany.com/team/openjre:8u242 as stage0
[info] ---> a020ce624573
[info] Step 2/20 : LABEL snp-multi-stage="intermediate"
[info] ---> Using cache
[info] ---> 9abeed0b5a9f
[info] Step 3/20 : LABEL snp-multi-stage-id="44857d33-aef2-4d80-b811-7d1ed9b1891d"
[info] ---> Running in 072dfdce55ad
[info] Removing intermediate container 072dfdce55ad
[info] ---> 554d25368747
[info] Step 4/20 : WORKDIR /opt/my-app
[info] ---> Running in 49f3e275dc8c
As you can see, the cache works for the first label but gets invalidated after the second, random label. @mkurz What do you think about a deterministic label?
@ppiotrow I think I can live with a deterministic label. Actually, there was already a discussion about whether we should use a random id (like we do now) or something more deterministic. Please have a look at the comment here and also my answer. As you can see, my main argument was that I want to avoid any side effects if possible. For example, creating an image fails and a user may want to inspect it later; if you then run another build that succeeds, a deterministic label would cause it to also delete the previous build's image, which we actually wanted to keep. However, I think switching to a deterministic label for caching purposes would be an acceptable compromise if the win is much higher, performance- and disk-space-wise. WDYT? Will it be worth it?
I like the existing idea of having two layers: snp-multi-stage to wipe out all intermediate layers from sbt docker builds, and snp-multi-stage-id to handle only the build-specific image.
I don't really follow the argument about inspecting the image later, but that is influenced by my environment. I usually run builds on Docker-in-Docker CI servers, where unpushed images are simply gone. And if I run a build locally, I'd inspect it right after it fails.
From my point of view, the caching capabilities, reproducible (non-random) builds, and simpler unit tests are the better trade-off. I'd like to hear the opinion of someone with a different CI setup.
@ppiotrow Let's just change snp-multi-stage-id to something deterministic. I am fine with that. However, I will not do that work myself; I'm too busy right now.
If you can live without those labels, it's possible to simply remove them as a workaround:
dockerCommands := dockerCommands.value.filter {
case Cmd("LABEL", args @ _*) => args.head.startsWith("snp-multi-stage")
case _ => true
}
Thanks for the workaround! There's just a negation missing, it should be:
case Cmd("LABEL", args @ _*) => !args.head.startsWith("snp-multi-stage")
In general, I believe a deterministic id should be the default. More users care about fast builds than about inspecting their failed builds.
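For reference, the workaround with the negation applied might look like the sketch below. The import and the use of `headOption` (to avoid failing on a LABEL command without arguments) are additions for illustration and not part of the snippets posted in this thread.

```scala
// build.sbt fragment; assumes sbt-native-packager's DockerPlugin is enabled
import com.typesafe.sbt.packager.docker.Cmd

dockerCommands := dockerCommands.value.filter {
  // Drop both snp-multi-stage and snp-multi-stage-id labels; keep other LABELs
  case Cmd("LABEL", args @ _*) => !args.headOption.exists(_.startsWith("snp-multi-stage"))
  // Keep every other Dockerfile command
  case _ => true
}
```

Without these labels in the Dockerfile, label-based cleanup of intermediate images presumably has nothing to match, so this fits best with builds where those images are discarded anyway, such as throwaway CI environments.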