test-plans Re-consider container caching strategy to use registries

Whilst working on the interop-tests and taking a closer look at our container caching in general, I had the following thoughts:

The current caching mechanism introduces a fair bit of complexity:

~ 20 lines of Makefiles per version: https://github.com/libp2p/test-plans/blob/master/multidim-interop/impl/rust/v0.51/Makefile
Custom logic of computing the cache key and up/downloading the cache to S3: https://github.com/libp2p/test-plans/blob/master/multidim-interop/helpers/cache.ts
The above script needs to be manually called: https://github.com/libp2p/test-plans/blob/b8235c9eae35670a9aa4edf62b16119ea506cb2e/.github/actions/run-interop-ping-test/action.yml#L79-L104
We need to manage AWS credentials: https://github.com/libp2p/test-plans/blob/b8235c9eae35670a9aa4edf62b16119ea506cb2e/.github/actions/run-interop-ping-test/action.yml#L39-L43
We need to set custom caching options for each docker build: https://github.com/libp2p/test-plans/blob/master/multidim-interop/dockerBuildWrapper.sh
I am about to duplicate all of the above for the hole-punching tests :sob:

The benefit we are getting from this is that we can pretty much plug any commit of a repository into the Makefile, hit make and we end up with a container. The cache is thus purely a performance optimisation.

Is this worth the complexity? Yesterday, I discovered a subtle bug in our setup that made us not cache a particular Rust image, see https://github.com/libp2p/test-plans/issues/301.

The test runner is already designed to work in phases:

Generate the permutations of test cases
Generate a docker-compose.yml file
Run a particular docker-compose.yml file

If we would use container registries instead, we could delete all of the above code by just referencing image IDs in the versions.ts file.

When debugging code, we could always generate the offending docker-compose.yml file first and swap the container reference out to a point at a Dockerfile instead which would build a local container instead. https://github.com/libp2p/test-plans/issues/282 already hints at this too.

Currently, the "contract" between libp2p/test-plans and the repositories is that the need to provide a Makefile which builds a container. The new contract would be that they need to provide a Dockerfile that builds a functioning version. This would likely be more useful because not every developer has all build-environments of other languages set up.

Have I missed anything? Opinions welcome.

Sep 13 '23 23:09 thomaseizinger

cc @mxinden @achingbrain @marten-seemann

Sep 14 '23 21:09 thomaseizinger

Sorry for the late reply.

Generally I am in favor of the move to container registries for their simplicity.

That said, I don't have the capacity / priority today to change the existing transport interop caching strategy, nor to review all the necessary changes.

Thus, unless we have two people owning this change, I suggest keeping it as is.

What are other people's thoughts?

Oct 16 '23 07:10 mxinden

That said, I don't have the capacity / priority today to change the existing transport interop caching strategy, nor to review all the necessary changes.

Thus, unless we have two people owning this change, I suggest keeping it as is.

I am happy to implement the changes. Review should be trivial as we are just deleting code. Docker-compose implicitly pulls containers so we don't even have to do anything to make it work with registry-hosted images.

Oct 16 '23 11:10 thomaseizinger

When debugging code, we could always generate the offending docker-compose.yml file first and swap the container reference out to a point at a Dockerfile instead which would build a local container instead. #282 already hints at this too.

FWIW, in https://github.com/libp2p/test-plans/pull/304 I adopted this design and always generate the docker-compose.yml file as a build artifact. Thus, it is trivial to modify it to build a container instead.

Oct 16 '23 11:10 thomaseizinger

I think the caching strategy was implemented as-is because building the artefacts required to run the tests was previously very slow - if memory serves it was over 25 minutes on GH CI.

I’m not against any of the proposed changes as long as it doesn’t increase the time taken to run the test suite.

Oct 16 '23 14:10 achingbrain

I’m not against any of the proposed changes as long as it doesn’t increase the time taken to run the test suite.

The idea is to decrease the time even further. At the moment, all images are pulled from the cache in sequence whereas I am pretty sure, docker compose would pull images in parallel.

I think the caching strategy was implemented as-is

From memory, what was important to @MarcoPolo is that everything is reproducible from a given commit. See also the design paragraph at the top in https://github.com/libp2p/test-plans/tree/master/transport-interop#transport-interoperability-tests.

Whilst I think that is a novel goal, it isn't fully true anyway. For example, we still depend on all sorts of software being installed and we don't pin the version of these tools. Referencing docker images by hash gives us similarish guarantees. We can still build a docker container from the exact same Git hash. It might not be bit-for-bit the same container but we also don't have this guarantee at the moment.

Oct 17 '23 00:10 thomaseizinger

test-plans test-plans copied to clipboard

Re-consider container caching strategy to use registries

test-plans
test-plans copied to clipboard