`secex-data` volume hack is subject to races
On non-s390x, we rely on the `--volume=secex-data:/data.secex:ro` switch we pass to podman in `cosa remote-session create` to just create an empty volume. This logic, though, is subject to races if we're creating multiple remote sessions onto the same non-s390x builder:
```
Error: creating named volume "secex-data": adding volume to state: name secex-data is in use: volume already exists
```
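For illustration, a hedged reproduction sketch (the image below is just a stand-in; any image works): two podman invocations started concurrently both see the named volume missing and each tries to create it implicitly, so the loser can fail with the error above.

```bash
# Hypothetical reproduction sketch: start two containers in parallel that
# both reference the not-yet-existing named volume. Each podman process
# tries to create "secex-data" implicitly; one of them may lose the race
# and fail with "volume already exists".
podman volume rm --force secex-data 2>/dev/null || true
podman run --rm --volume=secex-data:/data.secex:ro \
    registry.fedoraproject.org/fedora:latest true &
podman run --rm --volume=secex-data:/data.secex:ro \
    registry.fedoraproject.org/fedora:latest true &
wait
```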
(In this case, this happened in the `bump-lockfile` job, which often gets executed in parallel for `testing-devel` and `next-devel`.)
Probably the simplest fix for this is to have the volume created at provisioning time. That way it's consistent with s390x too.
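A minimal sketch of that, assuming provisioning runs exactly once per builder (so the check-then-create below is not itself racy):

```bash
# Create the empty secex-data volume idempotently at provisioning time,
# before any `cosa remote-session create` can race to create it.
podman volume exists secex-data || podman volume create secex-data
```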
I wonder if there is a race condition in podman itself here that they would be interested to fix.
While this does seem like a podman issue, as a workaround for now we could apply the same method we use on the s390x builder to keep the volume from being garbage collected in the first place. This would go against the point of using the volume this way, which was to avoid doing configuration on other builders that is only really needed on one arch. But it should prevent this issue, as the volume would always be available. https://github.com/coreos/fedora-coreos-pipeline/blob/dbc32b2a96f26e42e4eb4dbf1220e43ecea0bdd2/multi-arch-builders/coreos-s390x-rhcos-builder.bu#L99-L118
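Roughly, and hedging since this paraphrases the linked unit rather than quoting it: pre-create the volume and keep a never-started container referencing it, since `podman volume prune` only removes volumes that no container references. The container name and image below are placeholders.

```bash
# Pre-create the volume, then pin it with a placeholder container so
# garbage collection (`podman volume prune`) never sees it as unused.
podman volume exists secex-data || podman volume create secex-data
podman container exists secex-data-pin || \
    podman create --name secex-data-pin \
        --volume secex-data:/data.secex:ro \
        registry.fedoraproject.org/fedora:latest true
```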
We'll be able to remove that "keep from being garbage collected" debt very soon, as the fix for https://github.com/containers/podman/issues/17051 is in podman 4.5.0, which is already in `next` and will be in `testing` and `stable` within a month. So we could put off fixing this until we drop that tech debt.
@jschintag want to confirm the fix for https://github.com/containers/podman/issues/17051 works as you expected?
I tested it and it works.
@jschintag is the original issue described by this ticket still a problem?
I mean, as this is for non-s390x architectures, I would say yes, this could still happen. I haven't heard anything about the race condition being fixed in podman. Did we ever even create an issue over at https://github.com/containers/podman/issues for this?