
`secex-data` volume hack is subject to races

Open jlebon opened this issue 1 year ago • 6 comments

On non-s390x, we rely on the `--volume=secex-data:/data.secex:ro` switch we pass to `podman` in `cosa remote-session create` to just create an empty volume. This logic, though, is subject to races if we're creating multiple remote sessions onto the same non-s390x builder:

Error: creating named volume "secex-data": adding volume to state: name secex-data is in use: volume already exists

(In this case, this happened in the bump-lockfile job, which often gets executed in parallel for testing-devel and next-devel.)

Probably the simplest fix for this is to have it created at provisioning time. That way it's consistent with s390x too.
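A minimal sketch of what creating the volume at provisioning time might look like, assuming a small shell helper is acceptable there. The `ensure_volume` name is hypothetical and not part of the pipeline; it only uses `podman volume create` and `podman volume exists`. It attempts the creation and, if that fails (for example because another process created the volume first), falls back to checking that the volume is present:

```shell
#!/bin/sh
# Hypothetical helper (not from the pipeline): make sure a named podman
# volume exists, tolerating a concurrent creator.
ensure_volume() {
    name="$1"
    # Try to create the volume. If another process won the race, creation
    # fails with "volume already exists", so fall back to an existence check.
    # The function succeeds if either command succeeds.
    podman volume create "$name" >/dev/null 2>&1 || podman volume exists "$name"
}

# Example call at provisioning time:
#   ensure_volume secex-data
```

Running this once per builder before any remote session starts would mean the `--volume=secex-data:/data.secex:ro` switch always finds an existing volume rather than creating one implicitly.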

jlebon avatar Mar 13 '23 16:03 jlebon

I wonder if there is a race condition in podman itself here that they would be interested to fix.

dustymabe avatar Apr 17 '23 17:04 dustymabe

While this does seem like a podman issue, as a workaround for now we could apply the same method we use on the s390x builder to keep the volume from being garbage collected. This would defeat the purpose of using the volume in the first place, which is to avoid doing configuration on other builders that is only really needed on one arch. But it should prevent this issue, as the volume would always be available. https://github.com/coreos/fedora-coreos-pipeline/blob/dbc32b2a96f26e42e4eb4dbf1220e43ecea0bdd2/multi-arch-builders/coreos-s390x-rhcos-builder.bu#L99-L118

jschintag avatar Apr 19 '23 08:04 jschintag

We'll be able to remove that "keep from being garbage collected" debt very soon, as the fix for https://github.com/containers/podman/issues/17051 is in podman 4.5.0, which is already in next and will be in testing and stable within a month. So we could put off fixing this until we drop that tech debt.

@jschintag want to confirm the fix for https://github.com/containers/podman/issues/17051 works as you expected?

dustymabe avatar Apr 20 '23 03:04 dustymabe

I tested it and it works.

jschintag avatar Apr 20 '23 08:04 jschintag

@jschintag is the original issue described by this ticket still an issue?

dustymabe avatar Jul 20 '23 14:07 dustymabe

I mean, as this is for non-s390x architectures, I would say yes, this could still happen. I have not heard anything about the race condition being fixed in podman. Did we ever even create an issue over at https://github.com/containers/podman/issues for this?

jschintag avatar Jul 21 '23 07:07 jschintag