
minikube flakes

Open edsantiago opened this issue 2 years ago • 3 comments

No useful diagnostics:

# X Exiting due to RUNTIME_ENABLE: Failed to enable container runtime: sudo systemctl restart cri-docker.socket: Process exited with status 1

Smells like the quay flakiness, but there's nothing to go on. Tests should probably be instrumented to run journalctl, minikube logs, and anything else that could give a user some hints.
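One way the instrumentation suggested above could look, as a minimal sketch only: a helper that runs a list of diagnostic commands, labels each one with its exit status, and never fails itself, so the diagnostics cannot mask the original error. The function name dump_diagnostics is hypothetical, and which commands are useful (and whether journalctl needs to run on the host or inside the minikube guest) is an assumption the thread does not settle.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the existing test suite.
# Runs every command passed as an argument, prints its combined
# stdout/stderr prefixed with '#', and always returns 0.
dump_diagnostics() {
    local cmd out rc
    for cmd in "$@"; do
        # Capture output and exit status without tripping 'set -e'.
        out=$(bash -c "$cmd" 2>&1) && rc=0 || rc=$?
        echo "# ---- $cmd (exit status $rc)"
        if [ -n "$out" ]; then
            printf '%s\n' "$out" | sed -e 's/^/#   /'
        fi
    done
    return 0
}

# Possible call from a failure handler (command list is an assumption):
#   dump_diagnostics \
#       "minikube logs" \
#       "journalctl --no-pager -b"
```

Each entry is passed as a single string so that it can carry its own arguments; the '#' prefix keeps the extra output visually in line with the rest of the test log.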

  • fedora-39 : minikube podman fedora-39 rootless host sqlite
    • PR #21908
      • 03-01 13:52 in [minikube] minikube - check cluster is up
    • PR #21862
      • 02-29 14:12 in [minikube] [001] minikube - deploy generated container yaml to minikube
    • PR #21601
      • 02-28 14:12 in [minikube] minikube - check cluster is up
minikube(3) podman(3) fedora-39(3) rootless(3) host(3) sqlite(3)

edsantiago, Mar 04 '24 12:03

Weird that it is already cri-docker.socket that fails; you would expect the failure to show up only at cri-docker.service.

https://github.com/Mirantis/cri-dockerd/tree/master/packaging/systemd

afbjorklund, Mar 04 '24 13:03

Caught one:

<+010ms> # $ minikube kubectl -- apply -f /tmp/minikube_deploy_SEeITt.yaml
<+593ms> # pod/test-ctr-pod created
         #
<+023ms> # $ minikube kubectl get pods
<+266ms> # NAME           READY   STATUS              RESTARTS   AGE
         # test-ctr-pod   0/1     ContainerCreating   0          0s
....
<+1.03s> # $ minikube kubectl get pods
<+232ms> # NAME           READY   STATUS         RESTARTS   AGE
         # test-ctr-pod   0/1     ErrImagePull   0          18s        <<<<<<<<<<<<<--------------------------------
....
<+1.03s> # $ minikube kubectl get pods
<+265ms> # NAME           READY   STATUS             RESTARTS   AGE
         # test-ctr-pod   0/1     ImagePullBackOff   0          30s       <<<<<<<<<<<<------------------
....
         # #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
         # #| FAIL: Timed out waiting for pod to move to 'Running' state
         # #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

"ErrImagePull" smells to me like quay flake. Anyone know for sure?
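A hedged sketch of how the test could gather evidence for that question: pull the STATUS column for the pod out of the "get pods" table and, when it indicates an image-pull failure, dump the pod description, whose Events section normally records which image and registry the pull failed on. The helper names pod_status and is_image_pull_failure are hypothetical; the pod name is taken from the log above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the existing test suite.

# Print the STATUS column for pod $1, reading a 'kubectl get pods'
# table (NAME READY STATUS RESTARTS AGE) on stdin.
pod_status() {
    awk -v pod="$1" '$1 == pod { print $3 }'
}

# Succeed if the given status is one of the image-pull failure states
# seen in the log above.
is_image_pull_failure() {
    case "$1" in
        ErrImagePull|ImagePullBackOff) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use inside the polling loop (sketch only):
#   status=$(minikube kubectl -- get pods | pod_status test-ctr-pod)
#   if is_image_pull_failure "$status"; then
#       minikube kubectl -- describe pod test-ctr-pod
#   fi
```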

edsantiago, Apr 02 '24 19:04

I believe I hit this using a "new" F40 image (still under test), or is it possibly a new flake?

https://api.cirrus-ci.com/v1/artifact/task/5131719219609600/html/minikube-podman-fedora-40-rootless-host-sqlite.log.html

The output seems similar to a previous (above) hit on:

You are using the QEMU driver without a dedicated network, which doesn't support minikube service & minikube tunnel commands.

I don't think that test should be trying to use QEMU, but maybe that's a red herring? In any case, I re-ran the task and it passed.

cevich, May 06 '24 17:05

This is starting to compete with #22551 for the Most Annoying Flake award.

  • fedora-39 : minikube podman fedora-39 rootless host sqlite
    • PR #22389
    • PR #22370
      • 04-15-2024 14:03 in [minikube] [001] minikube - deploy generated container yaml to minikube
    • PR #22171
    • PR #22140
      • 03-22-2024 14:07 in [minikube] [001] minikube - deploy generated container yaml to minikube
      • 03-22-2024 14:05 in [minikube] [001] minikube - deploy generated container yaml to minikube
    • PR #22082
      • 03-18-2024 22:04 in [minikube] [001] minikube - deploy generated container yaml to minikube
    • PR #22081
    • PR #21979
    • PR #21960
      • 03-06-2024 10:43 in [minikube] [001] minikube - deploy generated container yaml to minikube
    • PR #21908
    • PR #21862
      • 02-29-2024 14:12 in [minikube] [001] minikube - deploy generated container yaml to minikube
    • PR #21601
  • fedora-40 : minikube podman fedora-40 rootless host sqlite
    • PR #22715
      • 05-15 07:03 in [minikube] minikube - check cluster is up
    • PR #22673
      • 05-11 14:48 in [minikube] minikube - check cluster is up
    • PR #22662
      • 05-13 04:19 in [minikube] minikube - check cluster is up
      • 05-10 13:32 in [minikube] minikube - check cluster is up
      • 05-09 17:24 in [minikube] minikube - check cluster is up
    • PR #22658
      • 05-13 16:43 in [minikube] minikube - check cluster is up
      • 05-13 16:03 in [minikube] minikube - check cluster is up
    • PR #22549
      • 05-06 16:01 in [minikube] minikube - check cluster is up
minikube(20) podman(20) fedora-39(12) fedora-40(8) rootless(20) host(20) sqlite(20)

edsantiago, May 15 '24 12:05

Maybe worth asking Urvashi to take a look? IIRC she wrote these tests, and might have a quick/easy answer.

cevich, May 21 '24 18:05

FWIW, I attempted to reproduce this in a hack/get_ci_vm.sh environment, painstakingly copy-pasting the commands in the code path one by one. This worked perfectly fine for both "minikube - check cluster is up" and "minikube - deploy generated container yaml to minikube". I was hoping to get lucky and have it reproduce for me, given how often it seemingly breaks :cry: So I'm giving up.

Ref: https://github.com/containers/podman/pull/23237
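For anyone retrying the reproduction, a single manual pass may simply not be enough to hit an intermittent failure. A minimal sketch of a loop that reruns a command until it first fails is below; the function name run_until_failure and the script name in the usage comment are hypothetical, and nothing here is taken from the actual CI scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to $1 times and stop at the
# first failure, reporting which iteration failed.
run_until_failure() {
    local max=$1
    shift
    local i
    for ((i = 1; i <= max; i++)); do
        if ! "$@"; then
            echo "failed on iteration $i"
            return 1
        fi
    done
    echo "no failure in $max iterations"
    return 0
}

# Possible use (the script name is a placeholder, not a real file):
#   run_until_failure 50 ./run-one-minikube-test.sh
```

Stopping at the first failure leaves the environment in the failed state, which is the point at which logs and service status are worth collecting.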

cevich, Jul 09 '24 18:07