DeepSea [CI] Stage 3 fails transiently at ceph.packages

[CI] Stage 3 fails transiently at ceph.packages

Open smithfarm opened this issue 5 years ago • 0 comments

Description of Issue/Question

We have a transient CI failure. It looks like this:

2019-02-28T14:18:42.269 INFO:teuthology.orchestra.run.target192168000033:Running: "sudo bash -c 'DEV_ENV=true timeout 60m deepsea --log-file=/var/log/salt/deepsea.log --log-level=debug salt-run state.orch ceph.stage.3 --simple-output'"
2019-02-28T14:18:43.237 INFO:teuthology.orchestra.run.target192168000033.stdout:Starting orchestration: ceph.stage.3
2019-02-28T14:19:31.482 INFO:teuthology.orchestra.run.target192168000033.stdout:Parsing orchestration ceph.stage.3 steps... done
2019-02-28T14:19:31.483 INFO:teuthology.orchestra.run.target192168000033.stdout:
2019-02-28T14:19:31.483 INFO:teuthology.orchestra.run.target192168000033.stdout:Stage initialization output:
2019-02-28T14:19:31.483 INFO:teuthology.orchestra.run.target192168000033.stdout:firewall                 : [1m[92mnot installed[0m
2019-02-28T14:19:31.483 INFO:teuthology.orchestra.run.target192168000033.stdout:apparmor                 : [1m[92mdisabled[0m
2019-02-28T14:19:31.484 INFO:teuthology.orchestra.run.target192168000033.stdout:DEV_ENV                  : [1m[92mTrue[0m
2019-02-28T14:19:31.485 INFO:teuthology.orchestra.run.target192168000033.stdout:fsid                     : [1m[92mvalid[0m
2019-02-28T14:19:31.485 INFO:teuthology.orchestra.run.target192168000033.stdout:public_network           : [1m[92mvalid[0m
2019-02-28T14:19:31.485 INFO:teuthology.orchestra.run.target192168000033.stdout:cluster_network          : [1m[92mvalid[0m
2019-02-28T14:19:31.485 INFO:teuthology.orchestra.run.target192168000033.stdout:cluster_interface        : [1m[92mvalid[0m
2019-02-28T14:19:31.486 INFO:teuthology.orchestra.run.target192168000033.stdout:monitors                 : [1m[92mvalid[0m
2019-02-28T14:19:31.486 INFO:teuthology.orchestra.run.target192168000033.stdout:mgrs                     : [1m[92mvalid[0m
2019-02-28T14:19:31.486 INFO:teuthology.orchestra.run.target192168000033.stdout:storage                  : [1m[92mvalid[0m
2019-02-28T14:19:31.486 INFO:teuthology.orchestra.run.target192168000033.stdout:rgw                      : [1m[92mvalid[0m
2019-02-28T14:19:31.486 INFO:teuthology.orchestra.run.target192168000033.stdout:ganesha                  : [1m[92mvalid[0m
2019-02-28T14:19:31.487 INFO:teuthology.orchestra.run.target192168000033.stdout:master_role              : [1m[92mvalid[0m
2019-02-28T14:19:31.488 INFO:teuthology.orchestra.run.target192168000033.stdout:time_server              : [1m[92mvalid[0m
2019-02-28T14:19:31.488 INFO:teuthology.orchestra.run.target192168000033.stdout:fqdn                     : [1m[92mvalid[0m
2019-02-28T14:19:31.489 INFO:teuthology.orchestra.run.target192168000033.stdout:No minions matched the target. No command was sent, no jid was assigned.
2019-02-28T14:19:31.489 INFO:teuthology.orchestra.run.target192168000033.stdout:
2019-02-28T14:19:33.327 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner select.minions... ok
2019-02-28T14:19:34.004 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner ready.check...
2019-02-28T14:19:34.014 INFO:teuthology.orchestra.run.target192168000033.stdout:         |_  cmd.run_all(/usr/sbin/iptables -S)...
2019-02-28T14:19:34.077 INFO:teuthology.orchestra.run.target192168000033.stdout:               in target192168000033.teuthology... fail
2019-02-28T14:19:34.190 INFO:teuthology.orchestra.run.target192168000033.stdout:         |_  cmd.shell(/usr/sbin/aa-status --enabled 2>/dev/null; echo $?)...
2019-02-28T14:19:34.272 INFO:teuthology.orchestra.run.target192168000033.stdout:               in target192168000033.teuthology... ok
2019-02-28T14:19:34.386 INFO:teuthology.orchestra.run.target192168000033.stdout:         |_  state.apply(ceph.apparmor.profiles, __kwarg__=True, test=true)...
2019-02-28T14:19:34.933 INFO:teuthology.orchestra.run.target192168000033.stdout:               in target192168000033.teuthology... ok
2019-02-28T14:19:35.041 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner ready.check... ok
2019-02-28T14:19:39.010 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner changed.mon... ok
2019-02-28T14:19:39.887 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner changed.osd... ok
2019-02-28T14:19:40.716 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner changed.mgr... ok
2019-02-28T14:19:41.836 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner changed.config... ok
2019-02-28T14:19:42.696 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner changed.client... ok
2019-02-28T14:19:43.598 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner changed.global... ok
2019-02-28T14:19:44.471 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner changed.mds... ok
2019-02-28T14:19:45.317 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner changed.igw... ok
2019-02-28T14:19:46.176 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner select.minions... ok
2019-02-28T14:19:47.067 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner select.minions... ok
2019-02-28T14:19:47.950 INFO:teuthology.orchestra.run.target192168000033.stdout:[init] Executing runner select.minions... ok
2019-02-28T14:19:48.057 INFO:teuthology.orchestra.run.target192168000033.stdout:
2019-02-28T14:19:48.058 INFO:teuthology.orchestra.run.target192168000033.stdout:[1/29] Executing state ceph.time...
2019-02-28T14:19:48.661 INFO:teuthology.orchestra.run.target192168000033.stdout:         target192168000033.teuthology... ok
2019-02-28T14:19:48.843 INFO:teuthology.orchestra.run.target192168000033.stdout:[2/29] Executing state ceph.packages...
2019-02-28T14:20:45.096 INFO:teuthology.orchestra.run.target192168000033.stdout:Stage execution failed:
2019-02-28T14:20:45.096 INFO:teuthology.orchestra.run.target192168000033.stdout:  - packages
2019-02-28T14:20:46.627 ERROR:tasks.deepsea.orch:orchestration ceph.stage.3 failed. Here comes journalctl!

The last item in the zypper log is the output from the '/usr/bin/zypper' '--non-interactive' '--no-refresh' 'install' '--auto-agree-with-licenses' '--name' 'ceph' command which is run by ceph.packages. The last thing this command does is run the %posttrans script for lvm2, which runs dracut to generate a new initrd. Not sure if this is relevant.

Full logs available at http://10.86.0.53:8081/ubuntu-2019-02-28_14:06:27-suse-ses6---basic-openstack/9963/

Examination of the journalctl output just shows that the zypper install ceph seems to complete but nothing else happens because Stage 3 aborts at that point.

So far I have not seen this failure mode in a non-CLI test.

Setup

(Please provide relevant configs and/or SLS files (Be sure to remove sensitive info).)

Steps to Reproduce Issue

(Include debug logs if possible and relevant.)

Versions Report

(Provided by running: salt-run deepsea.version rpm -qi salt-minion rpm -qi salt-master )

Feb 28 '19 15:02 smithfarm

DeepSea DeepSea copied to clipboard

[CI] Stage 3 fails transiently at ceph.packages

Description of Issue/Question

Setup

Steps to Reproduce Issue

Versions Report

DeepSea
DeepSea copied to clipboard