Installation fails due to lingering disposable VMs

Open kushaldas opened this issue 4 years ago • 4 comments

I rebooted many times trying to get past this step while running make all, and it still fails even after make clean:

----------
          ID: sd-default-mgmt-dvm-fedora-version-halt-wait
    Function: cmd.run
        Name: sleep 5
      Result: True
     Comment: Command "sleep 5" run
     Started: 11:31:40.601024
    Duration: 5033.73 ms
     Changes:   
              ----------
              pid:
                  24254
              retcode:
                  0
              stderr:
              stdout:
----------
          ID: sd-default-mgmt-dvm-fedora-version-update
    Function: qvm.vm
        Name: default-mgmt-dvm
      Result: False
     Comment: An exception occurred in this state: Traceback (most recent call last):
                File "/usr/lib/python2.7/site-packages/salt/state.py", line 1837, in call
                  **cdata['kwargs'])
                File "/usr/lib/python2.7/site-packages/salt/loader.py", line 1794, in wrapper
                  return f(*args, **kwargs)
                File "/var/cache/salt/minion/extmods/states/ext_state_qvm.py", line 434, in vm
                  status = globals()[action](name, *_varargs, **keywords)
                File "/var/cache/salt/minion/extmods/states/ext_state_qvm.py", line 300, in prefs
                  return _state_action('qvm.prefs', name, *varargs, **kwargs)
                File "/var/cache/salt/minion/extmods/states/ext_state_qvm.py", line 144, in _state_action
                  status = __salt__[_action](*varargs, **kwargs)
                File "/var/cache/salt/minion/extmods/modules/ext_module_qvm.py", line 933, in prefs
                  setattr(args.vm, dest, value_new)
                File "/usr/lib/python2.7/site-packages/qubesadmin/base.py", line 281, in __setattr__
                  str(value).encode('utf-8'))
                File "/usr/lib/python2.7/site-packages/qubesadmin/base.py", line 68, in qubesd_call
                  payload_stream)
                File "/usr/lib/python2.7/site-packages/qubesadmin/app.py", line 577, in qubesd_call
                  return self._parse_qubesd_response(return_data)
                File "/usr/lib/python2.7/site-packages/qubesadmin/base.py", line 102, in _parse_qubesd_response
                  raise exc_class(format_string, *args)
              QubesVMInUseError: Cannot change template while there are DispVMs based on this qube
     Started: 11:31:45.636932
    Duration: 94.978 ms
     Changes:   
----------
          ID: sd-default-mgmt-dvm-fedora-version-start
    Function: qvm.start
        Name: default-mgmt-dvm
      Result: False
     Comment: One or more requisite failed: sd-sys-vms.sd-default-mgmt-dvm-fedora-version-update
     Changes:   

Summary for local
-------------
Succeeded: 19 (changed=9)
Failed:     2

Finally I just gave up, removed those two lines from the .sls file and finished make all.

kushaldas, May 28 '20 14:05

Summarizing discussion in standup today, @emkll & @kushaldas discussed this out of band. The root cause appears to be lingering management VMs, likely from a cancelled or otherwise interrupted provisioning run. @kushaldas confirmed that lingering management VMs were present on his system, and will report back with results to confirm or deny whether manually cleaning up those lingering VMs resolves the issue (we suspect it will).

conorsch, May 28 '20 16:05

@emkll as you suggested, if I delete every VM whose name starts with disp, the process continues properly. I think we should add this as part of the salt run.

kushaldas, May 29 '20 14:05
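
For reference, the manual workaround described above could be sketched roughly as follows. This is only an illustration, not the project's provisioning code: the helper names (find_lingering_disp_vms, list_vm_names, remove_lingering_disp_vms) are hypothetical, and it assumes it is run with python3 in dom0, where the qvm-ls, qvm-kill and qvm-remove tools are available. Matching on the disp prefix is taken from the comment above; it would also match any non-disposable VM that happens to be named that way, so the sketch defaults to a dry run.

```python
import subprocess


def find_lingering_disp_vms(vm_names, prefix="disp"):
    """Return the VM names that start with `prefix`, preserving order."""
    return [name for name in vm_names if name.startswith(prefix)]


def list_vm_names():
    """List all VM names via `qvm-ls --raw-list` (one name per line).

    Assumption: this runs in dom0, where qvm-ls is on the PATH.
    """
    out = subprocess.check_output(
        ["qvm-ls", "--raw-list"], universal_newlines=True
    )
    return [line.strip() for line in out.splitlines() if line.strip()]


def remove_lingering_disp_vms(dry_run=True):
    """Kill and remove every VM whose name starts with "disp".

    With dry_run=True (the default) nothing is changed; the candidate
    names are only printed and returned.
    """
    targets = find_lingering_disp_vms(list_vm_names())
    for name in targets:
        if dry_run:
            print("would remove:", name)
            continue
        # Stop the VM first. A disposable VM may already be gone after
        # this step, so failures of either command are tolerated here.
        subprocess.call(["qvm-kill", name])
        subprocess.call(["qvm-remove", "--force", name])
    return targets
```

Calling remove_lingering_disp_vms() first shows which VMs would be affected; calling it again with dry_run=False performs the cleanup, after which make all can be retried.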

Retitled for clarity. Still seems like a worthwhile potential improvement, but unlikely to impact production installs that follow the docs, so low priority.

eloquence, Feb 09 '21 20:02

From backlog pruning:

  • need to see if this is still true
  • this may change with some of our provisioning changes
  • could be user-hostile if system gets into a totally unusable state, so we should document recovery steps at minimum (reboot?), if this is still true

rocodes, Mar 21 '24 17:03