securedrop-workstation

Apt repo config via salt vs qubes-update-check.timer: it's a race!

Open rocodes opened this issue 1 year ago • 2 comments

  • [x] I have searched for duplicates or related issues

Description

Noticed during a CI run that the apt-repo salt config sometimes fails because another process is holding the apt lock: qubes-update-check, which runs 5 minutes after boot. When the two overlap, our provisioning fails. h/t @legoktm for the diagnosis.

Mostly filing so that if other people run into this issue they'll know what's up. I think these conditions are unlikely in real-world provisioning, but not impossible.

Example of failed run
[.snip]
INFO:[2024-06-14-15:54:15:653244]       Function: cmd.run
INFO:[2024-06-14-15:54:15:653271]           Name: apt-get update --allow-releaseinfo-change
INFO:[2024-06-14-15:54:15:653296]         Result: False
INFO:[2024-06-14-15:54:15:653320]        Comment: Command "apt-get update --allow-releaseinfo-change" run
INFO:[2024-06-14-15:54:15:653348]        Started: 15:51:42.318709
INFO:[2024-06-14-15:54:15:653375]       Duration: 888.403 ms
INFO:[2024-06-14-15:54:15:653399]        Changes:   
INFO:[2024-06-14-15:54:15:653436]                 ----------
INFO:[2024-06-14-15:54:15:653461]                 pid:
INFO:[2024-06-14-15:54:15:653489]                     6548
INFO:[2024-06-14-15:54:15:653514]                 retcode:
INFO:[2024-06-14-15:54:15:653546]                     100
INFO:[2024-06-14-15:54:15:653578]                 stderr:
INFO:[2024-06-14-15:54:15:653603]                     E: Could not get lock /var/lib/apt/lists/lock. It is held by process 6257 (apt-get)
INFO:[2024-06-14-15:54:15:653630]                     E: Unable to lock directory /var/lib/apt/lists/
INFO:[2024-06-14-15:54:15:653658]                 stdout:
INFO:[2024-06-14-15:54:15:653682]                     Reading package lists...
INFO:[2024-06-14-15:54:15:653710]   ----------
INFO:[2024-06-14-15:54:15:653734]             ID: autoremove-old-packages
INFO:[2024-06-14-15:54:15:653762]       Function: cmd.run
INFO:[2024-06-14-15:54:15:653790]           Name: apt-get autoremove -y
INFO:[2024-06-14-15:54:15:653814]         Result: False
INFO:[2024-06-14-15:54:15:653841]        Comment: One or more requisite failed: securedrop_salt.fpf-apt-repo.update-apt-cache-with-stable-change
INFO:[2024-06-14-15:54:15:653870]        Started: 15:51:43.207424
INFO:[2024-06-14-15:54:15:653894]       Duration: 0.002 ms
INFO:[2024-06-14-15:54:15:653919]        Changes:   
INFO:[2024-06-14-15:54:15:653947]   ----------
INFO:[2024-06-14-15:54:15:653974]             ID: configure-fpf-apt-repo
INFO:[2024-06-14-15:54:15:653998]       Function: file.managed
INFO:[2024-06-14-15:54:15:654022]           Name: /etc/apt/sources.list.d/apt-test_freedom_press.sources
INFO:[2024-06-14-15:54:15:654047]         Result: False
INFO:[2024-06-14-15:54:15:654074]        Comment: One or more requisite failed: securedrop_salt.fpf-apt-repo.autoremove-old-packages
INFO:[2024-06-14-15:54:15:654099]        Started: 15:51:43.248910
INFO:[2024-06-14-15:54:15:654126]       Duration: 0.009 ms
INFO:[2024-06-14-15:54:15:654153]        Changes:   
INFO:[2024-06-14-15:54:15:654179]   ----------
INFO:[2024-06-14-15:54:15:654206]             ID: upgrade-all-packages
INFO:[2024-06-14-15:54:15:654231]       Function: pkg.uptodate
INFO:[2024-06-14-15:54:15:654258]         Result: False
INFO:[2024-06-14-15:54:15:654285]        Comment: One or more requisite failed: securedrop_salt.fpf-apt-repo.configure-fpf-apt-repo, securedrop_salt.fpf-apt-repo.update-apt-cache-with-stable-change
INFO:[2024-06-14-15:54:15:654310]        Started: 15:51:47.031360
INFO:[2024-06-14-15:54:15:654337]       Duration: 0.006 ms
INFO:[2024-06-14-15:54:15:654361]        Changes:   
INFO:[2024-06-14-15:54:15:654386]   ----------
INFO:[2024-06-14-15:54:15:654424]             ID: install-securedrop-keyring-package
INFO:[2024-06-14-15:54:15:654453]       Function: pkg.installed
INFO:[2024-06-14-15:54:15:654479]         Result: False
INFO:[2024-06-14-15:54:15:654507]        Comment: One or more requisite failed: securedrop_salt.fpf-apt-repo.configure-fpf-apt-repo
INFO:[2024-06-14-15:54:15:654532]        Started: 15:51:47.037090
INFO:[2024-06-14-15:54:15:654560]       Duration: 0.004 ms
INFO:[2024-06-14-15:54:15:654588]        Changes:   
INFO:[2024-06-14-15:54:15:654612]   ----------
INFO:[2024-06-14-15:54:15:654640]             ID: install-securedrop-log-package
INFO:[2024-06-14-15:54:15:654668]       Function: pkg.installed
INFO:[2024-06-14-15:54:15:654692]         Result: False
INFO:[2024-06-14-15:54:15:654719]        Comment: One or more requisite failed: securedrop_salt.fpf-apt-repo.configure-fpf-apt-repo, securedrop_salt.fpf-apt-repo.upgrade-all-packages, securedrop_salt.fpf-apt-repo.update-apt-cache-with-stable-change, securedrop_salt.fpf-apt-repo.autoremove-old-packages, securedrop_salt.fpf-apt-repo.install-securedrop-keyring-package
INFO:[2024-06-14-15:54:15:654747]        Started: 15:51:47.037291
INFO:[2024-06-14-15:54:15:654771]       Duration: 0.002 ms
INFO:[2024-06-14-15:54:15:654798]        Changes:   
INFO:[2024-06-14-15:54:15:654822]   ----------
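
For anyone triaging a similar CI failure, here is a minimal sketch of how the lock-contention line in a log like the one above could be detected. The function name `find_apt_lock_failures` and the regex are illustrative assumptions, not part of securedrop-workstation; the pattern is derived only from the `E: Could not get lock ...` message shown in this log.

```python
import re

# Pattern based on the apt error line in the log above (assumption: other
# apt versions may word this message differently).
LOCK_ERROR = re.compile(
    r"E: Could not get lock (?P<path>\S+)\. "
    r"It is held by process (?P<pid>\d+) \((?P<holder>[^)]+)\)"
)


def find_apt_lock_failures(log_text):
    """Return one dict per apt lock-contention error found in log_text."""
    failures = []
    for line in log_text.splitlines():
        match = LOCK_ERROR.search(line)
        if match:
            failures.append(
                {
                    "path": match.group("path"),
                    "pid": int(match.group("pid")),
                    "holder": match.group("holder"),
                }
            )
    return failures


if __name__ == "__main__":
    sample = (
        "INFO:[2024-06-14-15:54:15:653603]                     "
        "E: Could not get lock /var/lib/apt/lists/lock. "
        "It is held by process 6257 (apt-get)"
    )
    print(find_apt_lock_failures(sample))
```

A non-empty result suggests the run hit this race rather than a genuine repo or network problem; the dependent states that report "One or more requisite failed" are downstream of that single error.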

Steps to Reproduce

Luck of the draw: be in the fpf_apt_repo part of a provisioning or migration/update run 5 minutes after boot.

Comments

  • I did notice we run the apt repo config in each VM. We could instead do it in the template before we clone it. That would save provisioning time and reduce the number of apt commands we run during the salt run. We should probably do this (and also see whether anything else can be refactored along similar lines).
  • I thought about but ruled out the idea of disabling then reenabling qubes-update-check.timer during provisioning--gives me the fear

rocodes, Jun 14 '24 19:06

I thought about but ruled out the idea of disabling then reenabling qubes-update-check.timer during provisioning--gives me the fear

We could disable the Qubes service that corresponds to qubes-update-check as one of the first things we do, since we have our mandatory update policy.

legoktm, Jun 14 '24 21:06

I guess what worried me was the case where the install fails and we somehow leave users without qubes-update-check; that seems more bossy than even our usual bossiness w.r.t. the workstation. (I'm more worried about the edge cases than the straightforward case.) But I'm not dead set against it; I just want to think about it in case there's a less invasive way.

rocodes avatar Jun 14 '24 22:06 rocodes
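
As a purely illustrative take on the "less invasive way" mentioned above, and not something proposed in this thread, the apt call could be retried when it loses the race instead of touching qubes-update-check at all. The sketch below assumes a hypothetical helper named `run_with_lock_retry`; the attempt count and delay are arbitrary placeholders, and the only signal it keys on is the "Could not get lock" text seen in the failed run.

```python
import subprocess
import time

# Substring taken from the apt error in the failed run above.
LOCK_MARKER = "Could not get lock"


def run_with_lock_retry(cmd, attempts=5, delay=10.0,
                        runner=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying only when stderr reports apt lock contention.

    runner and sleep are injectable so the logic can be exercised
    without root or a real apt installation.
    """
    result = None
    for attempt in range(1, attempts + 1):
        result = runner(cmd, capture_output=True, text=True)
        # Success, or a failure unrelated to the lock: stop immediately.
        if result.returncode == 0 or LOCK_MARKER not in result.stderr:
            return result
        # Lock was held by another process; wait before the next attempt.
        if attempt < attempts:
            sleep(delay)
    return result


if __name__ == "__main__":
    # Hypothetical usage; requires root on a Debian-based system.
    outcome = run_with_lock_retry(
        ["apt-get", "update", "--allow-releaseinfo-change"]
    )
    print(outcome.returncode)
```

The trade-off is that a retry only papers over the race and lengthens the run when the timer fires, whereas moving the apt repo config into the template reduces how often the window is open in the first place.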