
Automatically remove old, unneeded update files from TemplateVMs to save disk space

Open ninavizz opened this issue 4 years ago • 21 comments

The problem you're addressing (if any)

Disk space inflation in templates from old installer files has been an ongoing problem in the SecureDrop Workstation. While Marek has been lovely and attentive as always in commenting on our issues, and we're likely to implement our own workaround shortly, this issue is to implement a fix for Qubes users who are not using our specific deployment of it and who may well encounter the same problem.

Describe the solution you'd like

From @marmarek in SDW #653:

I think it should be easy enough to add an autoremove call to the update.qubes-vm salt file. I don't see a proper abstraction for it in the pkg state, so it will likely need a cmd.run state.

See also https://github.com/QubesOS/qubes-issues/issues/6676#issuecomment-2082475532

Where is the value to a user, and who might that user be?

Savvy users know to check for bloat from accumulated installer files that a young product (Qubes) may not yet have automation in place to clean up. Less savvy users (me!), not so much. Always cheering for the less savvy users, I am.

Describe alternatives you've considered

  • Tears.
  • Improved messaging that points to installation failures from disk space problems (so, #6590).

Related, non-duplicate issues

Maybe #1639?

ninavizz avatar Jun 08 '21 00:06 ninavizz

(@ninavizz, the original title made it sound like this was about the Qubes OS installer, but reading the description made it clear that it's about disk usage from performing routine updates inside of TemplateVMs, so I've updated the title in a way that hopefully makes it more intuitive to those who are familiar with this topic.)

andrewdavidwong avatar Jun 08 '21 01:06 andrewdavidwong

I'm guessing this would only be implemented for officially-supported templates, so I've added the Debian and Fedora labels, rather than just C: templates.

Come to think of it, I'm not sure if anything special needs to be done for Fedora templates, unlike Debian, which requires something like autoremove, as already noted in https://github.com/freedomofpress/securedrop-workstation/issues/653. The funny thing is, dnf also has dnf autoremove, but I never remember using it or hearing anyone talk about it, whereas people seem to use and talk about apt[-get] autoremove all the time.

andrewdavidwong avatar Jun 08 '21 01:06 andrewdavidwong

On Mon, Jun 07, 2021 at 06:13:41PM -0700, Andrew David Wong wrote:

I'm guessing this would only be implemented for officially-supported templates, so I've added the Debian and Fedora labels (rather than just C: templates).

Come to think of it, I'm not sure if anything special needs to be done for Fedora templates, unlike Debian, which requires something like autoremove, as already noted in https://github.com/freedomofpress/securedrop-workstation/issues/653. The funny thing is, dnf also has dnf autoremove, but I never remember using it or hearing anyone talk about it, whereas people seem to use and talk about apt[-get] autoremove all the time.

For Debian, this is a duplicate issue, and it has already been resolved in 4.1: #5266 (closed).

unman avatar Jun 08 '21 11:06 unman

This appears to be a duplicate of an existing issue. If so, please comment on the appropriate existing issue instead. If you believe this is not really a duplicate, please leave a comment briefly explaining why. We'll be happy to take another look and, if appropriate, reopen this issue. Thank you.

andrewdavidwong avatar Jun 08 '21 20:06 andrewdavidwong

Closed as duplicate on the assumption that the Debian fix is all that's desired. If there's demand for more, please comment.

andrewdavidwong avatar Jun 08 '21 20:06 andrewdavidwong

@unman @andrewdavidwong

In our experience, hitting diskspace limits is the most common cause of update breakage among the ~10 users we are currently supporting.

It looks like https://github.com/QubesOS/qubes-core-agent-linux/pull/218 will make it unnecessary to run sudo apt clean in Debian templates, but is that change fully sufficient to autoremove packages that are no longer needed as part of an update run? As an example, in a template I just cleaned up, 200 MB were cleaned up via sudo apt clean (which I understand will no longer be needed), and 800 MB were cleaned up via sudo apt autoremove.

eloquence avatar Jul 22 '21 19:07 eloquence

@eloquence:

It looks like QubesOS/qubes-core-agent-linux#218 will make it unnecessary to run sudo apt clean in Debian templates, but is that change fully sufficient to autoremove packages that are no longer needed as part of an update run? As an example, in a template I just cleaned up, 200 MB were cleaned up via sudo apt clean (which I understand will no longer be needed), and 800 MB were cleaned up via sudo apt autoremove.

I had the same thought earlier, which is why I specifically mentioned autoremove in https://github.com/QubesOS/qubes-issues/issues/6676#issuecomment-856364936. Since @unman specifically quoted that message when saying that it's already been resolved (and since he has probably forgotten more about Debian than I'll ever know :slightly_smiling_face:), I figured that this base was covered. However, it's worth waiting to see what @unman says to get explicit confirmation on this point.

andrewdavidwong avatar Jul 24 '21 07:07 andrewdavidwong

On Sat, Jul 24, 2021 at 12:36:27AM -0700, Andrew David Wong wrote:

@eloquence:

It looks like QubesOS/qubes-core-agent-linux#218 will make it unnecessary to run sudo apt clean in Debian templates, but is that change fully sufficient to autoremove packages that are no longer needed as part of an update run? As an example, in a template I just cleaned up, 200 MB were cleaned up via sudo apt clean (which I understand will no longer be needed), and 800 MB were cleaned up via sudo apt autoremove.

I had the same thought earlier, which is why I specifically mentioned autoremove in https://github.com/QubesOS/qubes-issues/issues/6676#issuecomment-856364936. Since @unman specifically quoted that message when saying that it's already been resolved (and since he has probably forgotten more about Debian than I'll ever know :slightly_smiling_face:), I figured that this base was covered. However, it's worth waiting to see what @unman says to get explicit confirmation on this point.

This is rather a different case from the issue.

I would not want to see an automatic "autoremove", for a simple reason: despite the best efforts of Debian developers, autoremove is prone to suggesting removal of core packages, e.g. removal of X. That's why the option presents a list of packages to be removed.

In my experience it's never a good idea to automatically autoremove. If you want to do this for your users you can, of course, do so, but prepare for breakage.

You probably know that if you enable Unattended-Upgrades, then you can configure auto removal:

Unattended-Upgrade::Remove-Unused-Dependencies "true";

We don't do this by default.

I would prefer to increase the size of the template system disk, if this were an issue, and educate users in monitoring their own systems. As always I'm prepared to be persuaded, but in this case it would take a lot: the risk/reward balance swings the wrong way.

Or it would be possible to set crucial packages as "manual installed" before an automatic autoremove - all Qubes packages, X, what else? (I don't like this solution.)
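A different way to guard against the breakage described above, without marking anything as manually installed, is to simulate the autoremove first and refuse to proceed if a crucial package is on the removal list. The following is only a minimal sketch: the pattern list is a hypothetical placeholder (which packages count as crucial is exactly the open question), and it assumes the text comes from `apt-get --simulate autoremove`, which prints each planned removal as a line starting with `Remv`.

```python
import re

# Hypothetical patterns for packages that must never be removed automatically.
# The real list ("all Qubes packages, X, what else?") is not settled in this thread.
PROTECTED_PATTERNS = [r"^qubes-", r"^xserver-xorg", r"^xorg$"]


def parse_simulated_removals(simulation_output):
    """Extract package names from `apt-get --simulate autoremove` output.

    Planned removals appear as lines such as:
    Remv linux-image-5.10.0-8-amd64 [5.10.46-4]
    """
    removals = []
    for line in simulation_output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "Remv":
            removals.append(parts[1])
    return removals


def protected_removals(removals, patterns=PROTECTED_PATTERNS):
    """Return the planned removals whose names match a protected pattern."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [pkg for pkg in removals if any(rx.search(pkg) for rx in compiled)]


def safe_to_autoremove(simulation_output, patterns=PROTECTED_PATTERNS):
    """True only if no protected package would be removed."""
    removals = parse_simulated_removals(simulation_output)
    return not protected_removals(removals, patterns)
```

A check like this only catches packages that match the list. It does nothing for the other failure mode raised in this thread, where a package the user relies on was pulled in as a dependency and is no longer required by anything.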

unman avatar Jul 27 '21 11:07 unman

I agree automatic apt-get autoremove sounds risky, especially if it wasn't this way before (the change itself may break a lot of stuff, until users get used to it). But automatic apt-get clean should be safe.

-- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab

marmarek avatar Jul 27 '21 12:07 marmarek

The problem is that automatic clean only saves a fraction of the disk space compared to autoremove, and it's unrealistic to expect most users to manually monitor and micromanage autoremoving for every Debian template. Out of curiosity, why isn't this a problem on Fedora, and why can't core packages be protected from automatic autoremoving?

andrewdavidwong avatar Jul 28 '21 02:07 andrewdavidwong

The problem is that automatic clean only saves a fraction of the disk space compared to autoremove, and it's unrealistic to expect most users to manually monitor and micromanage autoremoving for every Debian template. Out of curiosity, why isn't this a problem on Fedora, and why can't core packages be protected from automatic autoremoving?

On Fedora, automatic removal is the default, so presumably bugs caused by it are found and fixed.

DemiMarie avatar Jul 28 '21 02:07 DemiMarie

For the SecureDrop Workstation right now, the problem is pretty major - I just freed up 1.1GB in a template with another autoremove run. With the default template size it's easy to see how you can quickly run out of space, breaking updates. I don't think it's reasonable to ask ordinary end users to handle such template diskspace cleanup tasks.

In this case I just observed, it was mostly due to Linux kernel packages that are no longer needed; since we ship our own grsec kernel, it's possible that we can do more on our end to enforce removal of older packages.

Alternatively, could it make sense to expose this as an option at the template level, so that e.g. the Salt update logic can query that property, and run autoremove after each update iff it's set? Basically, a way for users and Qubes implementers to say to the updater, "when updating this template, also try to remove unneeded packages automatically".
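As a rough illustration of the opt-in idea, the update logic could build its command list from the template's properties and append autoremove only when the flag is set. This is a sketch under assumptions, not how the Salt update state works today: the feature name `updates-autoremove` is made up for this example, and the flag is treated as a string where `"1"` means on.

```python
# Hypothetical feature name; nothing in this thread confirms such a feature exists.
AUTOREMOVE_FEATURE = "updates-autoremove"


def update_commands(template_features):
    """Build the apt command lines for one Debian template update run.

    `template_features` is a plain dict standing in for the template's
    per-VM settings. An unset or empty value is treated as "off".
    """
    commands = [
        ["apt-get", "update"],
        ["apt-get", "-y", "dist-upgrade"],
        ["apt-get", "clean"],
    ]
    if template_features.get(AUTOREMOVE_FEATURE, "") == "1":
        # Opt-in only: templates without the flag keep today's behavior.
        commands.append(["apt-get", "-y", "autoremove"])
    return commands
```

Keeping the default off means existing templates behave exactly as before, and deployments such as SecureDrop Workstation could set the flag on the templates they manage.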

eloquence avatar Jul 28 '21 03:07 eloquence

On the one hand, I can understand how this is a Debian problem and not a Qubes-specific problem. We're a tiny project compared to Debian, so it wouldn't be realistic for us to try to fix their problems on top of all of our own. On the other hand, it's clearly a Qubes-specific problem if Debian template updates are breaking while baremetal Debian updates (presumably) aren't. So, why do Qubes Debian template updates break when baremetal Debian updates don't? Is it simply a matter of disk space? If so, can we at least increase the default Debian template disk space enough to make update breakage a non-issue? If forced to choose between wasting disk space and updates breaking, I'd choose the former.

andrewdavidwong avatar Jul 28 '21 03:07 andrewdavidwong

On the one hand, I can understand how this is a Debian problem and not a Qubes-specific problem. We're a tiny project compared to Debian, so it wouldn't be realistic for us to try to fix their problems on top of all of our own. On the other hand, it's clearly a Qubes-specific problem if Debian template updates are breaking while baremetal Debian updates (presumably) aren't. So, why do Qubes Debian template updates break when baremetal Debian updates don't? Is it simply a matter of disk space? If so, can we at least increase the default Debian template disk space enough to make update breakage a non-issue? If forced to choose between wasting disk space and updates breaking, I'd choose the former.

Increasing disk space will help, and thanks to thin provisioning it won’t be used unless it is actually needed. That said, we should make sure we have the correct dependencies to keep autoremove from breaking a template completely ― even if, per Debian policy, those dependencies should not be needed.

DemiMarie avatar Jul 28 '21 04:07 DemiMarie

Removing core packages should not be a problem, since dependencies are correctly set (otherwise they wouldn't be installed in the first place). I'm more afraid of removing packages that the user uses but that were initially installed only as a side effect and are no longer needed as dependencies. This happened to me more than once.

Maybe we can do something about kernel packages specifically? Fedora has a feature specifically to avoid piling up kernel packages (there is a limit of 3 of them at once). Any idea how? The naive apt-get autoremove "linux-image-*" tried to remove all of them, not only unused ones...

marmarek avatar Jul 28 '21 09:07 marmarek

Maybe we can do something about kernel packages specifically? Fedora has a feature specifically to avoid piling up kernel packages (there is a limit of 3 of them at once). Any idea how? The naive apt-get autoremove "linux-image-*" tried to remove all of them, not only unused ones...

That isn’t as bad as it could be, since a user can use a dom0-provided kernel to recover.

In this case I just observed, it was mostly due to Linux kernel packages that are no longer needed; since we ship our own grsec kernel, it's possible that we can do more on our end to enforce removal of older packages.

How do you manage to do this? I thought Open Source Security banned public distribution of its patched kernels (by refusing to provide future updates to anyone who distributes the source).

DemiMarie avatar Jul 28 '21 09:07 DemiMarie

That isn’t as bad as it could be, since a user can use a dom0-provided kernel to recover.

Yes, but also, if you want to use an in-VM kernel, removing it at every single update is definitely bad.

marmarek avatar Jul 28 '21 11:07 marmarek

On Tue, Jul 27, 2021 at 09:16:23PM -0700, Demi Marie Obenour wrote:

On the one hand, I can understand how this is a Debian problem and not a Qubes-specific problem. We're a tiny project compared to Debian, so it wouldn't be realistic for us to try to fix their problems on top of all of our own. On the other hand, it's clearly a Qubes-specific problem if Debian template updates are breaking while baremetal Debian updates (presumably) aren't. So, why do Qubes Debian template updates break when baremetal Debian updates don't? Is it simply a matter of disk space? If so, can we at least increase the default Debian template disk space enough to make update breakage a non-issue? If forced to choose between wasting disk space and updates breaking, I'd choose the former.

Increasing disk space will help, and thanks to thin provisioning it won't be used unless it is actually needed. That said, we should make sure we have the correct dependencies to keep autoremove from breaking a template completely, even if, per Debian policy, those dependencies should not be needed.

Increasing disk space could help. The issue isn't whether we have correct dependencies but whether all the packages that are used have the correct dependencies (and even there autoremove can break a system). We can't rely on users reviewing the proposed removals - we should be able to, but you only have to look at reports from Arch and Bullseye users to see they don't.

I don't see many reports of Qubes users breaking their templates - so either they are able to manage their systems (as most Debian users do), or the current space allocation is adequate. We do recommend using autoremove after a dist-upgrade - there it makes sense. In the ordinary case, I remain convinced that the risk/reward balance is against it.

In the particular case of kernels and headers, it would be possible to whip up a script to identify the latest package, and remove the rest. (dpkg-query with some gawk will do it: we can't use uname because for most users it won't identify in-qube kernels.)
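A rough sketch of that kernel-pruning idea, written in Python rather than gawk. It assumes the package names come from something like `dpkg-query -W -f='${Package}\n' 'linux-image-*' 'linux-headers-*'`, filtered down to packages that are actually installed, and it assumes stock Debian naming such as `linux-image-5.10.0-8-amd64`. Custom kernels with other naming schemes would need their own pattern. The function names and the `keep` parameter are illustrative only.

```python
import re

# Matches stock Debian names such as linux-image-5.10.0-8-amd64 or
# linux-headers-5.10.0-8-common. Meta-packages like linux-image-amd64 do not match.
KERNEL_RE = re.compile(r"^linux-(?:image|headers)-(\d+)\.(\d+)\.(\d+)-(\d+)-")


def kernel_abi(package):
    """Return the kernel ABI as a tuple of ints, or None if the name has no version."""
    match = KERNEL_RE.match(package)
    return tuple(int(part) for part in match.groups()) if match else None


def removable_kernel_packages(packages, keep=1):
    """List versioned kernel and header packages older than the newest `keep` ABIs."""
    if keep < 1:
        raise ValueError("keep must be at least 1 so that one kernel always remains")
    versioned = {pkg: kernel_abi(pkg) for pkg in packages if kernel_abi(pkg)}
    # Compare as integer tuples so that ABI 10 sorts after ABI 9.
    newest = sorted(set(versioned.values()), reverse=True)[:keep]
    return sorted(pkg for pkg, abi in versioned.items() if abi not in newest)
```

The result is only a candidate list. Whether to hand it to `apt-get remove` automatically, and how many kernels to keep (Fedora's limit of 3 was mentioned earlier), is a policy decision the sketch does not make.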

Or, as I suggested, we could try to identify core packages, mark as manually installed or hold, and then autoremove. But that's incredibly fragile.

unman avatar Jul 28 '21 12:07 unman

As an example, in a template I just cleaned up, 200 MB were cleaned up via sudo apt clean (which I understand will no longer be needed), and 800 MB were cleaned up via sudo apt autoremove.

How much of that was kernel packages, and more interestingly, how many of them did you have? @eloquence

marmarek avatar Aug 01 '21 12:08 marmarek

One more data point: For me this is also a very annoying problem. It causes update failures with obscure error messages (in Qubes 4.0; I get errors like "error 100" or "PGP signature verification failed"), and it happens rarely enough for me to forget what it was about between occurrences :) Also, when I hit this today, it was bad enough to prevent even xterm from starting in the template VM. I can't imagine how non-power-users could resolve this issue by themselves :/

How much of that were kernel packages, and more interestingly - how many of them you had?

For me it was ~2 GB and ~5 kernel packages (disclaimer: writing from memory).
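Since the failures described above show up as unrelated-looking errors, one mitigation in the spirit of the improved messaging from #6590 would be to check free space before an update and report it in plain words. The sketch below is only an illustration: the function names, the wording of the message, and the idea of passing in a required size are assumptions, not an existing Qubes check.

```python
import shutil

MIB = 1024 * 1024


def free_space_message(free_bytes, required_bytes):
    """Return a readable warning if free space is below the requirement, else None."""
    if free_bytes >= required_bytes:
        return None
    return (
        f"Only {free_bytes // MIB} MiB free, but about {required_bytes // MIB} MiB "
        "is needed; try 'apt clean' or enlarge the template's system disk."
    )


def check_root_filesystem(required_bytes, path="/"):
    """Check the filesystem that holds `path`, using shutil.disk_usage."""
    return free_space_message(shutil.disk_usage(path).free, required_bytes)
```

Running such a check before the package manager starts would turn "error 100" into a message that names the actual cause, even if nothing is cleaned up automatically.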

mkow avatar Jun 13 '22 12:06 mkow

It looks like apt dist-upgrade in bookworm does remove unused kernels automatically, but it isn't a full autoremove, so the remark in https://github.com/QubesOS/qubes-issues/issues/6676#issuecomment-887466306 doesn't apply. On the other hand, apt-get dist-upgrade doesn't do that. It may be worth looking at how apt does that, and doing a similar thing in qubes-vm-update.

marmarek avatar Apr 29 '24 11:04 marmarek