lima icon indicating copy to clipboard operation
lima copied to clipboard

Make cache pruning more flexible

Open jay7x opened this issue 2 years ago • 15 comments

This command deletes only those cached downloads which are not referenced by any existing VM configs.

jay7x avatar Mar 08 '23 18:03 jay7x

Should we maybe keep also the images that are included in the current template versions, not only instances ?

afbjorklund avatar Mar 08 '23 19:03 afbjorklund

Should we maybe keep also the images that are included in the current template versions, not only instances ?

I have no strong opinion on this.. my case is to cleanup images I've downloaded for some tests and which are not used (doesn't exists) anymore.. Though I'd like to keep the images I'm still using as internet in some countries (Indonesia e.g.) is not that great to redownload few gigabytes frequently.

jay7x avatar Mar 08 '23 19:03 jay7x

I added an issue (#1410) for the discussion - there's also Slack, if we want to keep this PR for the implementation review

afbjorklund avatar Mar 08 '23 19:03 afbjorklund

I've simplified things and removed the prune subcommands. New CLI switches added instead:

  -F, --fallback       Only prune images without a digest specified (fallback images usually)
  -U, --unreferenced   Only prune downloads not referenced in any VM or template

jay7x avatar Mar 19 '23 13:03 jay7x

Updates:

  1. Downloads referenced by a template are dropped from the PR scope
  2. Changes are rebased on the current master branch
  3. Commits are squashed
  4. Few small improvements (better logging, use digest instead of sprintf()'ed sha256)

jay7x avatar May 13 '23 11:05 jay7x

I think this PR is good, but have run out of time right now (will finish tomorrow).

I found it useful to add a --dry-run option for testing, and I think it may be generally useful, so please consider adding it (but can also be done later in another PR):

--- cmd/limactl/prune.go
+++ cmd/limactl/prune.go
@@ -21,12 +21,17 @@ func newPruneCommand() *cobra.Command {
                ValidArgsFunction: cobra.NoFileCompletions,
        }

+       pruneCommand.Flags().Bool("dry-run", false, "Show which images would be deleted, but don't really delete them")
        pruneCommand.Flags().Bool("no-digest-only", false, "Only prune images without a digest specified (\"fallback\" images usually)")
        pruneCommand.Flags().Bool("unreferenced-only", false, "Only prune downloads not referenced in any VM")
        return pruneCommand
 }

 func pruneAction(cmd *cobra.Command, args []string) error {
+       dryRun, err := cmd.Flags().GetBool("dry-run")
+       if err != nil {
+               return err
+       }
        pruneWithoutDigest, err := cmd.Flags().GetBool("no-digest-only")
        if err != nil {
                return err
@@ -68,20 +73,24 @@ func pruneAction(cmd *cobra.Command, args []string) error {
                                if pruneWithoutDigest && refFile.Digest == "" {
                                        // delete the fallback image entry (entry w/o digest) even if referenced
                                        logrus.WithFields(entryFields).Infof("Deleting fallback entry")
-                                       if err := os.RemoveAll(entry.Path); err != nil {
-                                               logrus.WithError(err).WithFields(logrus.Fields{
-                                                       "path": entry.Path,
-                                               }).Errorf("Cannot delete directory. Skipping...")
+                                       if !dryRun {
+                                               if err := os.RemoveAll(entry.Path); err != nil {
+                                                       logrus.WithError(err).WithFields(logrus.Fields{
+                                                               "path": entry.Path,
+                                                       }).Errorf("Cannot delete directory. Skipping...")
+                                               }
                                        }
                                }
                        } else {
                                if pruneUnreferenced {
                                        // delete the unreferenced cached entry
                                        logrus.WithFields(entryFields).Infof("Deleting unreferenced entry")
-                                       if err := os.RemoveAll(entry.Path); err != nil {
-                                               logrus.WithError(err).WithFields(logrus.Fields{
-                                                       "path": entry.Path,
-                                               }).Errorf("Cannot delete directory. Skipping...")
+                                       if !dryRun {
+                                               if err := os.RemoveAll(entry.Path); err != nil {
+                                                       logrus.WithError(err).WithFields(logrus.Fields{
+                                                               "path": entry.Path,
+                                                       }).Errorf("Cannot delete directory. Skipping...")
+                                               }
                                        }
                                }
                        }
@@ -96,6 +105,9 @@ func pruneAction(cmd *cobra.Command, args []string) error {
        }
        cacheDir := filepath.Join(ucd, "lima")
        logrus.Infof("Pruning everything in %q", cacheDir)
+       if dryRun {
+               return nil
+       }
        return os.RemoveAll(cacheDir)
 }

jandubois avatar May 24 '23 07:05 jandubois

Shall we merge this and discuss dry-run in another issue/PR?

AkihiroSuda avatar May 25 '23 06:05 AkihiroSuda

If there is no release expected in 2 next days I'd like to add dry run because I like the idea. I'll add it tomorrow.

jay7x avatar May 25 '23 08:05 jay7x

Just for illustration, assuming my cache looks like this:

653M	https://cloud-images.ubuntu.com/releases/22.04/release-20230302/ubuntu-22.04-server-cloudimg-amd64.img
654M	https://cloud-images.ubuntu.com/releases/22.04/release-20230518/ubuntu-22.04-server-cloudimg-amd64.img
727M	https://cloud-images.ubuntu.com/releases/22.10/release-20230413/ubuntu-22.10-server-cloudimg-amd64.img
779M	https://cloud-images.ubuntu.com/releases/23.04/release-20230502/ubuntu-23.04-server-cloudimg-amd64.img
233M	https://github.com/containerd/nerdctl/releases/download/v1.3.0/nerdctl-full-1.3.0-linux-amd64.tar.gz
233M	https://github.com/containerd/nerdctl/releases/download/v1.3.1/nerdctl-full-1.3.1-linux-amd64.tar.gz
238M	https://github.com/containerd/nerdctl/releases/download/v1.4.0/nerdctl-full-1.4.0-linux-amd64.tar.gz
 66M	https://github.com/lima-vm/alpine-lima/releases/download/v0.2.28/alpine-lima-std-3.18.0-x86_64.iso
 66M	https://github.com/lima-vm/alpine-lima/releases/download/v0.2.30/alpine-lima-std-3.18.0-x86_64.iso

I would expect limactl prune --unreferenced-only to leave these files behind, even if I delete all my current instances:

727M	https://cloud-images.ubuntu.com/releases/22.10/release-20230413/ubuntu-22.10-server-cloudimg-amd64.img
238M	https://github.com/containerd/nerdctl/releases/download/v1.4.0/nerdctl-full-1.4.0-linux-amd64.tar.gz
 66M	https://github.com/lima-vm/alpine-lima/releases/download/v0.2.30/alpine-lima-std-3.18.0-x86_64.iso

I guess for backwards compatibility we need to keep the existing behaviour, but I feel like we should require an --all option to wipe the whole cache, and have something safer, like --unreferenced-only the default.

jandubois avatar May 26 '23 05:05 jandubois

I'm running out of time again, but I also think the --no-digest-only implementation doesn't work:

$ ls -l ~/Library/Caches/lima/download/by-url-sha256/791040918a29bb23fd0bc4bd074e32891299ffcdc0d0d20cd76a4af9593b0fa4
total 1503112
-rw-r--r--  1 jan  staff  769589248 25 May 22:59 data
-rw-r--r--  1 jan  staff         93 25 May 22:57 url
$ limactl ls
WARN[0000] No instance found. Run `limactl start` to create an instance.
$ limactl prune --dry-run --no-digest-only
[no output at all]

Maybe I made a mistake, will need to take a look again on the weekend.

jandubois avatar May 26 '23 06:05 jandubois

Let me move this to v0.17

AkihiroSuda avatar May 26 '23 06:05 AkihiroSuda

  • Downloads referenced by a template are dropped from the PR scope

I somehow missed this. What is the reason for that?

The reason was to remove the point of the conflict and merge things which everyone is ok with (except @AkihiroSuda maybe 😄 )

I just noticed that when I deleted all my current instances, then limactl prune --unreferenced-only will delete everything in the cache (including the current nerdctl-full archives).

This is expected as there is nothing referencing the downloads. Which is the default behavior of lima prune before this PR.

jay7x avatar May 26 '23 13:05 jay7x

I'm running out of time again, but I also think the --no-digest-only implementation doesn't work:

Well.. you're right :( I just found it's broken.. I've over-optimized things a bit.. let me fix it

jay7x avatar May 26 '23 18:05 jay7x

I'm running out of time again, but I also think the --no-digest-only implementation doesn't work:

Well.. you're right :( I just found it's broken.. I've over-optimized things a bit.. let me fix it

Do you plan to update this PR?

AkihiroSuda avatar Jun 23 '23 04:06 AkihiroSuda

I'm running out of time again, but I also think the --no-digest-only implementation doesn't work:

Well.. you're right :( I just found it's broken.. I've over-optimized things a bit.. let me fix it

Do you plan to update this PR?

Yes, I'll continue to work on this later. @jandubois created pretty good definition of the process in #1409. My plan is to follow it. Though this requires some thinking.. Let me change the PR to draft again.

jay7x avatar Jun 23 '23 06:06 jay7x