serverless-registry

Garbage collector

Open IvanDev opened this issue 10 months ago • 7 comments

This PR introduces a garbage collector. Currently, when an image or tag is removed from the repository, the blobs referenced by the deleted manifests are not removed from R2. With this PR, garbage collection is scheduled after any modifying operation. GC waits 10 minutes after any modifications to the repository are done, which ensures that it does not collect freshly uploaded blobs that do not yet have a parent manifest.
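
The grace period can be pictured as a simple timestamp comparison. The following is only a minimal sketch, not the code from this PR: `GC_GRACE_MS`, `isGcDue`, and `nextGcTime` are hypothetical names, and it assumes the registry records the time of the most recent modification in milliseconds.

```typescript
// Hypothetical helpers; names and signatures are illustrative, not from the PR.
const GC_GRACE_MS = 10 * 60 * 1000; // the 10-minute wait described above

// True once at least `graceMs` have passed since the last modification.
function isGcDue(
  lastModifiedMs: number,
  nowMs: number,
  graceMs: number = GC_GRACE_MS,
): boolean {
  return nowMs - lastModifiedMs >= graceMs;
}

// Earliest time (ms since epoch) at which a GC run may start.
function nextGcTime(
  lastModifiedMs: number,
  graceMs: number = GC_GRACE_MS,
): number {
  return lastModifiedMs + graceMs;
}
```

Each new modification would move `lastModifiedMs` forward, so an upload that is still in progress keeps pushing the GC run back.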

We have 2 modes for the garbage collector, unreferenced and untagged:

  1. Unreferenced will delete all blobs that are not referenced by any manifest.
  2. Untagged will delete all blobs that are not referenced by any manifest and are not tagged.
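
The two modes can be sketched as a mark-and-sweep pass over an in-memory snapshot. This is an illustration under stated assumptions rather than the PR's implementation: `RepoSnapshot`, `Manifest`, and `collectGarbage` are hypothetical, and the sketch reads untagged mode as "only manifests that a tag points to keep their blobs alive".

```typescript
// Hypothetical types; the real registry reads this state from R2.
type GcMode = "unreferenced" | "untagged";

interface Manifest {
  digest: string;
  blobs: string[]; // digests of the blobs this manifest references
}

interface RepoSnapshot {
  manifests: Manifest[];
  tags: Map<string, string>; // tag name -> manifest digest
  blobs: string[]; // digests of every stored blob
}

// Returns the blob digests that the given mode would delete.
function collectGarbage(repo: RepoSnapshot, mode: GcMode): string[] {
  const taggedDigests = new Set<string>(Array.from(repo.tags.values()));
  const live = new Set<string>();

  // Mark: collect blobs referenced by manifests that count as live.
  for (const manifest of repo.manifests) {
    // Assumption: in untagged mode, manifests without a tag do not count.
    if (mode === "untagged" && !taggedDigests.has(manifest.digest)) continue;
    for (const blob of manifest.blobs) live.add(blob);
  }

  // Sweep: everything that was not marked is garbage.
  return repo.blobs.filter((blob) => !live.has(blob));
}
```

Under this reading, untagged mode always deletes at least as much as unreferenced mode, because it marks a subset of the manifests.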

Users can leave the GARBAGE_COLLECTOR_MODE variable unset, which disables GC.

Some considerations:

  • The PR uses the new RPC, so we have updated compatibility_date.
  • At the moment, a Set is used to keep track of blobs. It can run out of memory, so we should use a bloom filter instead, which means adding a new dependency.
  • GC doesn't remove dummy manifests that reference non-existent blobs.
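
For the bloom filter idea mentioned above, here is a rough, dependency-free sketch of what replacing the Set could look like. It is not part of this PR: the `BloomFilter` class, its parameters, and the FNV-1a double-hashing scheme are assumptions chosen for illustration, and a real change might pull in a library instead.

```typescript
// Hypothetical bloom filter for tracking referenced blob digests.
class BloomFilter {
  private readonly bits: Uint8Array;
  private readonly sizeBits: number;
  private readonly hashCount: number;

  constructor(sizeBits: number, hashCount: number) {
    this.sizeBits = sizeBits;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(sizeBits / 8));
  }

  // 32-bit FNV-1a over UTF-16 code units, with a seed mixed into the basis.
  private static fnv1a(key: string, seed: number): number {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h;
  }

  // Double hashing: position_i = (h1 + i * h2) mod sizeBits.
  private positions(key: string): number[] {
    const h1 = BloomFilter.fnv1a(key, 0);
    const h2 = (BloomFilter.fnv1a(key, 0x9e3779b9) | 1) >>> 0; // force an odd step
    const out: number[] = [];
    for (let i = 0; i < this.hashCount; i++) {
      out.push((h1 + i * h2) % this.sizeBits);
    }
    return out;
  }

  add(key: string): void {
    for (const p of this.positions(key)) {
      this.bits[p >> 3] |= 1 << (p & 7);
    }
  }

  // May return true for a key that was never added, never false for one that was.
  has(key: string): boolean {
    return this.positions(key).every((p) => (this.bits[p >> 3] & (1 << (p & 7))) !== 0);
  }
}
```

A bloom filter can report false positives but never false negatives. For GC this errs on the safe side: an unreferenced blob may occasionally survive a run, but a referenced blob is never reported as absent from the filter.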

IvanDev · Apr 14 '24 18:04