docker-registry
Implement a script to garbage collect orphan layers
It would be nice to have a script that uses the storage lib to iterate the dataset in order to find orphans and clean them.
It's important to have a simulate mode that doesn't delete anything and displays the amount of data that would be saved.
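A minimal sketch of what such a script could do, assuming the set of tagged image IDs, each image's parent, and each image's size have already been read from storage. It does not use the registry's storage lib; `reachable_images`, `collect_orphans`, and the `delete` callback are hypothetical names used only for illustration.

```python
def reachable_images(tagged_ids, parent_of):
    """Return every image ID reachable from a tag by walking parent links."""
    live = set()
    for image_id in tagged_ids:
        # Walk up the ancestry chain until the root or an already-visited image.
        while image_id is not None and image_id not in live:
            live.add(image_id)
            image_id = parent_of.get(image_id)
    return live


def collect_orphans(all_sizes, tagged_ids, parent_of, delete=None, simulate=True):
    """Report orphan images; delete them only when simulate is False.

    all_sizes maps image ID -> size in bytes for every stored image.
    delete is a hypothetical callback that removes one image from storage.
    Returns (sorted orphan IDs, total bytes they occupy).
    """
    live = reachable_images(tagged_ids, parent_of)
    orphans = sorted(i for i in all_sizes if i not in live)
    reclaimable = sum(all_sizes[i] for i in orphans)
    if not simulate and delete is not None:
        for image_id in orphans:
            delete(image_id)
    return orphans, reclaimable
```

With `simulate=True` (the default) nothing is removed and the second return value is the amount of data a real run would free.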
Is there currently some work on that? It would be really useful :+1:
Same here. Does anyone have a work in progress on that?
Instead of periodically checking for orphans, I'd expect it would be easier to refcount images and remove them when their reference count drops to zero. You'd need some temporary code to fill in the initial refcounts for folks with existing storage instances that haven't been refcounting, but you could remove that migration code after a suitable transition period.
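A rough sketch of the refcounting idea, assuming an image holds one reference per tag pointing at it and one per child image built on it. `ImageRefCounts` and its methods are hypothetical and not part of the registry; `from_existing` stands in for the temporary migration code.

```python
from collections import Counter


class ImageRefCounts:
    """Track how many tags and child images reference each image."""

    def __init__(self, parent_of):
        self.parent_of = parent_of  # image ID -> parent ID (or None)
        self.counts = Counter()

    @classmethod
    def from_existing(cls, parent_of, tagged_ids):
        """Migration step: rebuild counts for storage that never refcounted."""
        self = cls(parent_of)
        for image_id in tagged_ids:
            self.counts[image_id] += 1  # one reference per tag
        for parent in parent_of.values():
            if parent is not None:
                self.counts[parent] += 1  # one reference per child
        return self

    def add_tag(self, image_id):
        self.counts[image_id] += 1

    def remove_tag(self, image_id):
        """Drop one reference; return the images that became unreferenced."""
        removed = []
        while image_id is not None:
            self.counts[image_id] -= 1
            if self.counts[image_id] > 0:
                break
            removed.append(image_id)
            del self.counts[image_id]
            # A removed image no longer references its parent.
            image_id = self.parent_of.get(image_id)
        return removed
```

Images that were already orphaned before the migration end up with a count of zero, so a one-off scan would still be needed to clear them.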
I agree refcounting is better. I was about to write a small script to check our prod dataset and at least estimate the size of the orphans. But there is no work in progress right now on the open source side.
On Thu, Mar 27, 2014 at 10:02:47AM -0700, Sam Alba wrote:
> I agree refcounting is better.
I think this overlaps with #7. I think we should consolidate into a single “remove refcounted images” issue, and don't mind if it's this one or #7.
In case you come here looking for a script to clean up your private repository right now, here's the script that we have to look for unused images and report how much space is taken.
It leaves a file in /tmp with all the unused images. You can use that to perform the deletion of images. I didn't want to automatically delete things, so it should be safe to run. :-) Caveat Emptor, and all that.
@shepmaster thanks for the script! Did you have any luck actually removing images deemed unused by your script? I just ran it and I am planning to remove the orphan images, but I am guessing _index_images also needs to reflect the deletions?
@bjaglin I never actually deleted all of the images reported. I modified the script to just focus on a single repository and moved those images to another directory for a while (as good as deleted, but I could restore them if something went horribly wrong). That worked fine. I also cleaned up _index_images as described in #7.
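The move-instead-of-delete approach could be sketched like this. It assumes the list file holds one image ID per line and that each image is a directory named after its ID under `images_dir`; both are assumptions about the layout, and `quarantine_images` is a hypothetical helper.

```python
import os
import shutil


def quarantine_images(list_file, images_dir, quarantine_dir, dry_run=True):
    """Move listed image directories aside so they can be restored later.

    Returns the IDs that were (or, in a dry run, would be) moved.
    """
    with open(list_file) as handle:
        # Assumed format: one image ID per line, blank lines ignored.
        image_ids = [line.strip() for line in handle if line.strip()]
    if not dry_run:
        os.makedirs(quarantine_dir, exist_ok=True)
    moved = []
    for image_id in image_ids:
        source = os.path.join(images_dir, image_id)
        if not os.path.isdir(source):
            continue  # Listed but not present on disk; nothing to move.
        if not dry_run:
            shutil.move(source, os.path.join(quarantine_dir, image_id))
        moved.append(image_id)
    return moved
```

Restoring is then a matter of moving the directory back from `quarantine_dir`.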
For the record, to reclaim space I am currently experimenting with https://gist.github.com/bjaglin/1ff66c20c4bc4d9de522 which:
- uses @shepmaster's script to identify orphan images
- removes references to these images in the _index_images repo indices
- removes the actual images on disk
DISCLAIMER: Use it at your own risk though, as it might break some invariants otherwise enforced by the registry, and is heavily dependent on the implementation details of the registry (version 0.7.0 at the time of writing).
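For illustration only, the index-cleanup step from the list above might look roughly like this. It assumes an _index_images file is a JSON list of objects that each carry an `id` key, which is an assumption about the on-disk format rather than something confirmed in this thread; `prune_index` is a hypothetical helper and is not taken from the gist.

```python
import json


def prune_index(index_json, orphan_ids):
    """Drop orphan entries from one index document.

    index_json is the file content as a string (assumed: JSON list of
    objects with an "id" key). Returns (new JSON string, entries removed).
    """
    entries = json.loads(index_json)
    orphan_ids = set(orphan_ids)
    kept = [entry for entry in entries if entry.get("id") not in orphan_ids]
    return json.dumps(kept), len(entries) - len(kept)
```

Writing the result to a new file and keeping the original as a backup would fit the same cautious spirit as the disclaimer.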
+1