
[feature] Clean up orphaned media from storage directory

Open tsmethurst opened this issue 2 years ago • 3 comments

Right now I'm in a situation on goblin technology (admittedly completely my own fault) where I did a database export/import without deleting my storage folder in between. Now I've got a storage folder that's full of a mix of attachments with database entries (no problem, they can be cleaned) and attachments with no database entries. The latter are a problem because the cleaning logic has no way of removing them. It might not be a bad idea to implement some kind of storage cleaning thingy that can remove orphaned media from inside the storage folders.

tsmethurst, Sep 28 '22 19:09

KVStore has iterator logic that should allow you to step through all the "keys" in the media folder; for each one you can check for a DB entry and build a list of keys that need to be removed.

NyaaaWhatsUpDoc, Sep 28 '22 19:09

hell yeah, nice!

In the meantime I did this very hacky command to remove files older than 5 days, but I don't recommend that anyone else reading this do the same:

for f in $(find storage/* -mtime +5 | grep -v certs | grep -v 'store\.lock' | grep -E 'storage/.*/.*/.*/.*\..*'); do rm "$f"; done

tsmethurst, Sep 28 '22 19:09

This might also be easier if we pull in v2 of go-store, which now implements S3 support, so we can use the Iterate() function on either storage backend. The S3 storage backend code is here for anyone interested: https://codeberg.org/gruf/go-store/src/branch/main/storage/s3.go

NyaaaWhatsUpDoc, Sep 29 '22 08:09

Just a quick drive-by comment. My Go skills aren't good enough to tell by perusing the code (I did have a look), but does removing the last file in a directory cause the directory to be removed (and so on up the tree)? With directories taking 4 KB each on ext4 (for example), a few tens of thousands of empty directories start to add up.

gw1urf, Nov 25 '22 17:11

We did previously support cleaning up directories in this manner (the underlying storage.DiskStorage{} has a .Clean() function which does this), but it is temporarily disabled because other users were having issues when putting their storage directory on the root of their drive (and so hitting perms issues trying to wipe lost+found lol).

NyaaaWhatsUpDoc, Nov 25 '22 17:11