Tagsistant
[question] Is there any documentation on how the `export` directory works?
It seems like it could be used for backups, but how does it work in detail? Won't it blow up exponentially if I have lots of files? How can I restore such an export in case of disk failure?
The export/ directory is dynamically generated, so there's no risk of it saturating your disk. Its structure is really basic. At the first level, you have all your tags. Each tag lists every object it tags, as symbolic links into the archive/ directory. So basically, if you tar or zip export/ and archive/ together, you get a backup of your repository (bugs excepted: it seems that triple tags are not properly listed in export/).
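As a rough illustration of the tar approach, here is a minimal Python sketch. The function name `backup_repository` and the idea of passing the mount point as an argument are mine, not part of Tagsistant, and it assumes export/ and archive/ sit directly under the mount point. Links are stored as links rather than followed, so the tarball keeps the tag-to-archive structure; whether the link targets stay valid after unpacking depends on how Tagsistant writes them, which I have not checked.

```python
import tarfile
from pathlib import Path


def backup_repository(mount_point, tarball):
    """Archive archive/ and export/ from a mounted repository into one tarball.

    Assumption: both directories live directly under mount_point.
    """
    mount_point = Path(mount_point)
    with tarfile.open(str(tarball), "w:gz") as tar:
        for name in ("archive", "export"):
            # tarfile stores symbolic links as links by default (dereference=False),
            # so export/<tag>/<file> stays a link instead of a second copy.
            tar.add(str(mount_point / name), arcname=name)
    return tarball
```

Calling `backup_repository("/path/to/mountpoint", "backup.tar.gz")` would write a single compressed archive; the paths here are placeholders.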
I don't have an automatic procedure or script for recovery. If you naively copy the export back into a fresh repository, tag by tag, you should end up with a copy of your original repo. However, this approach is really expensive, because a file listed under several tags gets copied and deduplicated once per tag. A better solution would be (in pseudocode):
foreach tag {
    foreach file in tag {
        if (file exists in store/ALL/) {
            mv store/ALL/file store/tag/    # just retag an existing file
        } else {
            cp export/tag/file store/tag/   # copy the missing file
        }
    }
}
Or something like this. You can precisely detect objects in ALL/ because in export/ every name starts with its inode number.
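The pseudocode above could be sketched in Python roughly as follows. This is only a planner: it walks the export and returns the list of actions without touching the repository, so you can review it before running any mv or cp. The function name `plan_restore` is hypothetical, the directory layout (`store/ALL/`, `store/<tag>/`) is taken straight from the pseudocode, and matching objects by their export name is an assumption that may not hold if the new repository assigns different inode numbers.

```python
from pathlib import Path


def plan_restore(export_dir, store_dir):
    """Return a list of (action, source, destination) tuples, one per tagged file.

    Assumed layout (from the pseudocode, not verified against Tagsistant):
      export_dir/<tag>/<file>   exported objects, one directory per tag
      store_dir/ALL/<file>      objects already present in the target repository
      store_dir/<tag>/          destination directory for a tag
    """
    export_dir, store_dir = Path(export_dir), Path(store_dir)
    all_dir = store_dir / "ALL"
    # Names already present in the target repository, if ALL/ exists.
    present = {p.name for p in all_dir.iterdir()} if all_dir.is_dir() else set()

    actions = []
    for tag_dir in sorted(p for p in export_dir.iterdir() if p.is_dir()):
        dest = store_dir / tag_dir.name
        for entry in sorted(tag_dir.iterdir()):
            if entry.name in present:
                # Already in the repository: just retag the existing object.
                actions.append(("retag", all_dir / entry.name, dest))
            else:
                # First time this object is seen: copy its content in.
                actions.append(("copy", entry, dest))
                present.add(entry.name)
    return actions
```

Each "copy" would map to the cp branch and each "retag" to the mv branch of the pseudocode, so a file under three tags is copied once and retagged twice.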
Oh I see, so it is basically a "filesystem" dump of the database, indexed by the tags, where if a file is under say 3 tags it will be symlinked three times to the archive.
This is great. I'm currently migrating to a new system, so I will put this to the test :)
Have you found it useful?
Heh, I don't actually remember how I migrated the data, but I have the repository working so I suppose I've used this method :D It's been some time.