aiida-core
aiida-core copied to clipboard
`verdi storage maintain`: include the cleaning of the sandbox folders
Currently, the sandbox folders are in a subfolder of the directory used for the file repository. These are not automatically cleaned and over time can accumulate old files. It would be good to be able to clean them through verdi storage maintain perhaps. Note that if #5496 is merged, the location of the sandbox folders will by default become the temp folder of the OS and will be configurable with the storage.sandbox option.
Just to make sure I understand: after #5496 verdi storage maintain should only do this cleanup if a storage.sandbox option exists in the configuration of that backend, if it is the default temp location it should just ignore it, right? (since there could be stuff used by other programs there).
Also, considering this, we should instruct users to only specify a location there that will be used exclusively for AiiDA, for the same reason as above (risk of deleting temp files being used by another program).
Just to make sure I understand: after #5496
verdi storage maintainshould only do this cleanup if astorage.sandboxoption exists in the configuration of that backend, if it is the default temp location it should just ignore it, right? (since there could be stuff used by other programs there).
Yes, exactly. Another potential pitfall is that if the user changes the setting before calling the maintain, the old location might keep uncleaned folders. But I don't think this is too big of a problem.
Also, considering this, we should instruct users to only specify a location there that will be used exclusively for AiiDA, for the same reason as above (risk of deleting temp files being used by another program).
This is a very good point, and so now I am not even sure whether we should have this functionality. Or at the very least, it should provide an explicit warning with a confirm.
Thinking about it some more, the config options can be profile-specific but also config wide. We should be careful what should happen if it is defined config wide. Not sure if we should then also clean it. Because the lock is profile-specific, it is possible that another profile is running there and creating sandbox folders.
Thinking about it some more, the config options can be profile-specific but also config wide. We should be careful what should happen if it is defined config wide. Not sure if we should then also clean it. Because the lock is profile-specific, it is possible that another profile is running there and creating sandbox folders.
Yes, nice catch, I was thinking other programs but indeed the condition is stricter than that, and the location should actually be profile specific to avoid potential problems.
So, potential solutions (or are these actually further complications? mmm) I can think of:
- Is there a way to disable this as a config wide option and only allow it to be profile specific?
- Before setting this config, AiiDA could check that the location is empty unless it already contains a
.aiida_sandboxfile with the name of the profile stored inside (preventing raising during re-assignments or stuff like that). Or maybe auuidfor the profile? Do we have fulluuidsto single out instances of AiiDA databases/profiles? Should we start considering these? (I'm thinking the case of having the same named profile in two differentAIIDA_PATHS) - When changing this setting (if it was set to a given destination before, can ignore if it was empty) it should warn the user that the old location will no longer be tracked by AiiDA, so they should have either
maintained fullbefore* or make sure to manually clean it up (and ask for confirmation).
*Perhaps we also need a specific command/option to only do the sandbox cleanup, so that people can just lock to quickly clean this up instead of having to go through all the full maintenance?
Is there a way to disable this as a config wide option and only allow it to be profile specific?
No, only the opposite, i.e., a global-only option. Regarding your other suggestions, there isn't really any infrastructure in place to add these kinds of complex validations when setting config options. And I doubt whether we should start adding them now. All of these discussion really made me doubt whether we should even implement this feature request. Maybe just documenting the concept of the sandbox folders and telling people to manually clean it now and again when necessary is good enough.
Fair enough, I assume that means this is on hold until further notice. I'll just ping @giovannipizzi since he prompted for the feature in case he has anything to add that we are not considering.