aiida-core icon indicating copy to clipboard operation
aiida-core copied to clipboard

`verdi storage maintain`: include the cleaning of the sandbox folders

Open sphuber opened this issue 3 years ago • 6 comments

Currently, the sandbox folders are in a subfolder of the directory used for the file repository. These are not automatically cleaned and over time can accumulate old files. It would be good to be able to clean them through verdi storage maintain perhaps. Note that if #5496 is merged, the location of the sandbox folders will by default become the temp folder of the OS and will be configurable with the storage.sandbox option.

sphuber avatar May 04 '22 08:05 sphuber

Just to make sure I understand: after #5496 verdi storage maintain should only do this cleanup if a storage.sandbox option exists in the configuration of that backend, if it is the default temp location it should just ignore it, right? (since there could be stuff used by other programs there).

ramirezfranciscof avatar May 04 '22 09:05 ramirezfranciscof

Also, considering this, we should instruct users to only specify a location there that will be used exclusively for AiiDA, for the same reason as above (risk of deleting temp files being used by another program).

ramirezfranciscof avatar May 04 '22 09:05 ramirezfranciscof

Just to make sure I understand: after #5496 verdi storage maintain should only do this cleanup if a storage.sandbox option exists in the configuration of that backend, if it is the default temp location it should just ignore it, right? (since there could be stuff used by other programs there).

Yes, exactly. Another potential pitfall is that if the user changes the setting before calling the maintain, the old location might keep uncleaned folders. But I don't think this is too big of a problem.

Also, considering this, we should instruct users to only specify a location there that will be used exclusively for AiiDA, for the same reason as above (risk of deleting temp files being used by another program).

This is a very good point, and so now I am not even sure whether we should have this functionality. Or at the very least, it should provide an explicit warning with a confirm.

Thinking about it some more, the config options can be profile-specific but also config wide. We should be careful what should happen if it is defined config wide. Not sure if we should then also clean it. Because the lock is profile-specific, it is possible that another profile is running there and creating sandbox folders.

sphuber avatar May 04 '22 09:05 sphuber

Thinking about it some more, the config options can be profile-specific but also config wide. We should be careful what should happen if it is defined config wide. Not sure if we should then also clean it. Because the lock is profile-specific, it is possible that another profile is running there and creating sandbox folders.

Yes, nice catch, I was thinking other programs but indeed the condition is stricter than that, and the location should actually be profile specific to avoid potential problems.

So, potential solutions (or are these actually further complications? mmm) I can think of:

  • Is there a way to disable this as a config wide option and only allow it to be profile specific?
  • Before setting this config, AiiDA could check that the location is empty unless it already contains a .aiida_sandbox file with the name of the profile stored inside (preventing raising during re-assignments or stuff like that). Or maybe a uuid for the profile? Do we have full uuids to single out instances of AiiDA databases/profiles? Should we start considering these? (I'm thinking the case of having the same named profile in two different AIIDA_PATHS)
  • When changing this setting (if it was set to a given destination before, can ignore if it was empty) it should warn the user that the old location will no longer be tracked by AiiDA, so they should have either maintained full before* or make sure to manually clean it up (and ask for confirmation).

*Perhaps we also need a specific command/option to only do the sandbox cleanup, so that people can just lock to quickly clean this up instead of having to go through all the full maintenance?

ramirezfranciscof avatar May 04 '22 10:05 ramirezfranciscof

Is there a way to disable this as a config wide option and only allow it to be profile specific?

No, only the opposite, i.e., a global-only option. Regarding your other suggestions, there isn't really any infrastructure in place to add these kinds of complex validations when setting config options. And I doubt whether we should start adding them now. All of these discussion really made me doubt whether we should even implement this feature request. Maybe just documenting the concept of the sandbox folders and telling people to manually clean it now and again when necessary is good enough.

sphuber avatar May 04 '22 12:05 sphuber

Fair enough, I assume that means this is on hold until further notice. I'll just ping @giovannipizzi since he prompted for the feature in case he has anything to add that we are not considering.

ramirezfranciscof avatar May 04 '22 14:05 ramirezfranciscof