cbrain
Automatic CbrainTask work directory cleanup (redmine 602)
A CbrainTask launched on a cluster gets a work directory created for it. The directory stays as long as the task is registered in the DB. When a user selects "remove task" in the task index page, the directory is cleaned up.
I suggest we implement a new time-to-live parameter for each cluster. The cluster's tasks would be automatically removed (and their directories cleaned up) after a certain number of days. Since some users would probably want to keep some tasks and their directories permanently, we also need to provide an interface where they can flag tasks as 'to keep'; such tasks would never be automatically cleaned up, even after their time-to-live has expired.
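A minimal sketch of the selection rule proposed here, in Python for illustration only. The `Task` fields, the `keep` flag name, and the `ttl_days_by_cluster` mapping are all hypothetical and do not come from the CBRAIN code base; the sketch also assumes the time-to-live is counted from the moment the task finished.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Task:
    # Hypothetical stand-in for a task record; field names are assumptions.
    id: int
    cluster: str
    finished_at: datetime
    keep: bool = False  # the 'to keep' flag set by the user


def expired_tasks(tasks, ttl_days_by_cluster, now):
    """Return the tasks whose time-to-live has elapsed and that are not flagged 'to keep'."""
    expired = []
    for task in tasks:
        ttl_days = ttl_days_by_cluster.get(task.cluster)
        # A cluster without a configured time-to-live is never cleaned automatically.
        if ttl_days is None or task.keep:
            continue
        if now - task.finished_at > timedelta(days=ttl_days):
            expired.append(task)
    return expired
```

A periodic job could call `expired_tasks` and then remove each returned task together with its work directory.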
ANTON, 5 years ago:
Why would users want to keep a task dir permanently on the cluster? We should see what policies each cluster has regarding use of space. Why is there a need for the dir to hang around after the results have been saved?
I think we shouldn't permanently delete tasks in the DB. We can mark them as deleted and treat them differently, but we should keep them. My reasons:
- user accidentally deletes a task (now there's a way to undo it)
- data provenance (every result can point to a task that produced it, task will have params used, revision, etc.)
- user history
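The "mark as deleted" idea above could be sketched as follows, in Python for illustration only. `TaskRecord`, `deleted_at`, and the helper names are hypothetical, not existing CBRAIN names.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class TaskRecord:
    # Hypothetical record; the row is never removed, only flagged.
    id: int
    params: dict = field(default_factory=dict)
    deleted_at: Optional[datetime] = None

    def soft_delete(self, now: datetime) -> None:
        """Mark the task as deleted while keeping params for provenance."""
        self.deleted_at = now

    def restore(self) -> None:
        """Undo an accidental deletion."""
        self.deleted_at = None

    @property
    def is_deleted(self) -> bool:
        return self.deleted_at is not None


def visible_tasks(records):
    """Tasks shown in the normal task index: everything not marked as deleted."""
    return [r for r in records if not r.is_deleted]
```

Results can keep pointing at a soft-deleted record, since its parameters stay in the DB.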
Pierre, 5 years ago:
When I wrote this issue I was thinking forward to some future design that no-one else is aware of. Sorry. The reason I expect some users would like to keep a work directory is that in the future these could embody a persistent 'system' where new tasks will be started again and again, or the same work directory used by different DrmaaTasks. I'm designing this right now.
Cluster policies are to be taken into account to make this work, though. Usually, the shared directories that are accessible by all nodes are NOT meant for persistent storage. Instead, clusters provide a different, less performant filesystem for this, and often these are not shared with the nodes.
So if we want users to keep a task's work directory persistently, we may have to provide a 'move to permanent storage' button (back and forth, if the task is to be re-activated later). Or better, make this automatic, as needed.
Pierre, 3 years ago:
The new archiving system could be used for this now.
I see a new section in the 'control' area of a task. A selection box where the user can decide what happens to the task. Possible options:
Multiple selection boxes:
- If Completed:
  - Destroy task
  - Archive on cluster
  - Archive as file
- If Failed:
  - Destroy task
  - Archive on cluster
  - Archive as file
(Option: number of days before action is to be triggered)
Something like that.
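A rough sketch of how these per-status choices and the optional delay might be represented, in Python for illustration only. `Action`, `CleanupPolicy`, and `action_due` are hypothetical names; the status strings "Completed" and "Failed" are taken from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Action(Enum):
    DESTROY = "Destroy task"
    ARCHIVE_ON_CLUSTER = "Archive on cluster"
    ARCHIVE_AS_FILE = "Archive as file"


@dataclass
class CleanupPolicy:
    # One selection per final status; None means "do nothing".
    if_completed: Optional[Action] = None
    if_failed: Optional[Action] = None
    delay_days: int = 0  # number of days before the action is triggered


def action_due(policy, status, finished_at, now):
    """Return the action to trigger for a task, or None if nothing is due yet."""
    chosen = {
        "Completed": policy.if_completed,
        "Failed": policy.if_failed,
    }.get(status)
    if chosen is None:
        return None
    if now - finished_at < timedelta(days=policy.delay_days):
        return None
    return chosen
```

For example, a policy of `CleanupPolicy(if_completed=Action.ARCHIVE_AS_FILE, if_failed=Action.DESTROY, delay_days=7)` would leave both kinds of task untouched for a week before acting.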