
Improve `OS.{Dir,Path}.delete ~recurse:true` reliability.

Open dbuenzli opened this issue 9 years ago • 2 comments

W.r.t. races, see https://github.com/dbuenzli/bos/issues/50#issuecomment-223592710 and discussion in #50.

dbuenzli avatar Jun 06 '16 18:06 dbuenzli

I think the code in this gist could be an improvement over the current convoluted code.

It should have the reliability @dsheets wanted. It will delete everything in the directory even if someone fiddles with the path and its contents, e.g. by removing files, adding files, or replacing directories with files and vice versa. The one exception is the toplevel directory being replaced by a file; that could be made to work by adding a few more cases, but I think it's fine that way. However, this also means that another process can make this function loop for as long as it manages to be faster.

The strategy is to call rmdir and, whenever a non-empty directory is hit, to open it and start unlinking its files. If you hit a directory in the process, you close the current directory, delete that one first, and come back later to finish the job. Once you manage to delete the contents of a directory, you try rmdir again and restart if it is no longer empty.

Note that this means that deleting a directory with n subdirectories takes at least n calls to Unix.{open,close}dir; the performance impact of this is a little unclear.
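For illustration, the strategy described above could be sketched roughly as follows. This is not the code from the gist: the names `scan_dir`, `empty_dir` and `delete_dir` are hypothetical, it assumes the `unix` library and OCaml 4.08 or later (for `Fun.protect`), and it assumes that `unlink` reports a directory as `EISDIR` or `EPERM`. Any other error, including `ENOTDIR` from `rmdir`, simply propagates as an exception.

```ocaml
(* Sketch only: requires the unix library. All names are hypothetical. *)

(* Scan [dir] and unlink its entries. Returns [Some path] for the first
   entry that looks like a directory, or [None] once the scan is exhausted
   (or [dir] no longer exists). The handle is closed before returning. *)
let scan_dir dir =
  match Unix.opendir dir with
  | exception Unix.Unix_error (Unix.ENOENT, _, _) -> None
  | dh ->
      let rec loop () =
        match Unix.readdir dh with
        | exception End_of_file -> None
        | "." | ".." -> loop ()
        | name ->
            let path = Filename.concat dir name in
            match Unix.unlink path with
            | () -> loop ()
            (* Someone else removed it first: nothing left to do. *)
            | exception Unix.Unix_error (Unix.ENOENT, _, _) -> loop ()
            (* Assumption: a directory is reported as EISDIR or EPERM. *)
            | exception Unix.Unix_error ((Unix.EISDIR | Unix.EPERM), _, _) ->
                Some path
      in
      Fun.protect ~finally:(fun () -> Unix.closedir dh) loop

(* Try rmdir first; only open the directory if it turns out to be
   non-empty, then retry. Loops if another process keeps refilling it. *)
let rec delete_dir dir =
  match Unix.rmdir dir with
  | () -> ()
  | exception Unix.Unix_error (Unix.ENOENT, _, _) -> ()
  | exception Unix.Unix_error ((Unix.ENOTEMPTY | Unix.EEXIST), _, _) ->
      empty_dir dir;
      delete_dir dir

(* Unlink files until a subdirectory is hit; delete that subdirectory
   (the parent handle is already closed), then rescan the parent. *)
and empty_dir dir =
  match scan_dir dir with
  | None -> ()
  | Some sub -> delete_dir sub; empty_dir dir
```

In this sketch the parent directory is reopened after each subdirectory is dealt with, so only one directory handle is open at any time, at the price of extra Unix.{open,close}dir calls.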

dbuenzli avatar May 05 '17 16:05 dbuenzli

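An unscientific benchmark on a large directory indicates that this technique is a bit slower: in wall-clock time, roughly 17% slower than the current bos implementation and roughly 29% slower than `rm -r`.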

> find $(odig cache path) | wc -l 
   28664
> time rm -r $(odig cache path)

real	0m1.800s
user	0m0.095s
sys	0m1.632s

> time bosrmdirr $(odig cache path) # current bos implementation

real	0m1.994s
user	0m0.131s
sys	0m1.781s

> time b0rmdirr $(odig cache path) # proposed alternative

real	0m2.330s
user	0m0.163s
sys	0m2.103s

dbuenzli avatar May 05 '17 18:05 dbuenzli