pyfilesystem2
Callback function for selecting which files to move/copy in a tree.
The current copy-tree functions take a Walker object, which works well for simple filtering when you want to selectively copy files in a directory. The Walker is very limited, though: it doesn't allow custom logic to be applied, and files can only be excluded by pattern. Additionally, the move functions don't allow any filtering, though I understand this might be intentional.
I believe the Walker object would benefit a lot from an additional Callable parameter, something like `Callable[[Info, str, FS], bool]`, whose arguments are the filesystem resource, the path to the resource, and the filesystem the resource exists in, respectively. I haven't put a lot of thought into the signature; that's just the general idea. If this Callable is provided, then when copying a tree every file is passed to the callback function: if it returns True, the file is copied; otherwise it is skipped. That way, a user can write custom code to filter files as needed.
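To make the proposed semantics concrete, here is a minimal sketch using only the standard library. It is not pyfilesystem2 code: `copy_tree_filtered`, `filter_cb`, and the `FilterCallback` alias are hypothetical names, and `os.stat_result` and `Path` stand in for `Info` and `FS`.

```python
import os
import shutil
from pathlib import Path
from typing import Callable, List

# Hypothetical stand-in for Callable[[Info, str, FS], bool]:
# (stat result, path relative to the root, root directory) -> copy this file?
FilterCallback = Callable[[os.stat_result, str, Path], bool]


def copy_tree_filtered(
    src_root: Path, dst_root: Path, filter_cb: FilterCallback
) -> List[str]:
    """Copy each file under src_root to dst_root if filter_cb returns True.

    Returns the relative paths of the files that were copied.
    """
    copied = []
    for dir_path, _dir_names, file_names in os.walk(src_root):
        for name in sorted(file_names):
            src_file = Path(dir_path) / name
            rel_path = src_file.relative_to(src_root).as_posix()
            if not filter_cb(src_file.stat(), rel_path, src_root):
                continue  # the callback rejected this file
            dst_file = dst_root / rel_path
            dst_file.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, dst_file)
            copied.append(rel_path)
    return copied
```

A caller could then pass something like `lambda stat, path, root: stat.st_size < 1024` to copy only small files.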
Examples of use cases that I personally have encountered:
- Maximum/minimum file size
- Hash equality
- Permissions/ownership
- Time created/age
- Antivirus scanning
- Content checking/MIME type checking
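A few of these use cases could be written as small predicate factories. The following is a sketch under the same assumptions as the proposal's rough signature, with `os.stat_result` and `Path` standing in for `Info` and `FS`; all function names are hypothetical, and "age" is approximated here by modification time.

```python
import hashlib
import os
import time
from pathlib import Path
from typing import Callable, Set

# Hypothetical callback shape: (stat result, relative path, root) -> bool
FilterCallback = Callable[[os.stat_result, str, Path], bool]


def max_size(limit_bytes: int) -> FilterCallback:
    """Accept files no larger than limit_bytes."""
    def check(stat: os.stat_result, rel_path: str, root: Path) -> bool:
        return stat.st_size <= limit_bytes
    return check


def older_than(seconds: float) -> FilterCallback:
    """Accept files last modified at least `seconds` ago."""
    def check(stat: os.stat_result, rel_path: str, root: Path) -> bool:
        return time.time() - stat.st_mtime >= seconds
    return check


def sha256_in(allowed: Set[str]) -> FilterCallback:
    """Accept files whose SHA-256 hex digest is in `allowed`."""
    def check(stat: os.stat_result, rel_path: str, root: Path) -> bool:
        digest = hashlib.sha256((root / rel_path).read_bytes()).hexdigest()
        return digest in allowed
    return check


def all_of(*callbacks: FilterCallback) -> FilterCallback:
    """Accept a file only if every given callback accepts it."""
    def check(stat: os.stat_result, rel_path: str, root: Path) -> bool:
        return all(cb(stat, rel_path, root) for cb in callbacks)
    return check
```

Composing them, for example `all_of(max_size(10_000_000), older_than(86_400))`, keeps each rule small and testable on its own.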
The only alternative I have at the moment (as far as I'm aware) is to write an adapter around the filesystem adapter that does this for me. But the Walker seems like a very logical place for a "filter_cb"-style parameter, and it would let me actually use the copy_dir function instead of walking the filesystem myself, calling the callback, and then calling copy_file. In addition, getting my code to the same quality (threading and the like) would mean copying and pasting large portions of the existing implementation, which would be very unfortunate.
Looking at the move functionality, it just copies and deletes. Why is the walker parameter not exposed?
https://github.com/PyFilesystem/pyfilesystem2/blob/master/fs/copy.py#L392 sounds fairly similar to what you're asking for? I wonder if you could extend https://github.com/PyFilesystem/pyfilesystem2/blob/master/fs/copy.py#L454 with a `callback=None` parameter, and then add an `elif condition == "callback":` branch to the function's logic? :man_shrugging:
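As a rough illustration of that suggestion, the sketch below shows a condition dispatcher with an extra "callback" branch. It is a standalone stand-in, not the library's code at the linked lines: the function name, its parameters, and the condition names shown are assumptions for illustration only.

```python
from typing import Callable, Optional


def copy_is_necessary(
    condition: str,
    src_path: str,
    src_mtime: float,
    dst_mtime: Optional[float],
    callback: Optional[Callable[[str], bool]] = None,
) -> bool:
    """Decide whether a file should be copied (hypothetical helper).

    dst_mtime is None when the destination file does not exist yet.
    """
    if condition == "always":
        return True
    elif condition == "newer":
        # Copy when the destination is missing or the source is more recent.
        return dst_mtime is None or src_mtime > dst_mtime
    elif condition == "callback":
        # The proposed branch: defer the decision to user code.
        if callback is None:
            raise ValueError("condition 'callback' requires a callback")
        return bool(callback(src_path))
    raise ValueError(f"unknown condition: {condition!r}")
```

Raising on a missing callback, rather than silently copying, makes a misconfigured call fail early.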