Skip existing files in file_copy(overwrite = FALSE)

Open kstierhoff opened this issue 5 years ago • 7 comments

Is there a reason why file_copy(overwrite = FALSE) must throw an error when a file already exists? Is it possible to allow the function to only copy files that don't exist, similar to file.copy(overwrite = FALSE)? Thanks for considering.

kstierhoff avatar Aug 12 '19 05:08 kstierhoff

Allowing this would complicate the API for limited benefit.

file.copy() not failing when files exist has caused numerous unintentional bugs; avoiding this is one of the reasons the fs package exists.

However, we could possibly add a fail argument like the one that now exists for dir_ls(), which would issue a warning instead of failing if a file exists.
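
Until something like that exists, the warn-and-skip behavior can be approximated in user code. The following is only a minimal sketch: `file_copy_skip()` is a hypothetical helper, not part of fs, and it assumes `new_path` holds full destination file paths (an existing directory passed as a destination would simply be treated as "already exists").

```r
library(fs)

# Hypothetical helper (NOT part of fs): copy only the files whose
# destination does not exist yet, and warn about the ones skipped.
file_copy_skip <- function(path, new_path) {
  stopifnot(length(path) == length(new_path))

  # file_exists() returns a named logical vector; drop the names.
  exists <- unname(file_exists(new_path))

  if (any(exists)) {
    warning(
      "Skipping existing file(s): ",
      paste(new_path[exists], collapse = ", "),
      call. = FALSE
    )
  }

  # Only call file_copy() when there is something left to copy.
  if (any(!exists)) {
    file_copy(path[!exists], new_path[!exists], overwrite = FALSE)
  }

  invisible(new_path)
}
```

Because the existence check and the copy are separate steps, a file created in between would still make `file_copy()` fail, so this is not a replacement for built-in support.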

jimhester avatar Aug 12 '19 12:08 jimhester

I assumed there was a good reason for the current behavior. The option that you suggest would solve my problem, I think. Thanks for the explanation and suggestion!

kstierhoff avatar Aug 12 '19 15:08 kstierhoff

I think this functionality would be really useful. Though I agree it should not be the default.

Either a fail argument, or for example, allow overwrite to take T, F, or "skip".

Is there a reason this was never implemented?

orgadish avatar Oct 20 '22 08:10 orgadish

FWIW, cp on macOS has a -n option that skips existing files. I guess we could implement this, should we have time for it in the future.

It does seem like an anti-pattern, because after running

cp -r -n source target

you cannot be sure that source and target are the same, and this makes reasoning hard.

If the goal is to make target the same as source, then you probably also do not win much time with -n because copying any file within the same file system is an O(1) operation, practically always.

gaborcsardi avatar Oct 20 '22 08:10 gaborcsardi

My primary need for the skipping is precisely because I'm copying from a very slow filesystem to my local machine.

I don't think that the risk of files not being the same means it shouldn't be an option to skip, especially if it's not the default. I think it's ok to put some burden on the end-user. This could be even more explicit if there were a couple options, eg:

  • "skip_path" skips if the path exists.
  • "skip_info" skips if the file metadata is the same (eg from dir_info)
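
To make the difference between the two options concrete, here is a rough sketch of how the decision could be made with existing fs functions. `needs_copy()` and its `mode` values are hypothetical and only mirror the names proposed above. What "same metadata" should mean is also an assumption: here it is taken as equal size plus a target that is at least as new as the source, since a copy does not necessarily keep the original modification time.

```r
library(fs)

# Hypothetical helper (NOT part of fs): returns TRUE for each file that
# should be copied. "skip_path" / "skip_info" are the proposed option
# names, not existing fs arguments.
needs_copy <- function(path, new_path, mode = c("skip_path", "skip_info")) {
  mode <- match.arg(mode)
  stopifnot(length(path) == length(new_path))

  exists <- unname(file_exists(new_path))
  copy <- !exists  # missing targets always need a copy

  if (mode == "skip_info" && any(exists)) {
    src <- file_info(path[exists])
    dst <- file_info(new_path[exists])

    # Assumed definition of "same": equal size, target not older than source.
    same <- as.numeric(src$size) == as.numeric(dst$size) &
      dst$modification_time >= src$modification_time

    copy[exists] <- !same
  }

  copy
}
```

With "skip_path", an existing but stale or truncated target is never refreshed; "skip_info" catches that case at the cost of one extra metadata lookup per file on each side.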

orgadish avatar Oct 20 '22 17:10 orgadish

skip_info would be a great option!

vorpalvorpal avatar Feb 17 '23 00:02 vorpalvorpal

@vorpalvorpal Instead of copying the files locally to avoid dealing with the slow server, I ended up writing a cached_read function (https://github.com/orgadish/cachedread) which reads from the slow server once and writes a file locally. It has an equivalent option to skip_info that checks if the metadata is the same as before to determine whether to read from the local cache file or the server file.

I still think it would be useful to enable a "mirror tree" functionality, though. Perhaps it just warrants a new fs::mirror_dir function, rather than complicating the file_copy API, as @jimhester warned.
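
For illustration, a very small version of that idea can be put together from existing fs functions. This is a sketch only: fs has no `mirror_dir()`, the name `mirror_dir_sketch()` is made up here, and it only adds missing files by path. It does not compare metadata and does not remove files that exist only in the target, so it is not a full mirror.

```r
library(fs)

# Hypothetical sketch (NOT part of fs): copy every file under `source`
# that is missing under `target`, keeping the relative directory layout.
mirror_dir_sketch <- function(source, target) {
  files <- dir_ls(source, recurse = TRUE, type = "file")

  # Map each source file to the same relative location under target.
  dest <- path(target, path_rel(files, start = source))

  todo <- !unname(file_exists(dest))

  if (any(todo)) {
    # Create any missing sub-directories before copying into them.
    dir_create(unique(path_dir(dest[todo])))
    file_copy(files[todo], dest[todo], overwrite = FALSE)
  }

  # Return the paths that were newly created, invisibly.
  invisible(dest[todo])
}
```

Calling `mirror_dir_sketch("slow/server/dir", "local/dir")` a second time would copy nothing, because every destination path already exists.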

orgadish avatar Mar 24 '23 06:03 orgadish