gcsfs
Confusion over paths in put(recursive=True)
I'm facing some confusion over how paths are handled in fs.put(recursive=True). In addition, the upload location seems different depending on whether the remote path exists.
If I run this code from ~/workspace/project:
import gcsfs

fs = gcsfs.GCSFileSystem()
fs.put("./data/transformed/", "gs://project/data/transformed/", recursive=True)
When I run this before any files exist in that bucket, the files are created at gs://project/data/transformed/orkspace/project/data/transformed/. The path is repeated, and workspace is missing its first character.
Once the folder exists, I get this error, as though the upload is using the whole local path, starting with Users, as the destination:
ValueError: Bad Request: https://www.googleapis.com/upload/storage/v1/b/Users/o
Invalid bucket name: 'Users'
My intention is to load the files to gs://project/data/transformed/.
I tried changing the second argument to gs://project/, which stopped the path from being repeated, but the upload still used the full relative path minus its leading character, and the second call still raised the error.
Let me know if I'm making a basic mistake, thank you.
You should not use relative paths; fsspec is not a shell language. That said, I suppose it would be reasonable to add abspath to the code of get/put.
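As a minimal sketch of the advice above, the local path can be resolved to an absolute one before it is handed to put. The helper name put_tree is hypothetical, and whether this removes the path duplication reported in this issue is an assumption that has not been verified here; fs is assumed to be any fsspec-style filesystem object such as gcsfs.GCSFileSystem().

```python
import os


def put_tree(fs, local_dir, remote_dir):
    """Upload local_dir recursively after resolving it to an absolute path.

    put_tree is a hypothetical helper, not part of gcsfs or fsspec.
    fs is assumed to expose put(lpath, rpath, recursive=...).
    """
    # expanduser handles "~"; abspath resolves "." and ".." against the
    # current working directory. Note that abspath also drops any
    # trailing slash from the path.
    local_abs = os.path.abspath(os.path.expanduser(local_dir))
    fs.put(local_abs, remote_dir, recursive=True)
    return local_abs
```

With the original example, this would be called as put_tree(fs, "./data/transformed/", "gs://project/data/transformed/").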
Also, it is consistent that the behaviour would be different depending on whether the target path is an existing directory or not; cp -r (POSIX) does the same thing. I must admit that I've been annoyed by that many times before now.
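The cp -r rule mentioned above can be modelled in a few lines. This is only an illustration of the general rule, assuming paths without trailing slashes; the function name cp_r_destination is hypothetical, and it does not claim to reproduce the exact logic inside gcsfs or fsspec.

```python
import posixpath


def cp_r_destination(src, dst, dst_is_existing_dir):
    """Return where `cp -r src dst` places the copied directory.

    Hypothetical model of the POSIX rule; assumes neither path ends in "/".
    """
    if dst_is_existing_dir:
        # dst already exists: src is copied *into* it under its own name.
        return posixpath.join(dst, posixpath.basename(src))
    # dst does not exist yet: dst itself becomes the copy of src.
    return dst


# Same command, two different results depending on the state of the target.
first_run = cp_r_destination("data/transformed", "backup", False)
second_run = cp_r_destination("data/transformed", "backup", True)
```

Here first_run is "backup" and second_run is "backup/transformed", which is why running the same recursive copy twice can land the files in two different places.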
Also, it is consistent that the behaviour would be different depending on whether the target path is an existing directory or not; cp -r (POSIX) does the same thing. I must admit that I've been annoyed by that many times before now.
👍 , good point
You should not use relative paths; fsspec is not a shell language. That said, I suppose it would be reasonable to add abspath to the code of get/put.
OK, I do often use relative paths in Python, and the example works well with relative paths without recursive=True.