
s3sync() download with prefix fails

Open leonidliu opened this issue 4 years ago • 3 comments

Possible bug. I'm trying to download with s3sync() (direction = "download") while supplying a prefix. The download fails on the first 'file' it tries to fetch, whose name (after the prefix is stripped) appears to be an empty string.

Code:

library(aws.s3)
s3sync(bucket = "tmc-research-projects",
       prefix = "p039_fa_meta/",
       path = "~/Downloads/p039_fa_meta",
       direction = "download")

Last few lines of output, including the error:

names after prefix filter:
 [1] ""                                  "10/"                              
 [3] "10/civis/"                         "10/civis/civis_20201011.csv"      
 [5] "10/swayable/"                      "10/swayable/swayable_20201011.csv"
 [7] "civis_addl_subgroups.csv"          "civis_data.csv"                   
 [9] "recoded_tags.csv"                  "swayable_data.csv"                
10 bucket objects not found in local directory
<== Saving object 'p039_fa_meta/' to '~/Downloads/p039_fa_meta/'
Error in curl::curl_fetch_disk(url, x$path, handle = handle) : 
  Failed to open file /Users/leoliu/Downloads/p039_fa_meta.

Traceback:

> traceback()
8: curl::curl_fetch_disk(url, x$path, handle = handle)
7: request_fetch.write_disk(req$output, req$url, handle)
6: request_fetch(req$output, req$url, handle)
5: request_perform(req, hu$handle$handle)
4: httr::GET(url, H, query = query, write_disk, show_progress, ...)
3: s3HTTP(verb = "GET", bucket = bucket, path = paste0("/", object), 
       headers = headers, write_disk = httr::write_disk(path = file, 
           overwrite = overwrite), ...)
2: save_object(object = key, bucket = bucket, file = dst, ...)
1: s3sync(bucket = "tmc-research-projects", prefix = "p039_fa_meta/", 
       path = "~/Downloads/p039_fa_meta", direction = "download")

Session Info:

> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] aws.s3_0.3.22

loaded via a namespace (and not attached):
[1] httr_1.4.2          compiler_4.0.2      R6_2.4.1            tools_4.0.2        
[5] base64enc_0.1-3     curl_4.3            aws.signature_0.6.0 xml2_1.3.2         
[9] digest_0.6.27      

leonidliu avatar Apr 23 '21 18:04 leonidliu

Did you create the p039_fa_meta/ prefix manually in the console? I have encountered similar errors, and I think they happen because S3 has to create an empty file as a placeholder so that the 'directory' can exist in S3 without any files in it. Deleting that empty file fixed the error for me.

wdwatkins avatar Apr 25 '22 18:04 wdwatkins

Since this is avoidable by deleting the empty file, or by uploading your files via an S3 client rather than creating directories manually, you could argue this isn't a high priority to fix in the package. A fix would still be nice in some cases, however.

If you want to reproduce this yourself: 1) create a "directory" in the AWS console, 2) try to sync with s3sync(). You should see the error about a file path ending with a period, and the empty file in the bucket contents.
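
If deleting the placeholder is not an option, one possible client-side workaround is to list the keys yourself, skip every key that ends in "/" (the zero-byte "directory" objects), and fetch the rest one by one. This is only a sketch: drop_placeholder_keys() and download_prefix() are hypothetical helper names, not part of aws.s3, and it assumes get_bucket_df() returns the object keys in a Key column. It has not been tested against the bucket from this issue.

```r
# Keys ending in "/" are the zero-byte placeholders that the AWS console
# creates for "directories"; they have no file content to download.
drop_placeholder_keys <- function(keys) {
  keys <- as.character(keys)
  keys[!grepl("/$", keys)]
}

# Hypothetical helper (not part of aws.s3): download everything under
# `prefix` into `path`, skipping placeholder keys.
# Assumes aws.s3 is installed and AWS credentials are configured.
download_prefix <- function(bucket, prefix, path) {
  listing <- aws.s3::get_bucket_df(bucket = bucket, prefix = prefix, max = Inf)
  keys <- drop_placeholder_keys(listing$Key)
  for (key in keys) {
    # Strip the prefix to get the path relative to the local directory.
    rel <- substring(key, nchar(prefix) + 1)
    dst <- file.path(path.expand(path), rel)
    # Create nested local directories such as 10/civis/ as needed.
    dir.create(dirname(dst), recursive = TRUE, showWarnings = FALSE)
    aws.s3::save_object(object = key, bucket = bucket, file = dst)
  }
  invisible(keys)
}
```

With the values from the original report, the call would be download_prefix("tmc-research-projects", "p039_fa_meta/", "~/Downloads/p039_fa_meta"). Unlike s3sync(), this does not compare against existing local files; it simply re-downloads every object under the prefix.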

wdwatkins avatar Apr 25 '22 23:04 wdwatkins

I can confirm that this happened to me as well, with a bucket whose 'directories' were created through the console. I believe it would be worth fixing, and I would be happy to contribute a fix if this project is still active!

jesse-ross avatar Sep 07 '23 23:09 jesse-ross