S3 put without loading binary into memory
From what I have seen so far, if I want to put an object into an S3 bucket I need to load it into memory first.
library(paws)

# Create the S3 client with explicit credentials and region.
svc <- paws::s3(
  config = list(
    credentials = list(
      creds = list(
        access_key_id = Sys.getenv('AWS_ACCESS_KEY_ID'),
        secret_access_key = Sys.getenv('AWS_SECRET_ACCESS_KEY')
      )
    ),
    region = Sys.getenv('AWS_DEFAULT_REGION')
  )
)

# Read the whole file into memory, then upload it as the request body.
x <- paste0(readLines(PATH_TO_FILE), collapse = '\n')
res <- svc$put_object(Body = x, Bucket = MY_BUCKET, Key = MY_KEY)
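(Aside: for a truly binary file, readLines() plus paste0() can mangle the bytes; reading the file as a raw vector keeps them intact, though it still loads everything into memory. A sketch, assuming Body accepts a raw vector, which blob parameters in paws generally do:

# Read the file as raw bytes, avoiding the newline round-trip.
body <- readBin(PATH_TO_FILE, what = 'raw', n = file.size(PATH_TO_FILE))
res <- svc$put_object(Body = body, Bucket = MY_BUCKET, Key = MY_KEY)
)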
Is there a way to give put_* a path instead of an object?
Apologies in advance if this is documented somewhere.
I have used something similar to this in the past. https://github.com/cloudyr/aws.s3/blob/master/R/put_object.R
Unfortunately we don't have that yet. We're working on it, but I can't say when it'll be available.
ok. thank you
It's too bad this isn't really documented either. I've spent quite a lot of time trying to save a parquet file into a bucket. It also looks like you unassigned yourself, @davidkretch - does that mean it won't be implemented?
I can't say when it's going to happen unfortunately. The person who was originally working on this started a new job. I thought I had the time for it but it has turned out not to be the case recently :-(. I'll see if I can dig up an example at least this weekend though.
Hello @yonicd and @DavidArenburg, sorry for the delay. There is now an example of using multipart upload to read in and upload files in 5 MB parts here: examples/s3_multipart_upload.R.
This may eventually be added to the package but I'm not sure yet. Please let me know if you have any issues. Thank you!
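For reference, here is a minimal sketch of that approach, assuming the standard paws multipart operations (create_multipart_upload, upload_part, complete_multipart_upload); the function name multipart_upload and its arguments are illustrative, and the linked example is the authoritative version:

library(paws)

# Upload the file at `path` to s3://bucket/key in part_size chunks, without
# reading the whole file into memory at once. Error handling (e.g. calling
# abort_multipart_upload on failure) is omitted for brevity. S3 requires
# every part except the last to be at least 5 MB.
multipart_upload <- function(svc, path, bucket, key, part_size = 5 * 1024^2) {
  upload <- svc$create_multipart_upload(Bucket = bucket, Key = key)
  con <- file(path, open = 'rb')
  on.exit(close(con))
  parts <- list()
  part_number <- 1
  repeat {
    body <- readBin(con, what = 'raw', n = part_size)  # read at most one part
    if (length(body) == 0) break                       # end of file
    part <- svc$upload_part(
      Body = body, Bucket = bucket, Key = key,
      PartNumber = part_number, UploadId = upload$UploadId
    )
    parts[[part_number]] <- list(ETag = part$ETag, PartNumber = part_number)
    part_number <- part_number + 1
  }
  # S3 assembles the uploaded parts into the final object.
  svc$complete_multipart_upload(
    Bucket = bucket, Key = key,
    MultipartUpload = list(Parts = parts),
    UploadId = upload$UploadId
  )
}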
What if Body could also be an object returned by httr::upload_file()? This doesn't help for very large files that still need multipart uploads, but does eliminate streaming data through R.
(I'm also happy to look into implementing this, but I'd need some guidelines on how you think about R specific extensions/wrappers to aws APIs)
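For concreteness, the proposed call might look something like this (purely hypothetical, not something paws supports today; httr::upload_file() just records the path and content type, so the HTTP layer could stream the file from disk rather than through R):

# Hypothetical usage: NOT supported by paws as of this thread.
res <- svc$put_object(
  Body = httr::upload_file(PATH_TO_FILE),
  Bucket = MY_BUCKET,
  Key = MY_KEY
)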