s3
Efficiently writing s3 object to file
s3 really needs a way to stream content to a file. Calling .content on a large object loads the whole thing into memory and kills any Heroku worker it touches (they cap memory at 300 MB).
Something like this would be awesome:
s3obj = s3_bucket.objects.find("my_huge_object.mov")
s3obj.write_to_file("/tmp/my_huge_object.mov")
Especially if it could avoid loading the entire content into memory at once.
This old S3 wrapper has an S3Object.stream:
http://amazon.rubyforge.org/
Maybe we could take some of the code from there?
Yeah, there should indeed be a way to do that when downloading objects. I've recently added upload streaming, and I'll try to take a look at downloads as well. If you have an idea for solving it, feel free to write a patch ;-).
Thanks for the suggestion.
Sounds good. I'm not too good at this type of stuff, but I can try to hack together some code taken from marcel's library. Might make more sense to let you handle it though ;-)
OK, after digging around a bit, it looks like it's pretty easy. You just have to call read_body on the HTTPResponse and give it a block.
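For reference, here is a minimal sketch of that read_body-with-a-block pattern using plain Net::HTTP. It is not the gem's code: `stream_to_file` is a hypothetical helper name, and it assumes the object is reachable at a plain URL (no S3 request signing).

```ruby
require "net/http"
require "uri"

# Hypothetical helper (not part of the s3 gem): downloads `url` to `path`
# chunk by chunk, so the full body is never held in memory.
# Returns the number of bytes written.
def stream_to_file(url, path)
  uri = URI(url)
  bytes = 0
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    # With a block, request_get yields the response before the body is read.
    http.request_get(uri.request_uri) do |response|
      response.value # raises unless the status is 2xx
      File.open(path, "wb") do |file|
        # read_body with a block yields the body in segments as they arrive.
        response.read_body do |chunk|
          bytes += file.write(chunk)
        end
      end
    end
  end
  bytes
end
```

The important part is that response.body is never called; once the body has been read into a string, read_body can no longer stream it.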
So the change would be in parse_headers: it should stop calling response.body explicitly, since that belongs in the content accessor.
Then we can add a stream_content accessor that takes a block, which is passed along to HTTPResponse#read_body.
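Roughly what I have in mind, as a self-contained sketch. `StreamableObject` is a hypothetical stand-in for the gem's object class and fetches from a plain URL (an assumption, no S3 signing); only the shape of the two accessors matters.

```ruby
require "net/http"
require "uri"

# Hypothetical stand-in for the gem's object class, for illustration only.
class StreamableObject
  def initialize(url)
    @uri = URI(url)
  end

  # Buffers the whole body in memory, and only when first asked for.
  def content
    @content ||= begin
      buffer = String.new # empty binary string
      stream_content { |chunk| buffer << chunk }
      buffer
    end
  end

  # Yields the body to the block in chunks without buffering it.
  def stream_content(&block)
    raise ArgumentError, "block required" unless block
    Net::HTTP.start(@uri.host, @uri.port, use_ssl: @uri.scheme == "https") do |http|
      http.request_get(@uri.request_uri) do |response|
        response.value # raises unless the status is 2xx
        response.read_body(&block)
      end
    end
    nil
  end
end
```

Writing to a file would then be a matter of opening the file in "wb" mode and calling file.write(chunk) inside the stream_content block.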
Piece of cake, I'll submit a patch.
Sounds great! :-)
Up ! :)