amazonka

Apparent leak when sending repeated requests to Glacier

Open LeifW opened this issue 6 years ago • 6 comments

To be able to upload arbitrarily large archives to AWS Glacier, I wrapped its multipart upload in a Conduit of ByteStrings. However, the upload still uses more memory at once than the size of the entire archive. The culprit seems to be the send $ uploadMultipartPart ... call that runs inside a Conduit.mapM on the conduit: http://hackage.haskell.org/package/amazonka-glacier-1.6.0/docs/Network-AWS-Glacier-UploadMultipartPart.html

If I replace uploadMultipartPart uploadId byteRange checksum chunk with:

liftIO $ BS.appendFile "/tmp/out" chunk

or even:

liftIO $ runResourceT $ runReaderT (uploadMultipartPart uploadId byteRange checksum chunk) (GlacierEnv env glacierSettings)

the ballooning memory goes away.

LeifW avatar Jun 27 '18 03:06 LeifW

This also happens to me when using the ByteString conduit in amazonka-s3-streaming.

LeifW avatar Jun 27 '18 04:06 LeifW

Looking at https://github.com/snoyberg/http-client/blob/9eb92877641db53efa179ea871a51d32989c6f52/http-conduit/Network/HTTP/Conduit.hs#L315, it looks like the result of Client.responseOpen is freed when you call responseBody on the response, or when you runResourceT. I don't know if responseBody is being called on these responses in Amazonka, but wrapping every request in its own runResourceT fixes the leak. So is the answer to avoid having AWSConstraint or MonadAWS floating around my application, and just wrap each request in runResourceT immediately?
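To illustrate why the placement of runResourceT matters, here is a minimal sketch that uses only the resourcet package, not Amazonka. fakeRequest, peakShared and peakPerRequest are hypothetical names invented for this example; fakeRequest stands in for a request that registers a resource which is only released when the enclosing ResourceT block exits. Whether Amazonka's responses behave exactly this way is an assumption, not something confirmed here.

```haskell
import Control.Monad (forM_)
import Control.Monad.Trans.Resource (ResourceT, allocate, runResourceT)
import Data.IORef (IORef, atomicModifyIORef', modifyIORef', newIORef, readIORef)

-- | Hypothetical stand-in for one request. It registers a resource that
-- stays alive until the enclosing ResourceT block exits, and records the
-- peak number of simultaneously live resources.
fakeRequest :: IORef Int -> IORef Int -> ResourceT IO ()
fakeRequest live peak = do
  _ <- allocate acquire (\_ -> modifyIORef' live (subtract 1))
  pure ()
  where
    acquire = do
      n <- atomicModifyIORef' live (\x -> (x + 1, x + 1))
      modifyIORef' peak (max n)

-- | One ResourceT scope around the whole loop: nothing is released
-- until every request has run.
peakShared :: Int -> IO Int
peakShared n = do
  live <- newIORef 0
  peak <- newIORef 0
  runResourceT $ forM_ [1 .. n] $ \_ -> fakeRequest live peak
  readIORef peak

-- | One ResourceT scope per request: each resource is released before
-- the next request starts.
peakPerRequest :: Int -> IO Int
peakPerRequest n = do
  live <- newIORef 0
  peak <- newIORef 0
  forM_ [1 .. n] $ \_ -> runResourceT (fakeRequest live peak)
  readIORef peak

main :: IO ()
main = do
  shared <- peakShared 5
  scoped <- peakPerRequest 5
  putStrLn ("shared scope peak: " ++ show shared)      -- 5
  putStrLn ("per-request scope peak: " ++ show scoped) -- 1
```

With a single outer scope, the number of live resources grows with the number of requests, while the per-request scope keeps it at one. That matches the symptom described above, although it does not show which resource Amazonka is actually holding on to.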

LeifW avatar Jun 27 '18 06:06 LeifW

It looks like UploadMultipartPartResponse calls receiveEmpty. Other responses that call receiveXML or receiveJSON call responseBody, but I don't see receiveEmpty doing so. Of course, the responses to these calls shouldn't be sizeable, or have any body for that matter - you'd think it'd be the requests that are leaking space if anything.

LeifW avatar Jun 27 '18 06:06 LeifW

Other efforts to use Conduits with multipart uploads also seem to leak memory: https://github.com/axman6/amazonka-s3-streaming/pull/22

Seems like there's no obvious easy fix, so this can go on the "post 2.0" pile.

endgame avatar Oct 04 '21 01:10 endgame

#523 might be another related laziness bug; I suspect we might have to force something during send.

https://www.snoyman.com/blog/2017/01/foldable-mapm-maybe-and-recursive-functions/ makes me suspect we might want/need to provide a "force and ignore result" variant of send.
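As a rough sketch of what a "force and ignore result" variant could look like, the following uses only base. sendForced_, lazySender and wasForced are hypothetical names; lazySender is a stand-in for send and not Amazonka's API. The wrapper evaluates the response to weak head normal form and then discards it, so the caller never retains a thunk that refers back to the request or response.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import Control.Monad (forM_, void)

-- | Hypothetical "force and ignore" wrapper around any send-like function:
-- run the request, evaluate the response to WHNF, then discard it.
sendForced_ :: (req -> IO resp) -> req -> IO ()
sendForced_ sender req = void (sender req >>= evaluate)

-- | Hypothetical stand-in for send that returns a lazily computed response.
-- A negative input yields a thunk that throws only when it is forced.
lazySender :: Int -> IO Int
lazySender n = pure (if n < 0 then error "thunk was forced" else n * 2)

-- | Shows that sendForced_ really forces the response: the error hidden in
-- the thunk surfaces as an exception instead of being silently dropped.
wasForced :: IO Bool
wasForced = do
  r <- try (sendForced_ lazySender (-1)) :: IO (Either ErrorCall ())
  pure (either (const True) (const False) r)

main :: IO ()
main = do
  forM_ [1, 2, 3] (sendForced_ lazySender) -- results forced, then dropped
  forced <- wasForced
  print forced -- True
```

evaluate only reaches weak head normal form; a response with lazy fields would need something like force from the deepseq package, which requires an NFData instance. Whether forcing alone would fix this leak is untested here.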

endgame avatar Apr 17 '24 07:04 endgame