hackney
Current chunked encoding implementation triggers extensive copying
When hackney fetches a chunked response, it uses the very same binary to accumulate incoming data and to match for the chunk size. This is a clearly documented anti-pattern that causes extensive copying and produces enormous amounts of garbage.
In this particular case, for chunks that are much bigger than the MTU (and therefore bigger than the binaries delivered by gen_tcp), this can happen many times per chunk. It caused our application to consume tens of gigabytes of RAM while fetching mere hundreds of megabytes from an upstream service. Besides, this behavior is likely the cause of #77.
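To illustrate the shape of the problem (hypothetical code, not hackney's actual source): each TCP packet gets appended to the same accumulator binary, which is then re-matched from the start. Since the accumulator has already been used in a match, the VM cannot append to it in place and copies the whole growing buffer on every packet.

```erlang
%% Hypothetical sketch of the accumulate-and-rematch pattern; not
%% hackney's actual code.
recv_chunk(Socket, Buffer) ->
    {ok, Packet} = gen_tcp:recv(Socket, 0),
    %% Buffer was matched below on the previous iteration, so this
    %% append cannot be done in place: the whole buffer is copied.
    Buffer1 = <<Buffer/binary, Packet/binary>>,
    case parse_chunk(Buffer1) of
        {ok, Chunk, Rest} -> {Chunk, Rest};
        more              -> recv_chunk(Socket, Buffer1)  %% rescan next time
    end.

parse_chunk(Buffer) ->
    case binary:split(Buffer, <<"\r\n">>) of
        [SizeHex, Rest0] ->
            Size = binary_to_integer(SizeHex, 16),
            case Rest0 of
                <<Chunk:Size/binary, "\r\n", Rest/binary>> -> {ok, Chunk, Rest};
                _ -> more
            end;
        [_] -> more
    end.
```

For a chunk spanning N TCP packets this copies the buffer N times, i.e. O(N²) bytes moved, which matches the memory blow-up we observed.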
I suggest introducing an additional "inside chunk" state to avoid peeking inside the data binary every time. My initial intention was to send a pull request, but I wasn't able to find a test that covers that part of the code, so I decided to simply file this issue.
For now, we solved the issue by switching to ibrowse, which does not exhibit the described problem.
In which case can I reproduce it?
I'm also not sure what you mean by "inside chunk". Can you elaborate?
Anyway, I'm not sure it is that clearly documented ;) I think a better way, instead of parsing the whole binary, would be to wait for the end of the line using the socket options, so we wouldn't have to accumulate into a binary. However, that wouldn't allow sending a partial chunk.
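Something roughly like this (just a sketch: it assumes a passive, binary-mode socket and ignores chunk extensions and trailers):

```erlang
%% Sketch only: assumes the socket was opened with the `binary' option
%% and runs in passive mode.
read_chunk(Socket) ->
    %% line mode: the transport hands us the size line on its own,
    %% so we never accumulate or rescan a buffer ourselves
    ok = inet:setopts(Socket, [{packet, line}, {active, false}]),
    {ok, SizeLine} = gen_tcp:recv(Socket, 0),             %% e.g. <<"2000\r\n">>
    Size = binary_to_integer(string:trim(SizeLine), 16),
    %% raw mode: read exactly the chunk body plus its trailing CRLF
    ok = inet:setopts(Socket, [{packet, raw}]),
    {ok, Data} = gen_tcp:recv(Socket, Size + 2),
    <<Body:Size/binary, "\r\n">> = Data,
    Body.
```

The downside is that gen_tcp:recv/2 with an explicit length only returns once the whole chunk has arrived, so we couldn't stream a partial chunk to the caller.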
I'm curious about the patch you have in mind. Can you at least paste it here if you don't feel comfortable sending a PR?
In which case can I reproduce it?
The setup described in #379 is pretty much how I ran into this.
I'm curious about the patch you have in mind. Can you at least paste it here if you don't feel comfortable sending a PR?
Currently te_chunked does not use its "state" (like te_identity does, for instance). Instead of repeatedly inspecting the contents of the binary, the number of bytes still to be received could be kept in this internal state. That way, binary matching would only be necessary for the first and the last packets of a chunk. It is also compatible with the chunk-splitting approach that I described in #379.
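Roughly what I have in mind (a minimal sketch with names of my own choosing, not a patch against te_chunked; it ignores the terminating zero-size chunk, trailers, and a trailing CRLF split across packets):

```erlang
%% State is {chunk_size, PartialLine} while waiting for a size line, or
%% {in_chunk, BytesLeft} while inside a chunk body. Only the short size
%% line is ever accumulated and re-matched; body bytes pass through.
decode(Data, State) -> decode(Data, State, []).

decode(Data, {chunk_size, Buf}, Acc) ->
    case binary:split(<<Buf/binary, Data/binary>>, <<"\r\n">>) of
        [SizeHex, Rest] ->
            decode(Rest, {in_chunk, binary_to_integer(SizeHex, 16)}, Acc);
        [Partial] ->
            {lists:reverse(Acc), {chunk_size, Partial}}  %% size line incomplete
    end;
decode(Data, {in_chunk, Left}, Acc) when byte_size(Data) < Left ->
    %% the whole packet lies inside the current chunk: emit it untouched
    %% and just count down -- no matching, no copying
    {lists:reverse([Data | Acc]), {in_chunk, Left - byte_size(Data)}};
decode(Data, {in_chunk, Left}, Acc) ->
    %% chunk boundary: split off the rest of the body and its CRLF
    <<Body:Left/binary, "\r\n", Rest/binary>> = Data,
    decode(Rest, {chunk_size, <<>>}, [Body | Acc]).
```

Calling decode(Packet, {chunk_size, <<>>}) on the first packet and threading the returned state through the following packets yields the chunk pieces as they arrive; packets that fall entirely inside a chunk are never matched against at all.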