hackney

Unexpected `enomem` error during file downloading

Open viralpraxis opened this issue 1 year ago • 2 comments

Hi there!

We tried to migrate from 1.18.1 to 1.20.1 and encountered an increased error rate while downloading huge (~1GB) files.

I've published an MRE: https://gist.github.com/viralpraxis/6209746ee3b108473452ee810678f769 You can run it via `mix run -e 'Runner.start($URI)'`, where $URI points to a file larger than 1GB (serving it with `python -m http.server` is fine).

On 1.18.1:

21:47:47.878 [info] 2: Streaming completed
21:47:47.946 [info] 1: Streaming completed
21:47:47.990 [info] 5: Streaming completed
21:47:48.055 [info] 4: Streaming completed
21:47:48.066 [info] 3: Streaming completed

On 1.20.1:

21:48:12.494 [error] Error while streaming: enomem
21:48:12.494 [error] Error while streaming: enomem
21:48:12.494 [error] Error while streaming: enomem
21:48:12.494 [error] Error while streaming: enomem
21:48:12.494 [error] Error while streaming: enomem

In fact, this error is already reproducible on 1.18.2. I bisected it to https://github.com/benoitc/hackney/commit/5e74354a48653fe2456688f80c6bccb11143f6af (the previous commit, https://github.com/benoitc/hackney/commit/e3872f768a4f0b74c20a03c5e23ea9652d811f0e, works fine).

elixir 1.16.1
erlang 26.2.2

Please let me know if you need any additional information. Thanks!

viralpraxis avatar Jun 20 '24 18:06 viralpraxis

Can you reproduce it on the latest version?

benoitc avatar Jun 24 '25 20:06 benoitc

@benoitc just checked out 1.24.1 -- this is still valid.

viralpraxis avatar Jun 24 '25 20:06 viralpraxis

I can confirm that downgrading to 1.18.1 fixes the enomem issue for me. In my case, it's about downloading a 100 MB file from Azure Blob Storage (azurex -> httpoison -> hackney).

elixir 1.18.4 & erlang 27.3.4.1

zrzka avatar Jul 14 '25 09:07 zrzka

This is interesting ...

  • Before we bumped Hackney to 1.24.1, we had it pinned to eca5fbb1ff2d84facefb2a633e00f6ca16e7ddfd
  • Then we bumped Hackney to 1.24.1, Elixir to 1.18.4 (was 1.16.2) and Erlang to 27.3.4.1 (was 26.2.2) and enomem issue appeared
  • I downgraded Hackney to 1.18.1 and the enomem issue was gone
  • I bumped Hackney to eca5fbb1ff2d84facefb2a633e00f6ca16e7ddfd (basically 1.20.1 + some fixes) and the issue is back

... IOW, the pattern I see here is ...

  • enomem in Elixir 1.18.4 & Erlang 27.3.4.1
  • no enomem in Elixir 1.16.2 & Erlang 26.2.2

zrzka avatar Jul 15 '25 20:07 zrzka

I have also encountered this today. A couple of data points:

  • The file size is 104MiB, so not huge
  • we use ExAWS.S3, which leverages hackney as its default client
  • for some unknown reason, it happens on my Mac, but not on our production (Ubuntu Noble) docker container (yet at least)

For comparison, we are on:

  • Erlang/OTP 27 [erts-15.2.7] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit]
  • Elixir 1.18.4 (compiled with Erlang/OTP 27)
  • hackney 1.24.1 if I'm correct

thbar avatar Jul 17 '25 18:07 thbar

Alright, I took a closer look at the suspicious commit https://github.com/benoitc/hackney/commit/5e74354a48653fe2456688f80c6bccb11143f6af. It seems that Hackney attempts to allocate N bytes of memory when downloading a file of size N. This means that downloading a 1GB file would try to allocate 1GB of memory, which is very likely to fail.

https://github.com/benoitc/hackney/blob/e2bbdf741ee374c872da2baadc7451b66644b421/src/hackney_response.erl#L369

I'm fairly sure it should be removed and a more reasonable buffer size (e.g., 512KB, or possibly just the default zero value) should be used.
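To illustrate the idea of a bounded buffer, here is a minimal Elixir sketch of reading a body of known length in capped chunks instead of requesting the whole remainder in one `recv` call. This is not Hackney's code and not the change proposed in any PR; the module name `CappedRecv`, the function names, and the 512KB cap are all assumptions made for illustration. It assumes a passive-mode (`active: false`), binary-mode `:gen_tcp` socket.

```elixir
defmodule CappedRecv do
  # Hypothetical cap, taken from the 512KB figure suggested above.
  @max_chunk 512 * 1024

  # Number of bytes to request in the next recv call:
  # never more than the cap, never more than what is left.
  def next_length(remaining) when is_integer(remaining) and remaining >= 0 do
    min(remaining, @max_chunk)
  end

  # Read exactly `remaining` bytes from a passive-mode binary socket,
  # handing each chunk to `fun` as it arrives.
  def read_body(_socket, 0, _fun), do: :ok

  def read_body(socket, remaining, fun) when remaining > 0 do
    case :gen_tcp.recv(socket, next_length(remaining)) do
      {:ok, data} ->
        fun.(data)
        read_body(socket, remaining - byte_size(data), fun)

      {:error, reason} ->
        {:error, reason}
    end
  end
end
```

With this shape, the size of a single `recv` request no longer depends on the size of the file being downloaded, only on the cap.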

viralpraxis avatar Jul 17 '25 20:07 viralpraxis

I've opened https://github.com/benoitc/hackney/pull/774. I'm not familiar with the codebase, so I'm not sure if it's correct. But at least it resolves the enomem issue (tested on a 2GB download), and I hope it helps move us toward a valid fix more quickly.

viralpraxis avatar Jul 17 '25 20:07 viralpraxis

It seems that the amount of data is constrained by the TCP C port driver, which refuses requests for packets larger than 64MB:

https://github.com/erlang/otp/blob/359e254aba76c1986b671b45fd320c6cc6720ca8/erts/emulator/drivers/common/inet_drv.c#L1297
https://github.com/erlang/otp/blob/359e254aba76c1986b671b45fd320c6cc6720ca8/erts/emulator/drivers/common/inet_drv.c#L12177-L12178
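A rough way to check this limit without Hackney is to ask a loopback socket for more than 64MB in a single `:gen_tcp.recv/3` call. The sketch below is only an illustration: the module name `RecvLimitDemo` and its function are made up, and it assumes the default `inet` socket backend. If the driver limit described above applies, the first call is expected to fail immediately with `enomem` rather than wait for data, while the small request should simply time out because nothing is ever sent.

```elixir
defmodule RecvLimitDemo do
  # Open a loopback connection, ask the client side for `length` bytes,
  # and return whatever :gen_tcp.recv/3 answers. No data is ever sent.
  def recv_result(length, timeout \\ 1_000) do
    {:ok, listener} =
      :gen_tcp.listen(0, [:binary, active: false, ip: {127, 0, 0, 1}])

    {:ok, port} = :inet.port(listener)

    {:ok, client} =
      :gen_tcp.connect({127, 0, 0, 1}, port, [:binary, active: false])

    {:ok, server} = :gen_tcp.accept(listener, 1_000)

    result = :gen_tcp.recv(client, length, timeout)

    Enum.each([client, server, listener], &:gen_tcp.close/1)
    result
  end
end

# A request well above 64MB, then a small one for comparison.
IO.inspect(RecvLimitDemo.recv_result(128 * 1024 * 1024), label: "recv 128 MiB")
IO.inspect(RecvLimitDemo.recv_result(1024), label: "recv 1 KiB")
```

Comparing the two results (and how quickly each call returns) should show whether the error comes from the size of the request itself rather than from the network or the remote server.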

bszaf avatar Jul 21 '25 11:07 bszaf

Seeing

%HTTPoison.Error{reason: :enomem, id: nil}

when GETting a ~75MB file from Google Storage bucket.

elixir 1.18.4, erlang 28.0.1, hackney 1.24.1

grzuy avatar Jul 25 '25 20:07 grzuy

Yeah, apparently this happens to any file >= 64MB. I believe https://github.com/benoitc/hackney/pull/774 should fix that.

viralpraxis avatar Jul 28 '25 13:07 viralpraxis

For what it's worth, there is another pull request from a while ago (https://github.com/benoitc/hackney/pull/746) that also tries to fix some apparently buggy behavior coming from the same original "Body parsing optimization" (https://github.com/benoitc/hackney/pull/710).

I wonder whether that also fixes this enomem issue or not...

grzuy avatar Jul 28 '25 16:07 grzuy