
Excessive memory usage

Open · hading opened this issue 6 years ago · 9 comments

This is with nginx 1.8.1 and mod_zip 1.1.6.

I'm trying to download a zip package of around 170,000 files totaling about 100GB.

I use the workaround I described in issue #42 because of the large number of files we sometimes deal with, but this doesn't seem related.

Watching under top, nginx just uses all the memory on the machine. I tried this on a machine with as much as 160GB of memory, and nginx used it all in less than a minute. This seems to happen while it is dealing with the manifest file; the second nginx server in the above workaround never gets hit before memory is exhausted.

I don't see anything in the configuration that I'd expect to cause this problem. I'll attach a lightly scrubbed version:

nginx.conf.txt

From some further trials, the problem seems to be tied to the number of files going into the zip, not the overall size in bytes.

When I was developing the code that uses this (beginning of 2016) I had experimented with some even larger (in terms of number of files) downloads than this, and they were fine, but I don't really have good records to suggest what might be different.

Honestly I'm at a loss to think how nginx even could consume that much memory doing this, but I watched it happen.

hading · Mar 30 '18 17:03

Let me add another piece of data: I was able to do a 40,000 file, 130GB zip on a server with 4GB memory and 4GB swap. So whatever is going on here was not scaling the way I expected. (And even that one still used almost all of the resources of the server, which still seems like a lot.)

hading · Mar 30 '18 17:03

I've been working on my project that uses this and was reminded of the issue.

I wonder if there is something about the particular file list (either its raw size or some peculiarity in it) that is causing the Ragel parser to blow up, which seems possible based on a bit of searching.

I have an example file list that blows it up, but I want to see how far I need to cut it down before it works, to produce a more minimal example (one that still blows up my 4GB mem/4GB swap machine). It is about 170,000 files (essentially the same list I mentioned initially).

hading · May 24 '18 15:05

The file that blows up my machine has the following properties:

hding2$ wc manifest.txt.bad
   12500   86557 7531526 manifest.txt.bad

So it seems like parsing this 7.2MB file is taking over 8GB. Cut down to 12,250 lines, it just barely manages. If I remember, I'll attach the file once the AWS links in it have expired in a week, but it's just a lot of lines like:

- 95308 /long/aws/url?with_lots_of_query_parameters /long/path/to/zip/location

Does the parsing try to do the whole file at once rather than splitting into individual lines on '\r\n' and then parsing each individual line?
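To make the split-then-parse idea concrete, here's a rough standalone sketch (this is not mod_zip's actual code, just an illustration, and it assumes the line layout from the example above: crc-32 or "-", size, location URL, destination name):

```c
/* Sketch only: split the manifest into lines first, then parse each
 * line independently, so no parser state has to span the whole
 * multi-megabyte buffer. */
#include <stdio.h>

static int parse_line(const char *line) {
    char crc[16];
    long long size;
    char url[4096], name[4096];

    /* crc-32 (or "-"), size, URL, destination name, space-separated;
     * the trailing %[^\r\n] lets the destination name contain spaces */
    if (sscanf(line, "%15s %lld %4095s %4095[^\r\n]",
               crc, &size, url, name) != 4)
        return -1;

    printf("crc=%s size=%lld url=%s name=%s\n", crc, size, url, name);
    return 0;
}

int main(void) {
    char line[8192];

    /* read the manifest one line at a time from stdin */
    while (fgets(line, sizeof(line), stdin) != NULL) {
        if (parse_line(line) != 0)
            fprintf(stderr, "bad manifest line: %s", line);
    }
    return 0;
}
```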

hading · May 24 '18 16:05

You can see the parser logic here:

https://github.com/evanmiller/mod_zip/blob/master/ngx_http_zip_parsers.rl

It's possible that the grammar is ambiguous and Ragel is blowing up with internal state (i.e. for backtracking). You might try recompiling the .rl file with other Ragel options, e.g. -G2, and see if that fixes things.
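For anyone trying that, the regeneration would look roughly like this (assuming the generated .c file keeps the same name it has in the repo):

```sh
# regenerate the parser with Ragel's goto-driven code generator (-G2)
# instead of the default table-driven output, then rebuild nginx
ragel -G2 -o ngx_http_zip_parsers.c ngx_http_zip_parsers.rl
```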

evanmiller · Jun 20 '18 17:06

I think what may be going on here is that the URI field allows \r and \n, which forces a lot of backtracking information to be kept.

I'm going to try regenerating things with that possibility removed and then try it on one of my test cases to see if there is an improvement.
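Roughly, the change amounts to something like this (a hypothetical fragment, not the real grammar; it only shows the idea of keeping CR/LF out of the URI character class):

```ragel
# hypothetical sketch, not mod_zip's actual grammar: if the URI field
# is permissive, e.g.
#   uri = any+;
# the scanner may have to keep state to decide later whether a "\r\n"
# was part of the URI or terminated the record. Restricting the class
# removes that ambiguity:
uri = (any - [ \r\n])+;
```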

hading · Jun 20 '18 19:06

No, that was not enough. I'm not really familiar with Ragel, so I'll look at it some more to see if it provides any facilities that I think might help.

hading · Jun 20 '18 19:06

Hi @evanmiller, do you see another possible source of this problem besides the Ragel parser?

nahuel · Jan 29 '20 18:01

I've figured out the bug; it's not the parser.

Every time mod_zip reads a chunk of input (about 8KB), it allocates a new buffer and copies all the old contents into it, but does not free the old buffer: https://github.com/evanmiller/mod_zip/blob/master/ngx_http_zip_module.c#L153-L161

This results in O(N²) memory usage, so even 10MB of input will take up GBs of memory.
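As a rough sanity check against the 7.5MB manifest reported above: at ~8KB per read that's on the order of 900 chunks, and since each chunk copies everything read so far into a fresh allocation that is never freed, the pool ends up holding roughly 900 × 901 / 2 × 8KB, i.e. somewhat over 3GB of dead buffers, which lines up with a 4GB machine being pushed into swap.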

Freeing the old buffer fixes it - though it's still inefficient in terms of CPU. Better yet, just use ngx_array_t, which doubles in size as needed. (It actually still leaks memory, but only O(N)...)
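To make the two growth patterns concrete, here's a minimal standalone sketch (plain malloc/realloc instead of nginx pool allocation; it illustrates the allocation strategies, it is not the actual module code):

```c
#include <stdlib.h>
#include <string.h>

/* Buggy pattern: every chunk allocates a buffer big enough for
 * everything seen so far, copies the old contents, and never frees
 * the old buffer, so total allocation grows as O(N^2). */
static char *append_quadratic(char *old, size_t old_len,
                              const char *chunk, size_t chunk_len) {
    char *buf = malloc(old_len + chunk_len);
    if (buf == NULL) return NULL;
    memcpy(buf, old, old_len);
    memcpy(buf + old_len, chunk, chunk_len);
    /* missing: free(old); in a pool, this memory just piles up */
    return buf;
}

/* Amortized pattern: keep a capacity and double it when needed (the
 * same doubling behavior described for ngx_array_t above), so copies
 * and peak allocation stay O(N). */
typedef struct { char *data; size_t len, cap; } dynbuf;

static int append_doubling(dynbuf *b, const char *chunk, size_t chunk_len) {
    if (b->len + chunk_len > b->cap) {
        size_t cap = b->cap ? b->cap : 4096;
        while (cap < b->len + chunk_len) cap *= 2;
        char *data = realloc(b->data, cap);   /* frees/reuses the old block */
        if (data == NULL) return -1;
        b->data = data;
        b->cap = cap;
    }
    memcpy(b->data + b->len, chunk, chunk_len);
    b->len += chunk_len;
    return 0;
}
```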

Here's a fix: https://github.com/quiltdata/mod_zip/commit/1b6813a030d71ee1279e3cb87e87516875e7c08e. If that looks reasonable, I'll send out a proper PR.

dimaryaz · Aug 05 '22 07:08

@dimaryaz Excellent find! Yes, feel free to open a PR. An array seems like an unusual data structure to hold a string, but it seems like it will work.

evanmiller · Aug 05 '22 11:08