node-fast-download

May use a lot of memory when chunksAtOnce > 1

Open ProtocolNebula opened this issue 6 years ago • 5 comments

If chunksAtOnce is greater than 1, memory usage keeps increasing until the file is downloaded, so for large files the Node process exits with code 137 (out of memory) or similar.

Possible solution: store chunks temporarily in a file instead of RAM and merge them after the download.

ProtocolNebula avatar Jan 09 '19 14:01 ProtocolNebula

@ProtocolNebula how are you profiling this?

zenflow avatar Jan 09 '19 15:01 zenflow

I'm using the example wrapped in a promise and hooked up to a progress bar (in TypeScript).

The only thing that could be "wrong" in my code is:

dl.on('data', function (chunk) {
    try {
        bar.tick(chunk.length);
    } catch (ex) { }
});

I'm trying to download really big files (from 50 MB to 2 GB) onto a Raspberry Pi (1 GB of memory at most). With only 1 chunk set, everything is piped straight to the file, so the app's total memory consumption stays around 50 MB.

I'll try to publish the wrapper to GitHub/npm this week.

ProtocolNebula avatar Jan 09 '19 22:01 ProtocolNebula

OK, so this is what should be happening (and so far there's no reason to believe it isn't working as designed, since the design does potentially consume a lot of RAM):

The leading chunk is streamed right through (nothing buffered into memory), and while that is happening, the following chunks are buffered into memory until the stream reaches their position.

And because of the specific algorithm used, it's possible for many chunks (more than chunksAtOnce) to be buffered in memory when the leading chunk hasn't completed yet but the chunks following it have. For example:

chunksAtOnce = 3

| chunk 1 | chunk 2 | chunk 3 | chunk 4 | chunk 5 | chunk 6 |
|##-------|#########|#########|#########|#####----|###------|

Data received for chunks 2 to 6 is all buffered in memory. Only the data received for chunk 1 has been streamed through.
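To make the RAM cost concrete, here's a toy model of the behavior described above (my own illustration, not code from this package): only the leading chunk streams through to disk, so every byte already downloaded for the chunks behind it is held in memory.

```javascript
// downloadedPerChunk[i] = bytes downloaded so far for chunk i.
// Chunk 0 is the leading chunk and streams straight through;
// everything downloaded for the chunks behind it waits in RAM.
function bufferedBytes(downloadedPerChunk) {
  return downloadedPerChunk.slice(1).reduce((sum, b) => sum + b, 0);
}
```

With the diagram above, even though chunksAtOnce is 3, the completed-but-unflushed data of chunks 2 through 6 all sits in memory at once.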

zenflow avatar Jan 13 '19 19:01 zenflow

Possible solution: store chunks temporarily in a file instead of RAM and merge them after the download.

That's a good idea, except it would be a major revision (almost a rewrite) of this package, and I'm not much interested in continuing major work on this package.

If you want to work on this, my suggestion is to create your own fresh new package, and use as much or as little of the code here as you want.

Or you can use one of the other packages on npm that do the same thing as this package but already buffer downloaded data in the filesystem instead of RAM:

  • https://www.npmjs.com/package/mt-downloader
  • https://www.npmjs.com/package/multipart-download

zenflow avatar Jan 13 '19 19:01 zenflow

I was looking for a package like this but couldn't find one; I'll try those two. Thanks!

ProtocolNebula avatar Jan 13 '19 23:01 ProtocolNebula