Unbounded RAM usage can exhaust the system

Open paulmillr opened this issue 1 year ago • 6 comments

RAM usage seems to be unbounded with this module. RAM requirements need to be documented somewhere, because crashing apps is not OK.

paulmillr avatar Jul 11 '24 01:07 paulmillr

fdir just crawls the directories, and yes, if you have an insane number of items it may end up OOM-ing your app. What's the suggested outcome here?

thecodrr avatar Jul 11 '24 05:07 thecodrr

readdirp has limits on RAM usage due to its architecture. It can't exceed X.

paulmillr avatar Jul 11 '24 11:07 paulmillr

readdirp will just emit files/dirs, right? So it won't really use much memory, as it isn't building up a list of found paths.

if the user turns the result into an array, they will hit the same issue fdir has.

you could possibly set maxFiles to limit the array size, so you won't hit any OOM issues.
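For example, a minimal sketch, assuming fdir's builder exposes a withMaxFiles option and the promise-based crawl API:

```ts
// Minimal sketch: cap the number of collected paths so the result
// array stays bounded. Assumes fdir's withMaxFiles builder option.
import { fdir } from "fdir";

const files = await new fdir()
  .withMaxFiles(100_000) // stop collecting after 100k entries
  .crawl("/some/large/directory")
  .withPromise();

console.log(`collected ${files.length} paths (bounded)`);
```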

alternatively, maybe we could introduce an option to pass a callback which is called each time we visit a path, and skip building up the internal set of paths (which means you couldn't use group etc.)
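That option doesn't exist yet, but the behaviour it describes (handling each path as it's visited, without accumulating anything) can be approximated today with Node's built-in lazy directory iterator. A rough sketch, assuming Node >= 20.1 for the recursive option:

```ts
// Rough sketch of the callback-per-path idea using Node's built-in
// lazy directory iterator; nothing accumulates in memory.
// Assumes Node >= 20.1 for opendir's recursive option.
import { opendir } from "node:fs/promises";

const dir = await opendir("/some/large/directory", { recursive: true });
for await (const entry of dir) {
  // handle each entry here as it is visited
  if (entry.isFile()) console.log(entry.name);
}
```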

43081j avatar Aug 21 '24 11:08 43081j

Well, stream APIs exist for a reason. Users don't always want an array of 400K files, etc.
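Something like this, as a minimal sketch (assuming readdirp v3's default export and async-iterable stream interface):

```ts
// Minimal sketch: consume entries one at a time with readdirp's
// stream interface, so memory stays flat even for 400K files.
// Assumes readdirp v3's default export and entry.path field.
import readdirp from "readdirp";

let total = 0;
for await (const entry of readdirp("/some/large/directory")) {
  total += entry.path.length; // process entry.path here instead of collecting it
}
console.log(`visited all entries without holding them in memory (${total} chars)`);
```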

paulmillr avatar Aug 21 '24 11:08 paulmillr

Yes, I think I'm completely missing the point of this package. It's fast, so presumably you see the most benefit when crawling large directories... but then it returns millions of files without any way to process the data using a stream or callback API?

Again, not seeing the point at all. It's nice if you only have a few files, I guess, but then why bother with a third-party dependency? Just use readdir.

laurent22 avatar Feb 26 '25 21:02 laurent22

> Again, not seeing the point at all. It's nice if you only have a few files, I guess, but then why bother with a third-party dependency? Just use readdir.

readdir is pretty slow, so... if you don't care about performance, then sure.

fdir shouldn't give you memory problems unless you are crawling trillions of files (which to be honest is only an edge case). Actually, try crawling the root directory on your system and see if you run out of memory.
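A quick way to run that experiment, as a sketch (assuming fdir's default behaviour of suppressing permission errors, so unreadable system paths are skipped):

```ts
// Sketch of the suggested experiment: crawl the filesystem root and
// report how much heap the resulting path array actually uses.
// Assumes fdir suppresses permission errors by default.
import { fdir } from "fdir";

const files = await new fdir()
  .withFullPaths()
  .crawl("/")
  .withPromise();

const heapMB = process.memoryUsage().heapUsed / 1024 / 1024;
console.log(`${files.length} paths, heapUsed ~ ${heapMB.toFixed(0)} MB`);
```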

> any way to process the data using a stream or callback API?

Stream APIs are inherently slow. I tested yield performance and it was particularly wanting. The point of fdir is to crawl and give you all the files in a directory fast. What you do afterwards with the paths is up to you.
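To illustrate the overhead being described, a rough micro-benchmark sketch (numbers vary by runtime; this compares plain array building against generator yields, not fdir's actual internals):

```ts
// Rough micro-benchmark: building an array of N strings vs. yielding
// them from a generator. Illustrative only; not fdir's actual code.
const N = 5_000_000;

function* gen(): Generator<string> {
  for (let i = 0; i < N; i++) yield "path/" + i;
}

let t = performance.now();
const arr: string[] = [];
for (let i = 0; i < N; i++) arr.push("path/" + i);
console.log(`array push:      ${(performance.now() - t).toFixed(0)} ms`);

t = performance.now();
let chars = 0;
for (const s of gen()) chars += s.length;
console.log(`generator yield: ${(performance.now() - t).toFixed(0)} ms`);
```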

It's technically possible to run out of memory with fdir, but you'll have other concerns by the point at which that becomes an issue.

thecodrr avatar Feb 27 '25 16:02 thecodrr