coreutils
excessive memory usage in more
Running it over a test file of about 1G in size results in 15G RSS.
The problems start with slurping the entire file upfront, but I don't know how it manages to amass a 15x memory overhead on top of that.
more from gnu coreutils sits at around 2MB.
How to repro:
- generate a file of about 1G:
perl -e 'print "meh\n" x (256 * 1024 * 1024)' > testfile
cargo run -r testfile
note that we have been focusing on compatibility, not perf or memory usage for now (even if some programs are already doing better than GNU's)
True, but I also agree that 15x the filesize is a bit excessive 😄
Even 100% would be excessive: memory consumption shouldn't scale with the file size at all.
It's also just extremely slow to load such a file. I'm looking into making this at least a bit better.
> note that we have been focusing on compatibility, not perf or memory usage for now (even if some programs are already doing better than GNU's)
For coreutils, sensible memory usage is part of correctness, especially for a tool like more.
For example, say I log in to a production machine with the intent of running more /var/log/log_file_sized_1G_or_more. Suppose the explosive memory usage problem is fixed, but the entire file is still slurped upfront, resulting in 1G RSS.
If the machine I'm running this on does not have 1G free, we are in trouble. And chances are things are already going wrong, which is why I'm logging in to poke around (including reading logs) in the first place.
The GNU coreutils variant uses a few MB of RAM, which makes it a perfectly viable tool in that scenario.
That is to say, this code needs to deal with the file in chunks, and at the moment it is not fit for use whatsoever. Any work done on it that would be thrown away while making that adjustment is a waste of time, which would be better spent either reworking this or beating other tools into shape in a lasting manner.
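To make "deal with the file in chunks" concrete, here is a minimal sketch of reading one screenful at a time instead of loading the whole file. It is not the code in this repository: `read_page` is a hypothetical helper, it assumes UTF-8 input with reasonably short lines, and it only covers forward paging (scrolling back would additionally need remembered byte offsets and seeking).

```rust
use std::io::{self, BufRead};

/// Hypothetical helper (not part of uutils): reads at most `rows` lines
/// from `reader` and returns them without their line endings.
/// Returns fewer lines, possibly none, once the input is exhausted.
pub fn read_page<R: BufRead>(reader: &mut R, rows: usize) -> io::Result<Vec<String>> {
    let mut page = Vec::with_capacity(rows);
    let mut line = String::new();
    while page.len() < rows {
        line.clear();
        // `read_line` returns Ok(0) at end of input and an error on
        // invalid UTF-8; a real pager would have to handle raw bytes.
        if reader.read_line(&mut line)? == 0 {
            break;
        }
        // Strip trailing '\n' and '\r' characters before storing the line.
        page.push(line.trim_end_matches(&['\n', '\r'][..]).to_string());
    }
    Ok(page)
}
```

Wrapping `std::fs::File::open(path)?` in a `std::io::BufReader` and calling `read_page` once per keypress would keep memory roughly proportional to the screen size plus the reader's internal buffer, independent of the file size.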
Not useful for you does not mean useless for everyone. We've already agreed this is an issue and want to fix it. Your point has been made.
Edit: For a bit more context: there are many more parts to get right here that are independent of this issue: the keyboard input, the rendering, the argument parsing, etc. Reworking the way the buffer works can still be done while other changes are being made.