Try reading line-by-line from the buffer we read in App::Ack::File::may_be_present()

Open petdance opened this issue 6 years ago • 2 comments

may_be_present reads in 10M of the file, and if a match may be present in it, the code that calls it resets the file handle and reads the file line by line.

Can we be smart enough to iterate over the buffer we just read in may_be_present? Can we iterate over it more quickly than it would take to reread the file from disk?

Or will it not matter because Perl is only rereading from the filesystem anyway?

Sep 01 '19 04:09 petdance

The OS file buffer cache should make this nearly free: it's a RAM-to-RAM copy of a few KB, followed by a copy into the line buffer. Unless we do low-level XS stuff, we'd have to copy lines out of the 10M sysread buffer anyway, so letting <> copy buffers from the cache seems tolerable.

(In the old days, iterating with index and substr aliases would have been a win because it avoided actual disk IO. But looping in Perl when you can loop in Perl internals is a lose, and modern caching OS file buffers vanquish the redundant IO on most filesystems.)
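
To make the two strategies concrete, here is a rough sketch in Python rather than Perl. It is only an analogue of the idea, not ack's code: lines_by_rereading, lines_from_buffer, make_sample_file and time_both are hypothetical names, and the manual find/slice loop stands in for the index-and-substr approach. Actual timings depend on the OS cache and the runtime, so treat any numbers as indicative at best.

```python
import os
import tempfile
import timeit

PREFIX_BYTES = 10_000_000  # mirrors the 10M prefix mentioned in this issue


def lines_by_rereading(path, prefix_bytes=PREFIX_BYTES):
    """Read a prefix, rewind, then let the runtime's buffered reader split lines."""
    with open(path, "rb") as fh:
        fh.read(prefix_bytes)  # the up-front read of the prefix
        fh.seek(0)             # reset the handle
        return list(fh)        # line iteration happens inside the runtime


def lines_from_buffer(path, prefix_bytes=PREFIX_BYTES):
    """Split lines out of the prefix buffer by hand, then continue from the file."""
    lines = []
    with open(path, "rb") as fh:
        buf = fh.read(prefix_bytes)
        pos = 0
        while True:
            nl = buf.find(b"\n", pos)      # analogue of Perl's index()
            if nl == -1:
                break
            lines.append(buf[pos:nl + 1])  # analogue of substr()
            pos = nl + 1
        tail = buf[pos:]                   # partial last line of the prefix, if any
        for line in fh:                    # rest of the file beyond the prefix
            lines.append(tail + line)
            tail = b""
        if tail:                           # file ended without a trailing newline
            lines.append(tail)
    return lines


def make_sample_file(num_lines=100_000):
    """Write a throwaway test file (hypothetical helper) and return its path."""
    with tempfile.NamedTemporaryFile(mode="wb", delete=False, suffix=".txt") as fh:
        for i in range(num_lines):
            fh.write(b"line %d\n" % i)
        return fh.name


def time_both(path, repeat=5):
    """Return total seconds spent in each strategy over `repeat` runs."""
    return {
        "reread": timeit.timeit(lambda: lines_by_rereading(path), number=repeat),
        "buffer": timeit.timeit(lambda: lines_from_buffer(path), number=repeat),
    }
```

Calling time_both(make_sample_file()) gives one rough data point for each approach; both functions should return the same list of lines for the same file.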

Sep 02 '19 03:09 n1vux

I agree it should be nearly free. I'll try some benchmarks to prove to myself that it's the case.

Sep 03 '19 00:09 petdance