
cache effect of reading on-disk matrix

Open mikejiang opened this issue 6 years ago • 2 comments

I am comparing different on-disk matrix formats to choose the one with the best partial-read I/O performance. Here are some benchmark results, http://rpubs.com/wjiang2/399331, which show that bigmemory is very promising.

I'd also like to eliminate the effect of the Linux page cache between iterations of the same read. Here is the command I use to clear the cache (I do have sudo permission, and the command does succeed):

sync; echo 1 > /proc/sys/vm/drop_caches

However, I see no difference even when I execute this command before each read. That is, the first read is the slowest, and the second and third reads always run at the same speed, much faster than the first. Is there some other caching mechanism in the bigmemory package besides the OS page cache? @raphg
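One way to check whether the drop actually takes effect is to compare the kernel's `Cached:` figure in `/proc/meminfo` before and after the command. The following is a minimal sketch, not something from the benchmark itself; the helper name `cached_kb` and the `MEMINFO_FILE` override are hypothetical and exist only so the helper can be exercised without touching the real system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the page cache size (in kB) reported by the kernel.
# MEMINFO_FILE is an assumption added for dry runs; it defaults to /proc/meminfo.
MEMINFO_FILE="${MEMINFO_FILE:-/proc/meminfo}"

cached_kb() {
  # Match the "Cached:" line exactly, so "SwapCached:" is not picked up.
  awk '$1 == "Cached:" { print $2 }' "$MEMINFO_FILE"
}

# Example use around the drop (run as root):
#   before=$(cached_kb)
#   sync; echo 1 > /proc/sys/vm/drop_caches
#   after=$(cached_kb)
#   echo "Cached before: ${before} kB, after: ${after} kB"
```

If the figure barely moves, the pages in question are probably still held somewhere; if it falls sharply and the second read is still fast, the speedup likely comes from elsewhere.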

mikejiang, Jun 29 '18 18:06

Thanks for the link. This is very interesting.

bigmemory doesn't do any caching of its own. According to this, you may need to try echoing 1, 2, and 3.
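For reference, the Linux kernel documents the three values as follows: 1 frees the page cache, 2 frees reclaimable slab objects such as dentries and inodes, and 3 frees both. A minimal sketch of a wrapper that tries a given level is shown here; the function name `drop_caches` and the `DROP_CACHES_FILE` override are hypothetical, the latter added only so the function can be dry-run without root.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the drop_caches interface.
# DROP_CACHES_FILE is an assumption for dry runs; the real target is
# /proc/sys/vm/drop_caches and writing to it requires root.
DROP_CACHES_FILE="${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}"

drop_caches() {
  local level="$1"
  case "$level" in
    1|2|3) ;;  # 1 = page cache, 2 = dentries and inodes, 3 = both
    *) echo "level must be 1, 2, or 3" >&2; return 1 ;;
  esac
  sync                                  # flush dirty pages so they become droppable
  echo "$level" > "$DROP_CACHES_FILE"
}

# Example (as root), before each timed read:
#   drop_caches 3
```

Note that with `sudo echo 3 > /proc/sys/vm/drop_caches` the redirection is performed by the calling, non-root shell, so the write fails with a permission error; running the whole command in a root shell, for example `sudo sh -c 'sync; echo 3 > /proc/sys/vm/drop_caches'`, avoids that.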

Are you on a virtual machine? The only other thing I can think of is that the host machine isn't actually dropping the pages.

kaneplusplus, Jul 09 '18 16:07

No, I am on a physical Linux box. I've tried all three, and they all showed a similar outcome.

mikejiang, Jul 09 '18 22:07