alluxio
Add microbenchmarks for multiple implementations of BlockStore
What changes are proposed in this pull request?
Added a benchmark suite to compare the performance of MonoBlockStore and PagedBlockStore.
Why are the changes needed?
Provide insight into the micro-level performance characteristics of the PagedBlockStore, which is under development.
Does this PR introduce any user facing changes?
No
@dbw9580 Would you like to take a look at this? Thanks!

An experimental run on my laptop.
I wonder why monoBlockStoreReadLocal is significantly slower than all the others?
https://github.com/Alluxio/alluxio/blob/8f74e6584b6d2d71be3a3737057808b6f77c44ae/core/common/src/main/java/alluxio/worker/block/io/LocalFileBlockReader.java#L100
The Javadoc for java.nio.channels.FileChannel#map says:
For most operating systems, mapping a file into memory is more expensive than reading or writing a few tens of kilobytes of data via the usual read and write methods. From the standpoint of performance it is generally only worth mapping relatively large files into memory.
It looks like memory-mapping a file has significant overhead, even for a file as large as 64MB.
It's surprising that monoBlockStoreReadLocal performance is so bad...
There have been discussions about the performance issues of mmap: https://lkml.indiana.edu/hypermail/linux/kernel/0802.0/1496.html. In general, it seems that mmap brings significant overhead, such as additional page faults, and could easily end up slower than normal read calls.
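To make the comparison concrete, here is a minimal sketch of the two read paths being discussed: one goes through FileChannel#map, the other through plain FileChannel#read into a heap buffer. This is not the JMH suite from this PR and not the linked gist; the class name `MmapVsRead`, the 8 MB file size, and the 64 KB buffer size are all assumptions chosen for illustration, and a single timed pass like this is only a rough indication, not a proper benchmark.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Random;

// Hypothetical illustration class; not part of Alluxio.
public class MmapVsRead {
  /** Reads the whole file through a read-only memory mapping and sums its bytes. */
  public static long sumWithMmap(Path file) throws IOException {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
      long sum = 0;
      while (buf.hasRemaining()) {
        sum += buf.get() & 0xFF; // each access may trigger a page fault
      }
      return sum;
    }
  }

  /** Reads the whole file with ordinary read calls into a heap buffer and sums its bytes. */
  public static long sumWithRead(Path file, int bufSize) throws IOException {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      ByteBuffer buf = ByteBuffer.allocate(bufSize);
      long sum = 0;
      while (ch.read(buf) != -1) {
        buf.flip();
        while (buf.hasRemaining()) {
          sum += buf.get() & 0xFF;
        }
        buf.clear();
      }
      return sum;
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("mmap-vs-read", ".bin");
    try {
      byte[] data = new byte[8 * 1024 * 1024]; // assumed size, smaller than a 64MB block
      new Random(42).nextBytes(data);
      Files.write(file, data);

      long t0 = System.nanoTime();
      long viaMmap = sumWithMmap(file);
      long t1 = System.nanoTime();
      long viaRead = sumWithRead(file, 64 * 1024);
      long t2 = System.nanoTime();

      System.out.printf("mmap: %d us, read: %d us, sums equal: %b%n",
          (t1 - t0) / 1000, (t2 - t1) / 1000, viaMmap == viaRead);
    } finally {
      Files.deleteIfExists(file);
    }
  }
}
```

Timings from a single pass are dominated by JIT warm-up and page-cache state, so numbers from this sketch should only be compared loosely across platforms.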

Experimental run with page size as parameters.
I took a closer look at the mmap performance on Linux.
It looks like there is quite a significant difference in performance between Linux and Mac.
My microbenchmark is here: https://gist.github.com/dbw9580/2bdb08ea6bf8e44e94a7f28c2d06155d
Yeah, my run suggests memory mapping is a lot slower on MacOS. Also, the current UnderFileSystemBlockReader uses a BufferedInputStream under the hood rather than a RandomAccessFile. That might be why readUfs could still be faster than readLocal.
Added an experimental run of Random Read results.
Random Read benchmark is conducted by:
- Parameters: BlockSize - size of the whole block; ReadSize - size of each read
- Randomly generate a sequence of offsets into the block, with the invariant NumOffsets * ReadSize = BlockSize (the random reads might overlap)
- For StoreBlockReader, which is used for MonoBlockStore's local read, simulate random transferTo by positioning the raw FileChannel directly. No such methods exist for UnderFileSystemBlockReader and PagedBlockReader, so only the read test is run for them. (Also, the performance of monoBlockStoreRandTransferLocal is dubiously high.)
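
The offset generation described above could be sketched roughly as follows. This is not the code from the benchmark suite in this PR; the class name `RandomReadOffsets`, the fixed seed, and the choice to cap each offset at BlockSize - ReadSize (so every read stays inside the block) are assumptions made for illustration. The positional FileChannel#read(ByteBuffer, long) call stands in for "positioning the raw FileChannel directly".

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.Random;

// Hypothetical illustration class; not part of Alluxio.
public class RandomReadOffsets {
  /**
   * Generates blockSize / readSize random offsets, so that
   * NumOffsets * ReadSize = BlockSize. Offsets may repeat or overlap.
   */
  public static long[] generateOffsets(long blockSize, int readSize, long seed) {
    if (readSize <= 0 || blockSize < readSize || blockSize % readSize != 0) {
      throw new IllegalArgumentException("blockSize must be a positive multiple of readSize");
    }
    int numOffsets = (int) (blockSize / readSize);
    long maxStart = blockSize - readSize; // assumption: reads never cross the block end
    Random rnd = new Random(seed);
    long[] offsets = new long[numOffsets];
    for (int i = 0; i < numOffsets; i++) {
      offsets[i] = Math.floorMod(rnd.nextLong(), maxStart + 1);
    }
    return offsets;
  }

  /** Performs one positional read of readSize bytes per offset; returns total bytes read. */
  public static long readAll(FileChannel ch, long[] offsets, int readSize) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(readSize);
    long total = 0;
    for (long offset : offsets) {
      buf.clear();
      while (buf.hasRemaining()) {
        int n = ch.read(buf, offset + buf.position());
        if (n < 0) {
          break; // reached end of file before filling the buffer
        }
      }
      total += buf.position();
    }
    return total;
  }
}
```

With a file of exactly BlockSize bytes, readAll should return BlockSize, since every read is fully inside the file under the assumptions above.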

Great job! Just posting two flame charts for paged read and transfer.
transfer: (we can see an extra buffer write)
read:
Benchmark results on Linux. It seems that MacOS does have particular issues with mmap.
@dbw9580 Do you think this PR is good to merge now?
alluxio-bot, merge this please