rapiddisk vs tmpfs: file read/write speed
The reason I found rapiddisk is that I am looking for a ramdisk faster than tmpfs/ramfs. Benchmarking the rapiddisk rd0 device shows great results (above 10 GB/s throughput) for both reads and writes, but once I put a file system on it and start reading and writing files I get much worse performance (around 2 GB/s), which is even worse than tmpfs. I suspect this is because files become subject to buffering/caching, plus file system overhead. Is there a combination of factors, such as file system selection, tunables, or other system settings, that would help reach speeds nearer 10 GB/s? On the same hardware with a Windows ramdisk I get file read/write speeds above 10 GB/s without any tuning.
P.S. I am relatively new to Linux, so I am probably missing something simple... thanks.
What file system are you placing on the ram drive and what performance utility are you using (with parameters) to benchmark?
Tried ext4, ext3, ext2 and xfs.
dd if=./testfile of=/dev/null bs=1M count=8000
dd if=/dev/zero of=./testfile bs=1M count=8000
dd if=/dev/rd0 of=/dev/null bs=1M
Also the "Disks" utility's built-in benchmark (default settings).
P.S. I made sure the current directory was on the ramdisk file system where I provided ./testfile.
So, when benchmarking a block device or a file system, dd is typically not the utility to use. When it writes to or reads from a target, it is a single-threaded, single-stream synchronous I/O operation. Throw a file system into the picture and it further complicates things (one write becomes multiple writes, because you are also creating/updating metadata, potentially across multiple locations on the file system). A good utility to use for benchmarking (one that I rely on quite heavily) is the industry-standard fio (Flexible I/O).
Other notes: good I/O benchmarking practice also involves writing a lot of small (e.g. 4K) random asynchronous I/Os (sometimes across multiple parallel jobs) to determine both throughput and IOPS. To test the actual device (and not the system buffer cache), it is also advised to execute the I/O with Direct I/O enabled.
For example, here is a random write to a RapidDisk drive (on a slower system I have in my office currently):
petros@dev-machine:~$ sudo fio --bs=4k --ioengine=libaio --iodepth=32 --size=10m --direct=1 --runtime=60 --filename=/dev/rd0 --rw=randwrite --name=test --numjobs=8 --group_reporting
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
...
fio-3.16
Starting 8 processes
[ ... ]
Run status group 0 (all jobs):
WRITE: bw=1600MiB/s (1678MB/s), 1600MiB/s-1600MiB/s (1678MB/s-1678MB/s), io=80.0MiB (83.9MB), run=50-50msec
Disk stats (read/write):
rd0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
We are seeing about 1.6 GB/s.
Here is an example of a random read from the same raw block device:
petros@dev-machine:~$ sudo fio --bs=4k --ioengine=libaio --iodepth=32 --size=10m --direct=1 --runtime=60 --filename=/dev/rd0 --rw=randread --name=test --numjobs=8 --group_reporting
test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
...
fio-3.16
Starting 8 processes
[ ... ]
Run status group 0 (all jobs):
READ: bw=3810MiB/s (3995MB/s), 3810MiB/s-3810MiB/s (3995MB/s-3995MB/s), io=80.0MiB (83.9MB), run=21-21msec
Disk stats (read/write):
rd0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
We see 3.8 GB/s.
Now I formatted the block device with Ext4 and mounted it locally. The same random write to a file now:
petros@dev-machine:~$ sudo fio --bs=4k --ioengine=libaio --iodepth=32 --size=10m --direct=1 --runtime=60 --filename=/mnt/rd0/test.dat --rw=randwrite --name=test --numjobs=8 --group_reporting
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
...
fio-3.16
Starting 8 processes
[ ... ]
Run status group 0 (all jobs):
WRITE: bw=1111MiB/s (1165MB/s), 1111MiB/s-1111MiB/s (1165MB/s-1165MB/s), io=80.0MiB (83.9MB), run=72-72msec
Disk stats (read/write):
rd0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
Yes, 1.1 GB/s but there is a bit of file system overhead happening here.
The random read from the same file:
petros@dev-machine:~$ sudo fio --bs=4k --ioengine=libaio --iodepth=32 --size=10m --direct=1 --runtime=60 --filename=/mnt/rd0/test.dat --rw=randread --name=test --numjobs=8 --group_reporting
test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
...
fio-3.16
Starting 8 processes
[ ... ]
Run status group 0 (all jobs):
READ: bw=3077MiB/s (3226MB/s), 3077MiB/s-3077MiB/s (3226MB/s-3226MB/s), io=80.0MiB (83.9MB), run=26-26msec
Disk stats (read/write):
rd0: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
We see 3.1 GB/s, again, file system overhead.
@andriusst Hello. Any updates or thoughts on my latest reply? Anyway, have a wonderful weekend.
Hi, and thank you for the time you spent running the benchmark and sharing the results. My simple test was not a great way to get an accurate speed measurement, but it echoed my observations on the real-life multi-threaded application I run, which is why I was looking for suggestions on how to speed things up with rapiddisk. The application is the madmax chia-plotter. It generates a large data set and is I/O-intensive and multi-threaded, which is why I am using a ramdisk. But I still think it is I/O bound on my system, as the CPU does not get fully utilized. So I am looking for very fast storage, hence my interest in rapiddisk to make it run faster than using ramfs. If rapiddisk is designed for certain features rather than absolutely fastest speed, then I have no problem with that. I just want to make sure that is really the case and not me, as a Linux noob, not knowing how to tune things for speed.
I will try fio out of interest and will post the results so that there is something to compare. But my goal here is not to make the benchmark show higher results but to make the application run faster.
@andriusst Hi there, maybe you can try mixing the write-around and writeback modes using rapiddisk. I am far away from your speed needs, but this combination made a Linux installation on a not-so-recent mechanical drive a lot faster: it went from unusable to quite decent, especially after a short while of using the desktop, thanks to the WA cache. I can't produce numbers now, but you can give it a try if you can; a rough sketch of the commands follows below.
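A minimal sketch of what such a cache mapping might look like, assuming a hypothetical backing device /dev/sdb; the flag spellings here are assumptions that vary between rapiddisk releases, so verify them against rapiddisk --help and the project README before running anything:
# Attach a RAM disk to use as the cache node (size in MB); flag name is an assumption
sudo rapiddisk -a 4096
# Map rd0 as a cache in front of the backing device; the policy is selected with -p:
# wt = write-through, wa = write-around, wb = writeback (again, verify the exact flags for your version)
sudo rapiddisk -m rd0 -b /dev/sdb -p wa
# List attached RAM disks and cache mappings to confirm
sudo rapiddisk -l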
Regards
@matteotenca Hi, and thank you for the ideas. So the write-around and write-back modes relate to caching a physical disk rather than to the ramdisk component, correct? I will give it a try to see if caching the disk will be faster than working entirely on a ramdisk. Although Linux has file system caching anyway, I found that increasing vm.dirty_background_ratio and vm.dirty_ratio speeds up the writes, especially when the entire disk workload can fit in memory. But in my case I think the limitation I am hitting is page cache bandwidth. tmpfs and ramfs live in the page cache, so they are subject to the same bandwidth limit, but the rapiddisk ramdisk is not. So I hope rapiddisk has untapped speed potential, and I am looking for how to unlock it.
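For anyone following along, a minimal sketch of the dirty-page tuning mentioned above; the values are illustrative assumptions, not recommendations:
# Allow a larger share of RAM to hold dirty pages before background writeback kicks in
sudo sysctl -w vm.dirty_background_ratio=20
# Allow a larger share of RAM to hold dirty pages before writers are forced to flush synchronously
sudo sysctl -w vm.dirty_ratio=60
# Check the current values
sysctl vm.dirty_background_ratio vm.dirty_ratio
# To persist across reboots, add the two settings to /etc/sysctl.conf or a file under /etc/sysctl.d/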
Just wanted to share some test results. All the same parameters, except --size and --direct. With --size=10m the test completes very quickly and the results are not consistent between several runs, so I increased --size to 1000m or even 10000m, which produces very repeatable results.
ramfs: READ: bw=16.6GiB/s WRITE: bw=2473MiB/s (direct=0)
/dev/rd0: READ: bw=17.5GiB/s WRITE: bw=12.0GiB/s (direct=1)
rd0, ext4: READ: bw=9889MiB/s WRITE: (too inconsistent, 350-9500MiB/s) (direct=1)
rd0, ext4: READ: bw=8583MiB/s WRITE: bw=1462MiB/s (direct=0)
The rd0 with ext4 test with --size=1000m produces around 9500MiB/s, but --size=10000m gives only about 350MiB/s. A very strange anomaly. In any case, the raw rapiddisk device is faster than ramfs. It is a shame it is not very useful for my application without a file system.
@andriusst Thank you for sharing these numbers. Can I ask what the general fio command looked like?
@pkoutoupis
sudo fio --bs=4k --ioengine=libaio --iodepth=32 --size=1000m --direct=1 --runtime=60 --filename=/mnt/rd/test.tmp --rw=randwrite --name=test --numjobs=8 --group_reporting
Hmm. Interesting. You are in fact writing twice, to "two separate regions on disk (albeit in RAM)," for a single write request, because Ext4 is a disk file system. Although I did not expect the performance to be half. In my testing shared above, it was less, but only by 30%. I wonder, though, if it would help to adjust certain mount options, like raising the commit interval (default 5 seconds) to something much larger and disabling atime/diratime updates. An example: -o noatime,nodiratime,commit=60
Or you could rerun the above fio tests using Ext2 instead (with the noatime,nodiratime options), since you clearly do not need a journal.
EDIT: the noatime,nodiratime options technically help more with the reads than the writes (as they relate to metadata operations).
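A minimal sketch of trying those suggestions, assuming the RAM disk is /dev/rd0 and the mount point is /mnt/rd0 (both names taken from the examples above):
# Ext4 with a longer journal commit interval and no access-time updates
sudo mkfs.ext4 /dev/rd0
sudo mount -o noatime,nodiratime,commit=60 /dev/rd0 /mnt/rd0
# Or Ext2, which has no journal at all (commit= does not apply here)
sudo umount /mnt/rd0
sudo mkfs.ext2 /dev/rd0
sudo mount -o noatime,nodiratime /dev/rd0 /mnt/rd0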
Sure, my pleasure. It did not like the commit=60 option, so I mounted only with -o noatime,nodiratime.
rd0, ext2: READ: bw=7401MiB/s WRITE: bw=6832MiB/s (direct=1)
rd0, ext2: READ: bw=8879MiB/s WRITE: bw=1845MiB/s (direct=0)
Yes, the commit option will only work on ext3/4. Thank you for running those tests. What baffles me most are the reads. Your direct I/O writes look great (while the buffered are 25% less), but the reads are half, which I do not understand. I am not entirely sure what your application workload will look like. Is it read-intensive, write-intensive, or just a good mix of both?
The application is 50%/50%. Most of the time there are 8 threads running concurrently, and each thread does reads and writes fairly sequentially. I tried it with a 16x RAID0 (native Btrfs RAID) of 15K rpm disks and saw total I/O peak at 2.5 GB/s. Now that I have enough RAM to fit the entire workload in ramfs, it completes the cycle about 20% faster. So if we extrapolate disk I/O from that, it would probably peak at around 3.0 GB/s, which is a meh speed considering the raw speed your rapiddisk can do. I am running Lubuntu and the hardware is 256 GB of RAM on a quad-channel Threadripper 3960X. Hope that reveals some possible reasons.
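For reference, a hedged fio sketch that roughly approximates that kind of workload (8 concurrent jobs, mixed sequential reads and writes at a 50/50 split); the file path, block size, and sizes are illustrative assumptions, not the plotter's actual I/O pattern:
sudo fio --name=plotter-ish --filename=/mnt/rd0/test.dat --rw=rw --rwmixread=50 --bs=1M --ioengine=libaio --iodepth=32 --direct=1 --numjobs=8 --size=4g --runtime=60 --group_reporting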
@andriusst Were we able to get the answers we were looking for and if so, can we close this issue? Thank you.
Thank you, and everyone else, for the ideas and suggestions. I have given up on looking for answers and abandoned this approach.