Wrong Sequential Read Bandwidth
Please acknowledge the following before creating a ticket
- [x] I have read the GitHub issues section of REPORTING-BUGS.
Description of the bug: While testing an x16 Gen3 PCIe card with 32GB of memory using fio, it reports 55GB/s of sequential read bandwidth on a cache-enabled system. That is far beyond what a PCIe Gen3 x16 link can deliver (roughly 16GB/s).
What should we check to track down this issue? Is fio calculating the bandwidth incorrectly?
Environment: Ubuntu 18,
fio version: latest
Reproduction steps: Run fio against an x16 Gen3 PCIe card with some memory.
The first thing that comes to mind is to make sure you are running with direct=1 so that you are measuring device performance without the influence of the Linux page cache.
However, without more details of your job file and setup there is no way for anyone to really help you.
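For example, a minimal job file along these lines would take the page cache out of the picture. This is only a sketch assuming an ordinary block device; the device path, block size, and queue depth are placeholders, not taken from your setup:

```ini
# minimal sketch assuming a regular block device (placeholder path)
# direct=1 opens the device with O_DIRECT so reads bypass the Linux page cache
[seq-read]
filename=/dev/nvme0n1
ioengine=libaio
direct=1
rw=read
bs=128k
iodepth=32
runtime=60
time_based
```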
dev-dax.fio.txt
Thank you for your input. Please find attached the dev-dax.fio file that I am using. I will check with direct=1.
I could observe that the same physical address is being used for all read data. Is this expected behavior?
I wrongly assumed that you were testing some sort of SSD. I see that DAX devices do not support direct=1. I'm not able to help with DAX devices. Maybe the folks at https://groups.google.com/g/pmem?pli=1 would be able to help you.
okay thank you
Hi,
Has anyone else faced a similar issue with DAX memory? I was using 16 jobs.
While debugging I can see that every read access maps to the same physical address, so I am assuming there are a lot of cache hits happening. Is there any way to avoid this kind of mapping?
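For what it's worth, a guess at what such a setup might look like (the actual job file isn't shown here, and the device path is a placeholder): with 16 jobs and no per-job offset or size split, every job covers the whole device starting at offset 0, so all of them end up reading the same addresses.

```ini
# hypothetical sketch of a 16-job dev-dax run (placeholder device path)
# without per-job offsets, each job maps and reads the same region from offset 0
# the dev-dax engine is always direct access; O_DIRECT is not used
[dax-read]
ioengine=dev-dax
filename=/dev/dax0.0
thread
numjobs=16
rw=read
bs=2m
```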
> The first thing that comes to mind is to make sure you are running with direct=1 so that you are measuring device performance without the influence of the Linux page cache. However, without more details of your job file and setup there is no way for anyone to really help you.
Hi, when I run with the dev-dax engine, I can see that dev_dax_prep_full() is called every time memory is mapped. How can I enable the partial mapping functions?
I have set offset_increment=16g, size=16g, and two threads; my device's total size is 32GB. This way the first thread should get 0-16GB and the second thread 16-32GB. But from the log I can see that both threads are mmapping at the same offset 0, because both threads call dev_dax_prep_full() (https://github.com/axboe/fio/blob/master/engines/dev-dax.c line 141).
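For reference, the intended split as described would look roughly like the sketch below (a guess at the job file, not the actual one; the device path is a placeholder). With offset_increment, each thread's starting offset should be thread_number * offset_increment, i.e. thread 0 at 0 and thread 1 at 16GB, but as reported above both mappings currently start at offset 0 via dev_dax_prep_full().

```ini
# hypothetical sketch of the 2-thread split described above (placeholder device path)
# each thread should cover its own 16GB half of the 32GB device
[dax-split-read]
ioengine=dev-dax
filename=/dev/dax0.0
thread
numjobs=2
rw=read
bs=2m
# each thread reads 16GB ...
size=16g
# ... starting at thread_number * 16GB (i.e. 0 and 16GB)
offset_increment=16g
```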