fio takes long time to start processes
Hi,
I'm trying to run a test with more than 500 jobs, and it takes more than 20 minutes for fio to start doing I/O. I've attached a screenshot showing where the time goes. Is this normal when launching a large number of processes, or can anything be done to improve it? Thanks.

# uname -r
4.18.0-147.el8.ppc64le
# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.1 (Ootpa)
# fio -v
fio-3.19
# cat rand.fio
[global]
name=randwrite
ioengine=libaio
iodepth=32
rw=randwrite
randrepeat=0
bs=2Mi
direct=1
ramp_time=0
runtime=600
time_based
group_reporting

[job 1]
filename=/dev/sdaee
[job 2]
filename=/dev/sdaes
[job 3]
filename=/dev/sdadf
....
....
....
[job 523]
filename=/dev/sdhx
[job 524]
filename=/dev/sdid
Try and add norandommap to the global section.
I tried norandommap. fio still takes more than 20 minutes to start doing I/O.
# time fio /tmp/rand.fio
.....
real 22m32.359s
user 4m52.282s
sys 21m12.710s
fio runtime is only 60s.
# cat /tmp/rand.fio
[global]
name=randwrite
ioengine=libaio
iodepth=32
rw=randwrite
randrepeat=0
bs=2Mi
direct=1
ramp_time=0
runtime=60
time_based
group_reporting
norandommap

[job 1]
filename=/dev/sdaee
[job 2]
filename=/dev/sdaes
[job 3]
filename=/dev/sdadf
....
....
....
[job 523]
filename=/dev/sdhx
[job 524]
filename=/dev/sdid
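A job file with hundreds of near-identical sections like the one above is easier to regenerate than to edit by hand. The following is only a sketch: `gen_jobfile` is a hypothetical helper name, the global options are copied from the job file in this thread, and the device list is whatever you pass in.

```shell
#!/bin/sh
# Sketch: print a fio job file with one [job N] section per device argument.
# gen_jobfile is a hypothetical helper, not part of fio.
gen_jobfile() {
    # Global options copied from the job file posted in this thread.
    cat <<'EOF'
[global]
name=randwrite
ioengine=libaio
iodepth=32
rw=randwrite
randrepeat=0
bs=2Mi
direct=1
ramp_time=0
runtime=60
time_based
group_reporting
norandommap
EOF
    n=1
    for dev in "$@"; do
        # One section per device, numbered from 1.
        printf '\n[job %d]\nfilename=%s\n' "$n" "$dev"
        n=$((n + 1))
    done
}

# Example with two of the device names mentioned in the thread.
gen_jobfile /dev/sdaee /dev/sdaes
```

Redirecting the output (for example `gen_jobfile /dev/sdaee /dev/sdaes > /tmp/rand.fio`) gives a file that can be passed to fio as before.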
You can try and do a:
# perf record -ag -- sleep 5
while it's starting up, and then do:
# perf report -g --no-children
and see what is going on in the system. If it's just the one busy fio thread, which it looks like, I'd fire up top and find the busy pid, then do:
# perf record -g -p <pid from above>
and then run the same perf report on that. How big are the sdXXX devices?
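If you would rather not pick the busy pid out of top by eye, the selection step can be scripted. This is a rough sketch under some assumptions: `busiest_fio_pid` is a hypothetical helper name, it expects `PID %CPU COMMAND` columns on stdin, and `ps` reports CPU usage averaged over the process lifetime rather than the instantaneous value top shows.

```shell
#!/bin/sh
# Sketch: read "PID %CPU COMMAND" lines on stdin and print the PID of the
# busiest process whose command name is exactly "fio".
# busiest_fio_pid is a hypothetical helper, not part of fio or perf.
busiest_fio_pid() {
    awk '$3 == "fio" && (pid == "" || $2 + 0 > max) { max = $2 + 0; pid = $1 }
         END { if (pid != "") print pid }'
}

# On a live system the input could come from ps; the header line is ignored
# because its third column is "COMMAND", not "fio":
#   ps -eo pid,pcpu,comm | busiest_fio_pid
# and the result can be handed to the perf command from the comment above:
#   perf record -g -p "$(ps -eo pid,pcpu,comm | busiest_fio_pid)"
```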
Each device is 10T.
During startup:
# perf record -ag -- sleep 5

top shows one busy fio thread (pid 61705).
I gathered a 10-second trace for that pid. It looks like it's waiting to write to memory and to acquire/release a mutex?
# perf record -g -p 61705

For that last one, click the memset and mutex lock/unlock to get an expanded call trace, that'll give us a better idea of where it's happening.

OK, that makes sense, it's around the iostats setup. I'll try and take a look at this.
(Pinging @axboe on this one)
(@axboe ping)
Any update on this? With 32 namespaces (ns) per drive on an NVMeOF setup and 8 drives, it takes about 4 minutes for fio to start, even with numjobs=1. I'm running fio-3.28.
(@axboe ping)
Same for me: a 250-job test takes more than 2 minutes to start traffic with fio 3.28.
@axboe Any update on this?
As a temporary workaround, try running with --disk_util=0 if you don't need the disk utilization statistics.
Thanks for this, @vincentkfu! This appears to work much more quickly, even for large I/O.
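For anyone landing here later, the workaround from this thread is a command-line flag, for example `fio --disk_util=0 /tmp/rand.fio`. fio's documentation also lists `disk_util` as a job parameter, so, assuming it behaves the same way there, it could instead be set once in the `[global]` section of the job file. A sketch of that fragment (only the last line is new; the other options shown are from the job file posted earlier in the thread, and the remaining ones are unchanged):

```ini
[global]
name=randwrite
ioengine=libaio
rw=randwrite
time_based
group_reporting
norandommap
; Assumption: equivalent to passing --disk_util=0 on the command line.
; This turns off the disk utilization statistics in the final report.
disk_util=0
```

The trade-off is the one already stated in the thread: the disk utilization statistics are no longer reported.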