stat optimization at initialization stage

Open struschev opened this issue 7 months ago • 3 comments

📝 Use Case Description

Hello!

I’m using fio to benchmark network filesystems (e.g., NFS, CIFS) with huge filesets (e.g., 1,600,000 files). While the benchmarking itself runs fine, the initialization stage takes much longer, often 10x longer than the actual test run.

🔍 Profiling Details

A quick profiling session indicates that most of the time is spent in get_file_type(), which calls stat() for each file to determine the file type:

[Image: profiling screenshot]

💡 Current Workaround

For my specific case (large, regular file sets), I patched the code by hardcoding the file type:

int add_file(struct thread_data *td, const char *fname, int numjob, int inc)
{
    ...
    // get_file_type(f);
    f->filetype = FIO_TYPE_FILE;
    ...
}

This drastically speeds up the initialization phase for my use case.

🙋‍♂️ Question

Would it make sense to add a parameter (e.g., --assume-regular-files) that tells fio to skip the file type check (get_file_type) and assume all files are regular?

This could help users working with large regular file sets on network filesystems avoid unnecessary stat() calls.

struschev, May 29 '25 15:05

Hi @struschev,

Phew that's a whole lot of files to have on a network filesystem and I can see how the stat overhead is painful, so yes we would love to see a patch that added an option to help that workload. Some thoughts below:

  • If you choose to submit a patch don't forget to follow https://github.com/axboe/fio/blob/master/.github/PULL_REQUEST_TEMPLATE.md (although you can put <> around your email address ;)
  • Could you make it an option that takes a string like --file-type=file or --file-type=block etc. to match the types that someone may wish to specify to save doing a stat()? See https://github.com/axboe/fio/blob/fio-3.40/options.c#L2720-L2762 for an example of an option that is a choice. If the user uses your option to force an incorrect type then I say they get to keep both pieces when things break...
  • We would also need documentation for it in the HOWTO and man page perhaps in the "Target file/device" section?
  • I don't know what to do about the fact that multiple files can be specified "at the same time" (see https://fio.readthedocs.io/en/latest/fio_doc.html#cmdoption-arg-filename and the colon separator). I don't think we can stuff any more into the filename, and people still need to be able to set something when they don't specify a filename and are using options like nrfiles, so another option seems like a good balance.
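As a rough illustration of the "option that is a choice" idea, here is a standalone sketch that maps the option string to a type. It deliberately does not use fio's real option table from options.c; the `sketch_*` names, the enum, and the exact set of accepted strings ("file", "block", "char") are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for fio's file types; not the real FIO_TYPE_* values. */
enum sketch_file_type {
    SKETCH_TYPE_UNSET = 0,  /* option not given: keep using stat() */
    SKETCH_TYPE_FILE,
    SKETCH_TYPE_BLOCK,
    SKETCH_TYPE_CHAR
};

struct sketch_choice {
    const char *name;              /* string the user types */
    enum sketch_file_type type;    /* type it forces */
};

/* Assumed set of accepted values, for illustration only. */
static const struct sketch_choice sketch_choices[] = {
    { "file",  SKETCH_TYPE_FILE  },
    { "block", SKETCH_TYPE_BLOCK },
    { "char",  SKETCH_TYPE_CHAR  },
};

/*
 * Look up the option value in the choice table.
 * Returns 0 and stores the type in *out on a match, -1 (leaving *out
 * untouched) when the value is not a known choice.
 */
static int sketch_parse_file_type(const char *value,
                                  enum sketch_file_type *out)
{
    size_t i;

    assert(value != NULL && out != NULL);

    for (i = 0; i < sizeof(sketch_choices) / sizeof(sketch_choices[0]); i++) {
        if (strcmp(value, sketch_choices[i].name) == 0) {
            *out = sketch_choices[i].type;
            return 0;
        }
    }
    return -1;
}
```

Rejecting unknown strings at parse time keeps typos from silently turning into "no override"; whether a valid but wrong type is the user's problem is the "keep both pieces" policy suggested above.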

What do you think?

sitsofe, May 29 '25 18:05

  • Ok, I will prepare a patch and other stuff.
  • Although I doubt that anyone will need to test such a multitude of block/char devices, it doesn't cost me anything to make a more flexible option.
  • I propose just ignoring the new option when the filename is specified.

struschev, Jun 02 '25 07:06

I propose just ignoring the new option when the filename is specified.

Counter-proposal - your new option acts on all files of the job regardless of how they were defined. If the user doesn't use filename, all the template name files are impacted by your option. If the user sets filename, all the files specified within it are impacted by your option.

I guess it opens the question of how you handle conflicts. For example:

name=first
filetype=file
filename=mynetworkfsfile
name=second
filename=mynetworkfsfile

Do we stat on mynetworkfsfile because the second job didn't bother to set a filetype?

I've just been looking through fio's options and they use underscores rather than hyphens, or just run the words together, so counter to my previous suggestion I'd recommend the new option be called file_type or filetype.

sitsofe, Jun 02 '25 21:06

Hello, @sitsofe! I've finally found the time to create a PR, and now I'm waiting for your feedback. Thanks

struschev, Jul 21 '25 14:07

@struschev I'm pleased to see the PR - thanks for taking the time to put it together. I've left some review comments.

sitsofe, Jul 21 '25 15:07