benchmark-operator

FIO with virtual machines not working

Open mlacko64 opened this issue 1 year ago • 1 comments

Describe the bug

Hello, I was running the FIO benchmark. It worked fine with pods, and the results were uploaded to my Elastic instance.

When I tried the VM benchmark using this example, https://github.com/cloud-bulldozer/benchmark-operator/blob/master/config/samples/fio/vm-cr.yaml (changing only the Elastic credentials and storage class details), it failed with this error:

--------------------------- Ansible Task StdOut -------------------------------
 TASK [Create IP list and nodes] ********************************
fatal: [localhost]: FAILED! => {
    "msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'interfaces'. 'dict object' has no attribute 'interfaces'\n\nThe error appears to be in '/opt/ansible/roles/fio_distributed/tasks/main.yml': line 109, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n  - name: Create IP list and nodes\n    ^ here\n"
}

I was able to work around it by adding this slightly ugly wait check (with a hardcoded 3) to the task that collects interface details, here: https://github.com/cloud-bulldozer/benchmark-operator/blob/2a4091f6d9e8315b2396db8165d160f993190089/roles/fio_distributed/tasks/main.yml#L93

  until: server_pods|json_query('resources[].status.interfaces[].ipAddress')|length == 3
  retries: 30
  delay: 30
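
For readers less familiar with `json_query`, here is a minimal Python sketch of what the `until` condition above is checking. It is only an illustration, not code from the operator: the function names and the `expected_servers` parameter (standing in for the hardcoded 3) are hypothetical, and the input is assumed to have the same `resources[].status.interfaces[].ipAddress` shape as the registered `server_pods` variable.

```python
def interface_ips(vmi_list):
    """Collect every reported interface IP.

    Resources that have no 'status' or no 'interfaces' yet are skipped,
    instead of raising the "no attribute 'interfaces'" style error.
    """
    ips = []
    for resource in vmi_list.get("resources", []):
        status = resource.get("status") or {}
        for iface in status.get("interfaces") or []:
            ip = iface.get("ipAddress")
            if ip:
                ips.append(ip)
    return ips


def all_servers_ready(vmi_list, expected_servers):
    """True once the number of reported IPs matches the expected server count."""
    return len(interface_ips(vmi_list)) == expected_servers


if __name__ == "__main__":
    # Made-up sample data: one VM has no interfaces reported yet.
    booting = {
        "resources": [
            {"status": {"interfaces": [{"ipAddress": "10.130.4.96"}]}},
            {"status": {"interfaces": [{"ipAddress": "10.129.4.9"}]}},
            {"status": {}},
        ]
    }
    print(all_servers_ready(booting, expected_servers=3))

    booting["resources"][2]["status"] = {
        "interfaces": [{"ipAddress": "10.131.4.131"}]
    }
    print(all_servers_ready(booting, expected_servers=3))
```

Passing the expected count in as a parameter, rather than fixing it at 3, is the part that would need to come from the CR's server count in a real fix.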

The benchmark then ran and finished as completed, but no data was sent to Elastic. In the client and prefill pods I see the error below, and the collected perf data were all zeroes:

2024-10-02T07:50:07Z - INFO     - MainProcess - py_es_bulk: Using streaming bulk indexer
2024-10-02T07:50:07Z - INFO     - MainProcess - wrapper_factory: identified fio as the benchmark wrapper
2024-10-02T07:50:07Z - INFO     - MainProcess - trigger_fio: Executing fio --client=/tmp/host/hosts /tmp/fiod-fd684f9d-6969-5658-949b-1d9624b684af/fiojob-write-64KiB-1/1/write/fiojob --output-format=json --output=/tmp/fiod-fd684f9d-6969-5658-949b-1d9624b684af/fiojob-write-64KiB-1/1/write/fio-result.json
2024-10-02T07:50:09Z - ERROR    - MainProcess - trigger_fio: Fio failed to execute
2024-10-02T07:50:09Z - ERROR    - MainProcess - trigger_fio: Output file: <fio-server-1-fd684f9d> fio: output file open error: No such file or directory
<fio-server-3-fd684f9d> fio: output file open error: No such file or directory
<fio-server-2-fd684f9d> fio: output file open error: No such file or directory
<fio-server-3-fd684f9d> fio: pid=8500, err=21/file:filesetup.c:805, func=open(/dev/xvda), error=Is a directory
<fio-server-2-fd684f9d> fio: pid=8503, err=21/file:filesetup.c:805, func=open(/dev/xvda), error=Is a directory
<fio-server-1-fd684f9d> fio: pid=8497, err=21/file:filesetup.c:805, func=open(/dev/xvda), error=Is a directory
client <10.130.4.96>: exited with error 1
client <10.129.4.9>: exited with error 1
client <10.131.4.131>: exited with error 1

So it seems there are other issues that I am not able to fix.

To Reproduce

Steps to reproduce the behavior: just run the FIO VM benchmark using the example CR.

Expected behavior

The FIO VM benchmark collects benchmark data and sends it to Elastic successfully.

Desktop (please complete the following information):

  • OS [e.g. iOS]: OpenShift 4.15
  • Browser [e.g. chrome, safari]: N/A
  • Version [e.g. 22]: current master branch, installed using make deploy

mlacko64 avatar Oct 03 '24 08:10 mlacko64

Hey @mlacko64, https://github.com/cloud-bulldozer/benchmark-operator/pull/830 adds some potential fixes for the first issue you mentioned. Can you verify it?

rsevilla87 avatar Oct 25 '24 15:10 rsevilla87

Hello @rsevilla87, I ran the FIO VM benchmark on my OpenShift 4.16 test cluster and all went fine; I have data in Elastic. There are no more errors in the benchmark pods. Thanks a lot for fixing this issue.

mlacko64 avatar Oct 28 '24 12:10 mlacko64

awesome, thanks for confirming!

rsevilla87 avatar Oct 28 '24 13:10 rsevilla87