avocado-vt

Booting a large number of guests results in "filedescriptor out of range in select()" error

Open bssrikanth opened this issue 2 years ago • 1 comments

I am trying to boot more than 100 guests with the tp-qemu boot test on a RHEL server, but avocado-vt fails to create the qemu_monitor after about 67 guests have booted, with the trace below. I would appreciate any pointers on resolving this issue.

PS: On the RHEL host I have already increased ulimit -n, and the issue below does not seem to be related to limits on the host, since it points to select().

03:08:27 INFO | Connecting to monitor '<<class 'virttest.qemu_monitor.QMPMonitor'>> catch_monitor'
03:08:27 ERROR|
03:08:27 ERROR| Reproduced traceback from: /usr/local/lib/python3.9/site-packages/avocado_vt/test.py:274
03:08:27 ERROR| Traceback (most recent call last):
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/error_context.py", line 135, in new_fn
03:08:27 ERROR|     return fn(*args, **kwargs)
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/env_process.py", line 1391, in preprocess
03:08:27 ERROR|     process(test, params, env, preprocess_image, preprocess_vm,
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/env_process.py", line 888, in process
03:08:27 ERROR|     _call_vm_func()
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/env_process.py", line 800, in _call_vm_func
03:08:27 ERROR|     vm_func(test, vm_params, env, vm_name)
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/env_process.py", line 299, in preprocess_vm
03:08:27 ERROR|     vm.create(name, params, test.bindir,
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/error_context.py", line 135, in new_fn
03:08:27 ERROR|     return fn(*args, **kwargs)
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/qemu_vm.py", line 3260, in create
03:08:27 ERROR|     monitor = qemu_monitor.wait_for_create_monitor(self,
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/qemu_monitor.py", line 171, in wait_for_create_monitor
03:08:27 ERROR|     return create_monitor(vm, monitor_name, monitor_params)
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/qemu_monitor.py", line 151, in create_monitor
03:08:27 ERROR|     monitor = MonitorClass(vm, monitor_name, monitor_params)
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/qemu_monitor.py", line 1759, in __init__
03:08:27 ERROR|     for obj in self._read_objects():
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/qemu_monitor.py", line 1802, in _read_objects
03:08:27 ERROR|     if not self._data_available():
03:08:27 ERROR|   File "/usr/local/lib/python3.9/site-packages/virttest/qemu_monitor.py", line 382, in _data_available
03:08:27 ERROR|     return bool(select.select([self._socket], [], [], timeout)[0])
03:08:27 ERROR| ValueError: filedescriptor out of range in select()

bssrikanth, Oct 27 '22 10:10

Update: I hit this issue regardless of which test runner I use. From my investigation, it is caused by the use of select.select(), which does not scale to large numbers of file descriptors; the recommendation, as I understand it, is to use select.poll() instead. I currently see select.select() calls in many places in both avocado and avocado-vt. Should the current code be updated to use select.poll()? I would appreciate any input. Thank you.

bssrikanth, Oct 28 '22 06:10
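
For context on the question above: on Linux, select.select() raises this ValueError for any file descriptor numbered FD_SETSIZE (usually 1024) or higher, no matter how high ulimit -n is set, whereas select.poll() has no such ceiling. Below is a minimal sketch of a poll()-based readiness check that could stand in for the select.select() call in the traceback. The function name data_available and its signature are hypothetical and are not taken from avocado-vt; select.poll() is also unavailable on Windows.

```python
import select
import socket


def data_available(sock, timeout=0.0):
    """Return True if `sock` becomes readable within `timeout` seconds.

    Hypothetical helper (not the avocado-vt implementation). It uses
    select.poll(), which is not limited by FD_SETSIZE.
    """
    poller = select.poll()
    # Ask for readability and hang-up events on this one descriptor.
    poller.register(sock.fileno(), select.POLLIN | select.POLLHUP)
    # poll() expects milliseconds (or None to block), while
    # select.select() expects seconds.
    timeout_ms = None if timeout is None else int(timeout * 1000)
    events = poller.poll(timeout_ms)
    # select() reports a hung-up peer as readable, so do the same here.
    readable = select.POLLIN | select.POLLHUP
    return any(event & readable for _fd, event in events)
```

Called on one end of a socket.socketpair(), the helper should report no data until the other end sends something. Note the unit difference: a caller that passed seconds to select.select() must not pass the same number straight to poll().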