io_tester: implement request_type::unlink
This change introduces `request_type::unlink` as well as `unlink_class_data` to io_tester. It also extends the creation of the operations to be executed so that the new request type is recognized.
The purpose of this change is to enable io_tester to analyze the impact of unlink operations on read and write operations.
`unlink_class_data` creates a given number of files during startup. When requests are issued, it calls unlink on the created files.
Refs: scylladb#1299
This change is only a draft and still needs to be properly tested. A fuller description of `unlink_class_data` will also be added to the commit message.
Hi @xemul, the PR has been adjusted and manually tested. It introduces `request_type::unlink` as well as `unlink_class_data`. The new job type works as follows (a sketch of the issue path follows the list):

- It receives a `files_count` parameter via the configuration and creates the given number of files during the startup phase, before the evaluation starts. Each created file is filled with dummy data, and its size is equal to `_config.file_size / files_count` to match the disk usage specified by `data_size`. The files are created in parallel; the number of running parallel instances is limited to avoid hitting the open file descriptor limit, which would cause an exception.
- During the evaluation it calls `seastar::remove_file()` for each call to `unlink_class_data::issue_request()`, as long as there is a file left to remove. Once all files have been removed, it returns immediately.
- Because `unlink_class_data` is derived from `class_data`, `rps` and `think_time` can be specified to alter the frequency of the unlink calls.
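For illustration, here is a minimal sketch of the issue path described above. The class name, the simplified signature, and the `_files` member are assumptions made for this example; the actual io_tester `class_data` interface differs:

```cpp
#include <seastar/core/future.hh>
#include <seastar/core/seastar.hh>   // seastar::remove_file()
#include <seastar/core/sstring.hh>
#include <utility>
#include <vector>

// Illustrative only: a stripped-down stand-in for unlink_class_data.
class unlink_class_data_sketch {
    std::vector<seastar::sstring> _files; // names of the files created at startup
public:
    // Unlink one of the pre-created files per request; once every file
    // has already been removed, return a ready future immediately.
    seastar::future<> issue_request() {
        if (_files.empty()) {
            return seastar::make_ready_future<>();
        }
        auto name = std::move(_files.back());
        _files.pop_back();
        return seastar::remove_file(std::move(name));
    }
};
```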
Aside from that, this change adjusts the io_tester documentation as well as an example configuration to cover the new changes. It also removes obsolete information from the docs.
The code was tested manually. A few cases were covered:

- Firstly, the creation of files was tested. To verify its correctness, the code of `unlink_class_data::issue_request()` was altered to skip the removal of files. This way, after the execution I was able to inspect that the files were created and that their content was valid.
- Secondly, the basic `request_type::unlink` was tested. The configuration did not specify any `rps` or `think_time`. The job correctly removed the files via `unlink_class_data::issue_request()`.
- Thirdly, `rps` and `think_time` were tested. When I specified each of them separately, I was able to see the difference in the latency reported by `unlink_class_data`. Additionally, when I specified a short test duration, a huge number of files, and a long `think_time`, the test was not able to remove all files within the specified duration. This shows that the number of executed requests was limited; without that setup the test was able to remove all files.
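For intuition (the numbers here are assumed for illustration, not taken from the actual test): with a 5 s duration, a 10 ms `think_time`, and a single parallel instance, at most 5 s / 10 ms = 500 unlink requests can be issued, so a job configured with substantially more than 500 files cannot remove all of them before the test ends.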
Can you also put here some measurement results, e.g. a read workload run on its own vs. a read workload run in parallel with an unlink one, to see how unlinking affects read latency?
Hi @xemul, please find the adjustments introduced by the new patch-set (a sketch of the `stop_hook()` pattern follows the list):

- The common part related to file creation has been extracted from `io_class_data` and `unlink_class_data` into a new free function called `create_and_fill_file()`. It returns the file handle and the last write position.
- A warning is now printed by `unlink_class_data::issue_request()` when all files have been removed, which means that the request cannot be fulfilled.
- More fields are now printed via `unlink_class_data::emit_results()`, including IOPS, average/max latencies, and the total number of requests.
- A new member function called `stop_hook()` has been added to `class_data`. By default it is empty, but derived classes may override it. This member function is used to inject a cleanup routine specific to the derived class.
- Added a `stop_hook()` implementation for `unlink_class_data` that removes the files, if any of them remain, when `keep_files == false`. This way the tested directory is clean when the execution finishes.
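To make the hook mechanism concrete, here is a minimal sketch of the `stop_hook()` pattern described above. The `_files` and `_keep_files` member names are illustrative assumptions, and the real `class_data` carries much more state:

```cpp
#include <seastar/core/do_with.hh>
#include <seastar/core/future.hh>
#include <seastar/core/loop.hh>      // seastar::parallel_for_each()
#include <seastar/core/seastar.hh>   // seastar::remove_file()
#include <seastar/core/sstring.hh>
#include <utility>
#include <vector>

// Illustrative base class: the hook is a no-op by default, so existing
// job types are unaffected.
class class_data_sketch {
public:
    virtual ~class_data_sketch() = default;
    virtual seastar::future<> stop_hook() {
        return seastar::make_ready_future<>();
    }
};

// The unlink job overrides the hook to clean up the files that were not
// removed during the evaluation, unless the user asked to keep them.
class unlink_class_data_sketch : public class_data_sketch {
    std::vector<seastar::sstring> _files; // files created at startup, not yet unlinked
    bool _keep_files = false;             // corresponds to keep_files from the description
public:
    seastar::future<> stop_hook() override {
        if (_keep_files || _files.empty()) {
            return seastar::make_ready_future<>();
        }
        // Keep the vector alive for the duration of the parallel removal.
        return seastar::do_with(std::move(_files), [] (auto& files) {
            return seastar::parallel_for_each(files, [] (const seastar::sstring& name) {
                return seastar::remove_file(name);
            });
        });
    }
};
```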
> Can you also put here some measurement results, e.g. a read workload run on its own vs. a read workload run in parallel with an unlink one, to see how unlinking affects read latency?
Hi @xemul, please find some measurements.
| Environment property | Value |
|---|---|
| Machine | i3.xlarge |
| OS | Ubuntu 22.04 |
| Test duration | 20s |
| SMP | 4 |
Compared configurations:
| | (1) randread | (2) randread+unlink_v1 | (3) randread+unlink_v2 |
|---|---|---|---|
| read data_size | 20GB | 20GB | 20GB |
| read reqsize | 512 | 512 | 512 |
| read parallelism | 1 | 1 | 1 |
| read shares | 100 | 100 | 100 |
| unlink file size | N/A | 1MB | 1MB |
| unlink think_time | N/A | 100us | 100us |
| unlink parallelism | N/A | 5 | 10 |
Results (shard0):
| | (1) randread | (2) randread+unlink_v1 | (3) randread+unlink_v2 |
|---|---|---|---|
| throughput [kB/s] | 4543.79492 | 2739.72925 | 2730.54761 |
| IOPS | 9087.58984 | 5479.50879 | 5461.09521 |
| avg latency | 108us | 179us | 179us |
| p0.5 | 105us | 122us | 119us |
| p0.95 | 130us | 436us | 519us |
| p0.99 | 146us | 761us | 914us |
| p0.999 | 181us | 1430us | 1713us |
| max latency | 4314us | 37647us | 29873us |
| total_requests | 181752 | 109591 | 109222 |
| io_queue_total_exec_sec | 18.872 | 15.855 | 15.990 |
| io_queue_total_delay_sec | 0.516 | 2.536 | 2.296 |
| io_queue_total_operations | 181753 | 109592 | 109223 |
| io_queue_starvation_time_sec | 0.504 | 2.486 | 2.248 |
| io_queue_consumption | 0.022 | 0.0132 | 0.0130 |
| io_queue_adjusted_consumption | 0.0008 | 0.0058 | 0.0056 |
The workloads that also run unlink operations show higher read latency: `p0.95` is roughly 3-4 times higher, `p0.99` is roughly 5-6 times higher, and `p0.999` is roughly 8-9 times higher.