test.py doesn't immediately close on SIGINT (ctrl+c)
^C
Shutdown requested... Aborting tests:
...done
...done
test_maintenance_socket test_raft_service_levels test_auth_no_quorum test_auth_raft_command_split test_auth_v2_migration logalloc_test ...done
database_test.test_safety_after_truncate ...done
database_test.test_querying_with_limits database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain_basic_cg1 database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain_basic_cg0 database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain_fragments_monotonic_cg1 database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain_reader_conversion_cg1 database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain_read_back_cg1 database_test.test_truncate_without_snapshot_during_writes database_test.test_database_with_data_in_sstables_is_a_mutation_source_reverse_basic_cg0 database_test.test_database_with_data_in_sstables_is_a_mutation_source_reverse_reader_conversion_cg0 database_test.test_database_with_data_in_sstables_is_a_mutation_source_reverse_fragments_monotonic_cg0 database_test.test_database_with_data_in_sstables_is_a_mutation_source_reverse_read_back_cg0 database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain_fragments_monotonic_cg0 database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain_reader_conversion_cg0 database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain_read_back_cg0 ...done
database_test.test_database_with_data_in_sstables_is_a_mutation_source_reverse_basic_cg1 ...done
database_test.snapshot_list_inexistent database_test.test_database_with_data_in_sstables_is_a_mut
(...)
flush_queue_test.test_queue_ordering_multi_ops ...done
^C^C^C^C^C^C^C^C^C
flush_queue_test.test_propagate_exception_in_op flush_queue_test.test_propagate_exception_in_post flush_queue_test.test_no_propagate_exception_in_op flush_queue_test.test_no_propagate_exception_in_post fragmented_temporary_buffer_test.test_read_to fragmented_temporary_buffer_test.test_empty_istream fragmented_temporary_buffer_test.test_read_view fragmented_temporary_buffer_test.test_view fragmented_temporary_buffer_test.test_view_equality fragmented_temporary_buffer_test.test_read_pod fragmented_temporary_buffer_test.test_read_bytes_view fragmented_temporary_buffer_test.test_skip fragmented_temporary_buffer_test.test_remove_suffix fragmented_temporary_buffer_test.test_read_fragmented_buffer ...done
^C^C^C^C
It looks like some work continues after the interrupt, and the "Shutdown requested... Aborting tests" step is not fully working.
In my opinion, it should be assumed that SIGINT (and also SIGTERM) is for interactive developer use, and should do what interactive users expect to happen: the tests should stop immediately (or nearly immediately). It shouldn't do much more than print the names of the tests that were killed. It shouldn't bother to cleanly "shut down" tests. It should just kill them all with SIGKILL.
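To make the suggestion concrete, here is a minimal sketch of "kill everything with SIGKILL and print what was killed". It is not test.py's actual code. The names `spawn`, `kill_all`, and `install_interrupt_handler` are hypothetical, and it assumes a POSIX system where each test was started in its own process group.

```python
import os
import signal
import subprocess
import sys


def spawn(cmd):
    """Start a test process as the leader of its own process group (POSIX only)."""
    # start_new_session=True calls setsid() in the child, so its pgid equals its pid.
    return subprocess.Popen(cmd, start_new_session=True)


def kill_all(procs):
    """SIGKILL the process group of every still-running test; return the names killed.

    `procs` is assumed to map a test name to the Popen object returned by spawn().
    """
    killed = []
    for name, proc in procs.items():
        if proc.poll() is not None:
            continue  # already finished, nothing to kill
        try:
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            continue  # exited between poll() and killpg()
        proc.wait()  # reap the child so it does not linger as a zombie
        killed.append(name)
    return killed


def install_interrupt_handler(procs):
    """On SIGINT or SIGTERM: kill all tests, print their names, exit at once."""
    def handler(signum, frame):
        names = kill_all(procs)
        print("Killed:", " ".join(names), file=sys.stderr)
        sys.exit(128 + signum)

    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGTERM, handler)
```

Killing the whole process group rather than a single pid matters when a test spawns helpers of its own, since a lone SIGKILL to the parent would leave them running.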
By the way, this isn't relevant to what test.py does (I don't know what test.py does), but I have to admit that cql/run.py also doesn't do exactly what I suggested above. It tries to kill with SIGTERM and waits up to 10 seconds before falling back to SIGKILL. It used to use SIGKILL, but then in d2ca600eec2769222a6ab16bad6ca74dd06faf2e it was changed to this SIGTERM-with-timeout behavior. While that made sense for an ordinary shutdown, I admit it isn't a great idea for control-C, especially when a test hangs because of a Scylla bug, you want to interrupt it, and Scylla can't shut down cleanly because it's hung.
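The difference between the two behaviors can be sketched as follows. This is an illustration only, not the code in cql/run.py. The function name `stop` and its `interrupted` flag are hypothetical, and the 10-second default simply mirrors the timeout mentioned above.

```python
import subprocess


def stop(proc, grace=10.0, interrupted=False):
    """Stop a child process and return the name of the signal that was relied on.

    Ordinary shutdown: send SIGTERM, wait up to `grace` seconds, then SIGKILL.
    Interactive interrupt (interrupted=True): skip the grace period entirely.
    """
    if not interrupted:
        proc.terminate()  # SIGTERM: ask for a clean shutdown
        try:
            proc.wait(timeout=grace)
            return "SIGTERM"
        except subprocess.TimeoutExpired:
            pass  # the process is hung or too slow; fall through to SIGKILL
    proc.kill()  # SIGKILL cannot be caught or ignored
    proc.wait()  # reap the child
    return "SIGKILL"
```

With a hung server, the ordinary path costs the full grace period per process, which is why passing `interrupted=True` from a control-C handler is the quicker choice.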
It's a regression, it used to work.
@nuivall please provide the information about how you are launching them.
Via test.py
Just saw the same problem. Ran test.py (I tried both from "ninja dev-test" and directly), pressed control-C, saw
Shutdown requested... Aborting tests:
But then started to get an endless list of test names ...done, with a few seconds between each one.
I thought to myself maybe, at worst, it will still wait for a few last tests before exiting, but it really looked like it was going through all the tests and never finishing.
I expect control-C to work immediately, or at worst in a few seconds. I should not have to wait for many tests to finish gracefully, and test.py most certainly should not continue to start all the remaining tests in the system.
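The expectation that no further tests are started after control-C can be sketched with asyncio. This is a hypothetical scheduler, not test.py's implementation. `run_all`, the `shutdown` event, and the `(name, duration)` tuples are made up for illustration, and `asyncio.sleep` stands in for actually running a test.

```python
import asyncio


async def run_all(tests, shutdown, jobs=2):
    """Run (name, duration) tests with at most `jobs` in flight.

    Once `shutdown` is set, no further test is started and the ones still
    queued or running are cancelled. Returns the names that were started.
    """
    started = []
    slots = asyncio.Semaphore(jobs)

    async def run_one(name, duration):
        async with slots:
            if shutdown.is_set():
                return  # never start a test after shutdown was requested
            started.append(name)
            await asyncio.sleep(duration)  # stand-in for running the test

    pending = {asyncio.create_task(run_one(n, d)) for n, d in tests}
    waiter = asyncio.create_task(shutdown.wait())
    while pending and not shutdown.is_set():
        done, _ = await asyncio.wait(
            pending | {waiter}, return_when=asyncio.FIRST_COMPLETED
        )
        pending -= done
    # Shutdown requested or all tests finished: cancel whatever is left.
    # A real harness would also kill the child processes at this point.
    for task in pending:
        task.cancel()
    waiter.cancel()
    await asyncio.gather(*pending, waiter, return_exceptions=True)
    return started
```

In a real runner the event would be set from the signal handler, for example with `loop.add_signal_handler(signal.SIGINT, shutdown.set)`.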
@xtrey it hurts developers, could you find some time to find the root cause?
Issue fixed in https://github.com/scylladb/scylladb/pull/22069