swift-nio
Fix `SchedulingBenchmark` preheating logic.
Motivation:
At the moment, `SchedulingBenchmark` preheats a single EL to a specified number of tasks. However, the actual performance test runs a number of times, and the EL doesn't get drained between runs.
There are two problems with this:
- the perf test will trigger task heap doublings, which the preheating aims to avoid
- each test run will work with a proportionally deeper heap, so run times will grow with each run
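The two problems can be illustrated with a simplified model. This is only a sketch, not NIO's actual task queue: `ModelHeap`, the doubling growth policy, the 100k tasks per run, and the 10 runs are all assumptions made for illustration. It also assumes that preheating grows the storage and then leaves the heap empty.

```swift
// Simplified model of an array-backed task heap (NOT NIO's real implementation).
struct ModelHeap {
    private(set) var capacity: Int
    private(set) var count = 0
    private(set) var doublings = 0

    init(capacity: Int = 1) {
        self.capacity = capacity
    }

    // Adds one pending task, doubling the backing storage when it is full.
    mutating func push() {
        if count == capacity {
            capacity *= 2
            doublings += 1
        }
        count += 1
    }

    // Removes all tasks but keeps the storage that was already allocated.
    mutating func drainKeepingCapacity() {
        count = 0
    }

    // Number of levels in a binary heap holding `count` elements.
    var depth: Int {
        var levels = 0
        var remaining = count
        while remaining > 0 {
            levels += 1
            remaining /= 2
        }
        return levels
    }
}

let tasksPerRun = 100_000  // assumed benchmark size
let runs = 10              // assumed number of measured runs

var heap = ModelHeap()
// Preheat: grow the storage to hold one run's worth of tasks, then empty it.
for _ in 0..<tasksPerRun { heap.push() }
heap.drainKeepingCapacity()
let doublingsAfterPreheat = heap.doublings

var depthPerRun: [Int] = []
for _ in 0..<runs {
    // Measured section: nothing drains between runs, so tasks accumulate.
    for _ in 0..<tasksPerRun { heap.push() }
    depthPerRun.append(heap.depth)
}
let doublingsDuringRuns = heap.doublings - doublingsAfterPreheat
print("doublings during measured runs:", doublingsDuringRuns)
print("heap depth after each run:", depthPerRun)
```

Under these assumptions the model reports three doublings inside the measured runs, and the heap depth climbs from 17 after the first run to 20 after the last, even though the heap was preheated for a single run.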
Modifications:
I plumbed the number of runs through to `Benchmark.setUp`, and prepare an ELG whose number of ELs matches the expected number of runs.
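The idea of "one preheated EL per run" can be sketched with plain Swift arrays standing in for each EL's task storage. This is a hypothetical model, not the code in this PR: `stores`, `tasksPerRun`, and `runs` are illustrative names and values, and `reserveCapacity` stands in for whatever the real preheating does.

```swift
let tasksPerRun = 100_000  // assumed benchmark size
let runs = 10              // assumed number of runs passed to setUp

// "setUp": one preheated task store per expected run,
// mirroring an ELG with one EL per run.
var stores: [[Int]] = (0..<runs).map { _ in
    var store: [Int] = []
    store.reserveCapacity(tasksPerRun)
    return store
}

var capacityChanged = false
var countsSeen: Set<Int> = []

for run in 0..<runs {
    // "run": each measured run fills its own, previously untouched store.
    let capacityBefore = stores[run].capacity
    for task in 0..<tasksPerRun {
        stores[run].append(task)
    }
    if stores[run].capacity != capacityBefore {
        capacityChanged = true
    }
    countsSeen.insert(stores[run].count)
}

print("storage grew during a measured run:", capacityChanged)
print("distinct task counts seen across runs:", countsSeen.count)
```

Because every run starts from an empty store that was sized in advance, no run pays for a reallocation and every run works on the same number of tasks.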
Result:
The performance test will be more reliable.
[...] runs number of times, and EL doesn't get drained in between the runs.
@gmilos that's the actual issue here. It should be completely drained after each run.
[...] Modifications:
I plumbed through the # of runs to the Benchmark.setUp, and prepare ELG with # of ELs that match the expected number of runs.
That doesn't sound ideal as if it's actually the case that the others aren't drained, then this will now cause high CPU load and expects us to have #runs CPUs available etc.
@swift-nio-bot test perf please
performance report
build id: 156
timestamp: Wed Jan 3 13:45:50 UTC 2024
results
| name | min | max | mean | std |
| --- | --- | --- | --- | --- |
| write_http_headers | 0.042907723 | 0.043165247 | 0.042973653 | 9.70556125379205e-05 |
| http_headers_canonical_form | 0.10455533 | 0.10730633 | 0.10511068750000001 | 0.000809335616920894 |
| http_headers_canonical_form_trimming_whitespace | 0.020678702 | 0.021188336 | 0.0207580329 | 0.00015479947182040933 |
| http_headers_canonical_form_trimming_whitespace_from_short_string | 0.018708244 | 0.01923846 | 0.0187952384 | 0.00015906405380027787 |
| http_headers_canonical_form_trimming_whitespace_from_long_string | 0.030301067 | 0.030804568 | 0.030385388800000003 | 0.00015088716639322867 |
| bytebuffer_write_12MB_short_string_literals | 0.143270983 | 0.14943897 | 0.1441855301 | 0.0018536697851552317 |
| bytebuffer_write_12MB_short_calculated_strings | 0.067587874 | 0.069486342 | 0.0687012671 | 0.0005389776086500247 |
| bytebuffer_write_12MB_medium_string_literals | 0.938363651 | 0.97485219 | 0.9508204281999999 | 0.013250425598645685 |
| bytebuffer_write_12MB_medium_calculated_strings | 0.086556923 | 0.089021016 | 0.0870135612 | 0.000731880891976859 |
| bytebuffer_write_12MB_large_calculated_strings | 0.163417139 | 0.164472042 | 0.1641449972 | 0.0003404484955280193 |
| bytebuffer_lots_of_rw | 0.044265314 | 0.044929763 | 0.044431870299999995 | 0.00023316710060290754 |
| bytebuffer_write_http_response_ascii_only_as_string | 0.029828004 | 0.030381939 | 0.0299376602 | 0.00016310420389622758 |
| bytebuffer_write_http_response_ascii_only_as_staticstring | 0.029231652 | 0.029859072 | 0.0294445389 | 0.00017518514336073792 |
| bytebuffer_write_http_response_some_nonascii_as_string | 0.028767805 | 0.029312969 | 0.0288888285 | 0.00021134165086169015 |
| bytebuffer_write_http_response_some_nonascii_as_staticstring | 0.028939677 | 0.030695064 | 0.029339388700000003 | 0.0005196977050649629 |
| no-net_http1_1k_reqs_1_conn | 0.011615747 | 0.012100875 | 0.0117296609 | 0.00013522579434613994 |
| http1_1k_reqs_1_conn | 0.060492661 | 0.061901803 | 0.0612277168 | 0.0004381693782502116 |
| http1_1k_reqs_100_conns | 0.090465821 | 0.090860192 | 0.0906602483 | 0.00011466376957197908 |
| future_whenallsucceed_100k_immediately_succeeded_off_loop | 0.080549118 | 0.082637506 | 0.081468365 | 0.0007871206764443573 |
| future_whenallsucceed_100k_immediately_succeeded_on_loop | 0.080940765 | 0.088269588 | 0.0824305449 | 0.002134804165856356 |
| future_whenallsucceed_10k_deferred_off_loop | 0.023354389 | 0.023773316 | 0.023462138 | 0.0001324403603740188 |
| future_whenallsucceed_10k_deferred_on_loop | 0.014468765 | 0.014600609 | 0.0145289954 | 4.971332905815584e-05 |
| future_whenallcomplete_100k_immediately_succeeded_off_loop | 0.040924739 | 0.041610577 | 0.041152195600000004 | 0.00023931122711101126 |
| future_whenallcomplete_100k_immediately_succeeded_on_loop | 0.041419036 | 0.041985228 | 0.0416712287 | 0.0001902419231779779 |
| future_whenallcomplete_10k_deferred_off_loop | 0.016106523 | 0.017959537 | 0.0167431273 | 0.0006612598717008233 |
| future_whenallcomplete_100k_deferred_on_loop | 0.084619949 | 0.087602916 | 0.08551536779999999 | 0.0008921542183957266 |
| future_reduce_10k_futures | 0.017307059 | 0.017845536 | 0.0175047041 | 0.00015765937100206426 |
| future_reduce_into_10k_futures | 0.015271552 | 0.015405853 | 0.0153215517 | 4.1640959098918226e-05 |
| channel_pipeline_1m_events | 0.099658043 | 0.099798237 | 0.09974338660000001 | 4.948307737039316e-05 |
| websocket_encode_50b_space_at_front_100k_frames_cow | 0.049749614 | 0.050189388 | 0.0498925438 | 0.00020159322906883591 |
| websocket_encode_50b_space_at_front_1m_frames_cow_masking | 0.657249144 | 0.660868877 | 0.6581938381000001 | 0.001124422093879371 |
| websocket_encode_1kb_space_at_front_1m_frames_cow | 0.526078369 | 0.526747823 | 0.5263413171 | 0.00018762750014942448 |
| websocket_encode_50b_no_space_at_front_100k_frames_cow | 0.050102496 | 0.05058825 | 0.0502541197 | 0.0002134421036789604 |
| websocket_encode_1kb_no_space_at_front_100k_frames_cow | 0.052487978 | 0.052930183 | 0.052637525500000004 | 0.0001995756449003469 |
| websocket_encode_50b_space_at_front_100k_frames | 0.073903666 | 0.074350969 | 0.0741008752 | 0.00020326936093480533 |
| websocket_encode_50b_space_at_front_10k_frames_masking | 0.00889408 | 0.008927912 | 0.008907087599999999 | 9.962841542451471e-06 |
| websocket_encode_1kb_space_at_front_10k_frames | 0.012442082 | 0.012877802 | 0.0125207999 | 0.00013005579675017288 |
| websocket_encode_50b_no_space_at_front_100k_frames | 0.071874749 | 0.072902787 | 0.0723683473 | 0.0003529541819519128 |
| websocket_encode_1kb_no_space_at_front_10k_frames | 0.011704753 | 0.011798532 | 0.011730439300000001 | 3.002173715104898e-05 |
| websocket_decode_125b_10k_frames | 0.012596187 | 0.013051204 | 0.012710906 | 0.0001339040473863448 |
| websocket_decode_125b_with_a_masking_key_10k_frames | 0.013026622 | 0.01625301 | 0.0137532196 | 0.0011449994208533418 |
| websocket_decode_64kb_10k_frames | 0.012870337 | 0.013384654 | 0.0130012155 | 0.00014205029683355097 |
| websocket_decode_64kb_with_a_masking_key_10k_frames | 0.013328684 | 0.013495727 | 0.0134062059 | 5.629989460509189e-05 |
| websocket_decode_64kb_+1_10k_frames | 0.012897385 | 0.016614399 | 0.013328305499999998 | 0.0011557375071808039 |
| websocket_decode_64kb_+1_with_a_masking_key_10k_frames | 0.013289503 | 0.013809198 | 0.0134011819 | 0.00014745835970232413 |
| circular_buffer_into_byte_buffer_1kb | 0.033002613 | 0.033536173 | 0.0331520546 | 0.00018518534877924055 |
| circular_buffer_into_byte_buffer_1mb | 0.064661982 | 0.065130012 | 0.0648244472 | 0.00020092638953595694 |
| byte_buffer_view_iterator_1mb | 0.01756013 | 0.018081564 | 0.0176232754 | 0.00016123574967605693 |
| byte_buffer_view_contains_12mb | 0.052910349 | 0.053560145 | 0.0531561787 | 0.00021187141882076117 |
| byte_to_message_decoder_decode_many_small | 0.041325565 | 0.041860639 | 0.0415019664 | 0.00023618235080039308 |
| generate_10k_random_request_keys | 0.091185533 | 0.091505915 | 0.09138744169999999 | 0.00010962978709684585 |
| bytebuffer_rw_10_uint32s | 0.04080077 | 0.041416125 | 0.0409719066 | 0.00021478161702229284 |
| bytebuffer_multi_rw_10_uint32s | 0.074633317 | 0.075221512 | 0.0748953122 | 0.00024639593300070653 |
| lock_1_thread_10M_ops | 0.151529459 | 0.152741502 | 0.1520439887 | 0.0003694800419021466 |
| lock_2_threads_10M_ops | 0.786501782 | 0.909838773 | 0.8525907718999999 | 0.03231466284401043 |
| lock_4_threads_10M_ops | 0.937752797 | 0.959532161 | 0.9473351966999999 | 0.007973824451028328 |
| lock_8_threads_10M_ops | 0.957658591 | 0.987632844 | 0.9778233794 | 0.008966224334225099 |
| schedule_100k_tasks | 0.063766844 | 0.105724529 | 0.07314636890000001 | 0.012951859063592227 |
| schedule_and_run_100k_tasks | 0.252803345 | 0.267169133 | 0.2608068538 | 0.004253384282073607 |
| execute_100k_tasks | 0.103045814 | 0.105475272 | 0.1042747825 | 0.0009121376774441264 |
| bytebufferview_copy_to_array_100k_times_1kb | 0.010984296 | 0.011033014 | 0.0109959675 | 1.4597997512977405e-05 |
| circularbuffer_copy_to_array_10k_times_1kb | 0.019746973 | 0.020199469 | 0.019804835 | 0.00013886138398657403 |
| deadline_now_1M_times | 0.024568465 | 0.024832095 | 0.0246682263 | 9.23638256450479e-05 |
| asyncwriter_single_writes_1M_times | 1.464787299 | 1.467467645 | 1.4662272632 | 0.0008228612221267796 |
| asyncsequenceproducer_consume_1M_times | 0.907417083 | 0.910416828 | 0.9089906522 | 0.0010595941502413693 |
| udp_10k_writes | 0.37901331 | 0.379875118 | 0.3793720076 | 0.0002815189041030453 |
| udp_10k_vector_writes | 0.205883308 | 0.206418052 | 0.20622898890000002 | 0.00016904891221474557 |
| udp_10k_vector_reads | 0.386684625 | 0.387768161 | 0.3872836356 | 0.00033373534307398853 |
| udp_10k_vector_reads_and_writes | 0.109082179 | 0.109593621 | 0.1093604517 | 0.00017203582618684093 |
| tcp_100k_messages_throughput | 0.75330207 | 0.787823236 | 0.7734669324000001 | 0.010813256483347567 |
comparison
| name | current | previous | winner | diff |
| --- | --- | --- | --- | --- |
| write_http_headers | 0.042907723 | 0.042886202 | previous | 0% |
| http_headers_canonical_form | 0.10455533 | 0.106193642 | current | -1% |
| http_headers_canonical_form_trimming_whitespace | 0.020678702 | 0.021160017 | current | -2% |
| http_headers_canonical_form_trimming_whitespace_from_short_string | 0.018708244 | 0.019237102 | current | -2% |
| http_headers_canonical_form_trimming_whitespace_from_long_string | 0.030301067 | 0.031139957 | current | -2% |
| bytebuffer_write_12MB_short_string_literals | 0.143270983 | 0.143459794 | current | 0% |
| bytebuffer_write_12MB_short_calculated_strings | 0.067587874 | 0.07066772 | current | -4% |
| bytebuffer_write_12MB_medium_string_literals | 0.938363651 | 0.94105786 | current | 0% |
| bytebuffer_write_12MB_medium_calculated_strings | 0.086556923 | 0.08698647 | current | 0% |
| bytebuffer_write_12MB_large_calculated_strings | 0.163417139 | 0.165702724 | current | -1% |
| bytebuffer_lots_of_rw | 0.044265314 | 0.043246136 | previous | 2% |
| bytebuffer_write_http_response_ascii_only_as_string | 0.029828004 | 0.028208719 | previous | 5% |
| bytebuffer_write_http_response_ascii_only_as_staticstring | 0.029231652 | 0.028714732 | previous | 1% |
| bytebuffer_write_http_response_some_nonascii_as_string | 0.028767805 | 0.027803065 | previous | 3% |
| bytebuffer_write_http_response_some_nonascii_as_staticstring | 0.028939677 | 0.028839596 | previous | 0% |
| no-net_http1_1k_reqs_1_conn | 0.011615747 | 0.011778778 | current | -1% |
| http1_1k_reqs_1_conn | 0.060492661 | 0.061404357 | current | -1% |
| http1_1k_reqs_100_conns | 0.090465821 | 0.09061921 | current | 0% |
| future_whenallsucceed_100k_immediately_succeeded_off_loop | 0.080549118 | 0.080259785 | previous | 0% |
| future_whenallsucceed_100k_immediately_succeeded_on_loop | 0.080940765 | 0.079877066 | previous | 1% |
| future_whenallsucceed_10k_deferred_off_loop | 0.023354389 | 0.023212502 | previous | 0% |
| future_whenallsucceed_10k_deferred_on_loop | 0.014468765 | 0.014316848 | previous | 1% |
| future_whenallcomplete_100k_immediately_succeeded_off_loop | 0.040924739 | 0.040145402 | previous | 1% |
| future_whenallcomplete_100k_immediately_succeeded_on_loop | 0.041419036 | 0.0405237 | previous | 2% |
| future_whenallcomplete_10k_deferred_off_loop | 0.016106523 | 0.015676013 | previous | 2% |
| future_whenallcomplete_100k_deferred_on_loop | 0.084619949 | 0.08085791 | previous | 4% |
| future_reduce_10k_futures | 0.017307059 | 0.016911554 | previous | 2% |
| future_reduce_into_10k_futures | 0.015271552 | 0.014511281 | previous | 5% |
| channel_pipeline_1m_events | 0.099658043 | 0.101659459 | current | -1% |
| websocket_encode_50b_space_at_front_100k_frames_cow | 0.049749614 | 0.049812283 | current | 0% |
| websocket_encode_50b_space_at_front_1m_frames_cow_masking | 0.657249144 | 0.668089258 | current | -1% |
| websocket_encode_1kb_space_at_front_1m_frames_cow | 0.526078369 | 0.523242559 | previous | 0% |
| websocket_encode_50b_no_space_at_front_100k_frames_cow | 0.050102496 | 0.04962388 | previous | 0% |
| websocket_encode_1kb_no_space_at_front_100k_frames_cow | 0.052487978 | 0.052218856 | previous | 0% |
| websocket_encode_50b_space_at_front_100k_frames | 0.073903666 | 0.072742069 | previous | 1% |
| websocket_encode_50b_space_at_front_10k_frames_masking | 0.00889408 | 0.008845607 | previous | 0% |
| websocket_encode_1kb_space_at_front_10k_frames | 0.012442082 | 0.012337981 | previous | 0% |
| websocket_encode_50b_no_space_at_front_100k_frames | 0.071874749 | 0.07207833 | current | 0% |
| websocket_encode_1kb_no_space_at_front_10k_frames | 0.011704753 | 0.011690726 | previous | 0% |
| websocket_decode_125b_10k_frames | 0.012596187 | 0.012334842 | previous | 2% |
| websocket_decode_125b_with_a_masking_key_10k_frames | 0.013026622 | 0.01274516 | previous | 2% |
| websocket_decode_64kb_10k_frames | 0.012870337 | 0.012671642 | previous | 1% |
| websocket_decode_64kb_with_a_masking_key_10k_frames | 0.013328684 | 0.013136916 | previous | 1% |
| websocket_decode_64kb_+1_10k_frames | 0.012897385 | 0.012642493 | previous | 2% |
| websocket_decode_64kb_+1_with_a_masking_key_10k_frames | 0.013289503 | 0.013195296 | previous | 0% |
| circular_buffer_into_byte_buffer_1kb | 0.033002613 | 0.033011484 | current | 0% |
| circular_buffer_into_byte_buffer_1mb | 0.064661982 | 0.06466184 | previous | 0% |
| byte_buffer_view_iterator_1mb | 0.01756013 | 0.017563643 | current | 0% |
| byte_buffer_view_contains_12mb | 0.052910349 | 0.052952322 | current | 0% |
| byte_to_message_decoder_decode_many_small | 0.041325565 | 0.041571445 | current | 0% |
| generate_10k_random_request_keys | 0.091185533 | 0.090277131 | previous | 1% |
| bytebuffer_rw_10_uint32s | 0.04080077 | 0.041266035 | current | -1% |
| bytebuffer_multi_rw_10_uint32s | 0.074633317 | 0.072410584 | previous | 3% |
| lock_1_thread_10M_ops | 0.151529459 | 0.15131291 | previous | 0% |
| lock_2_threads_10M_ops | 0.786501782 | 0.820194284 | current | -4% |
| lock_4_threads_10M_ops | 0.937752797 | 0.87456686 | previous | 7% |
| lock_8_threads_10M_ops | 0.957658591 | 0.873560162 | previous | 9% |
| schedule_100k_tasks | 0.063766844 | 0.062177021 | previous | 2% |
| schedule_and_run_100k_tasks | 0.252803345 | 0.233980813 | previous | 8% |
| execute_100k_tasks | 0.103045814 | 0.099815383 | previous | 3% |
| bytebufferview_copy_to_array_100k_times_1kb | 0.010984296 | 0.010981564 | previous | 0% |
| circularbuffer_copy_to_array_10k_times_1kb | 0.019746973 | 0.019756913 | current | 0% |
| deadline_now_1M_times | 0.024568465 | 0.024640981 | current | 0% |
| asyncwriter_single_writes_1M_times | 1.464787299 | 1.596832264 | current | -8% |
| asyncsequenceproducer_consume_1M_times | 0.907417083 | 0.885448468 | previous | 2% |
| udp_10k_writes | 0.37901331 | 0.375730776 | previous | 0% |
| udp_10k_vector_writes | 0.205883308 | 0.204086694 | previous | 0% |
| udp_10k_vector_reads | 0.386684625 | 0.38397455 | previous | 0% |
| udp_10k_vector_reads_and_writes | 0.109082179 | 0.10824488 | previous | 0% |
| tcp_100k_messages_throughput | 0.75330207 | 0.778933674 | current | -3% |
significant differences found
@swift-nio-bot perf test please
@weissi re https://github.com/apple/swift-nio/pull/2650#issuecomment-1943515397
That doesn't sound ideal as if it's actually the case that the others aren't drained, then this will now cause high CPU load and expects us to have #runs CPUs available etc.
No, because the tasks never run. They are only scheduled for some future date (one that never arrives during the test run), so the tasks scheduled in previous runs are effectively dormant.
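A minimal sketch of why far-future tasks cost no CPU while still occupying the queue. It uses a hypothetical logical clock and a toy `ModelEventLoop`, neither of which is NIO API; a real event loop would keep tasks in a priority queue ordered by deadline rather than scanning an array.

```swift
// A task that becomes runnable once the logical clock reaches `deadline`.
struct ScheduledTask {
    let deadline: Int
    let body: () -> Void
}

// Toy event loop: hypothetical, for illustration only.
final class ModelEventLoop {
    private var tasks: [ScheduledTask] = []
    private(set) var executed = 0

    var pendingCount: Int { tasks.count }

    func schedule(at deadline: Int, _ body: @escaping () -> Void) {
        tasks.append(ScheduledTask(deadline: deadline, body: body))
    }

    // Runs only the tasks that are due at `now`; the rest stay queued.
    func tick(now: Int) {
        let due = tasks.filter { $0.deadline <= now }
        tasks.removeAll { $0.deadline <= now }
        for task in due {
            task.body()
            executed += 1
        }
    }
}

let loop = ModelEventLoop()
let farFuture = 3_600  // assumed deadline well beyond the test duration

// Schedule work that is never due while the "test" is running.
for _ in 0..<1_000 {
    loop.schedule(at: farFuture) {}
}

// The test only advances the clock a little.
for now in 0..<10 {
    loop.tick(now: now)
}

print("executed:", loop.executed, "still pending:", loop.pendingCount)
```

In this model nothing is executed, so no extra CPU is spent, yet all 1,000 tasks remain queued. That matches the point above: dormant tasks do not run, but they still take up space in the loop's task storage.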