self-test disk test enhancements

Open travisdowns opened this issue 1 year ago • 3 comments

rpk: add additional disk self tests

Add 16K block size disk tests, a common block size written by Redpanda, at varying IO depths: 1, 8 and 32 times the shard count (the multiplication by the shard count happens in Redpanda and is inevitable).

This will help better assess the performance of block storage that is a bit outside the usual, in particular how it responds to io depth changes.

Additionally, add a 4K test which is the same as the existing one but with dsync off. This is critical to assess the impact of fdatasync on the storage layer: locally, on my consumer SSD, this makes a 257x difference (!!) in throughput, though the effect is much more muted, perhaps close to zero, on other SSD types.
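
To get a rough feel for the dsync effect described above outside of Redpanda, something like the following sketch can be used. It is not how the self test is implemented: it assumes a Unix-like system, uses buffered writes through the page cache rather than whatever IO path the self test uses, and calls fdatasync after each write as an approximation of dsync. The function name `timed_writes` and its parameters are made up for illustration.

```python
import os
import tempfile
import time

# os.fdatasync is not available on every platform; fall back to os.fsync.
_sync = getattr(os, "fdatasync", os.fsync)


def timed_writes(path, block_size=4096, count=128, dsync=True):
    """Write `count` blocks sequentially and return the elapsed seconds.

    With dsync=True the data is flushed to the device after every write,
    which approximates (but is not identical to) O_DSYNC semantics.
    """
    block = b"\0" * block_size
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            if dsync:
                _sync(fd)
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "probe.bin")
        synced = timed_writes(target, dsync=True)
        unsynced = timed_writes(target, dsync=False)
        print(f"dsync on:  {synced:.4f}s")
        print(f"dsync off: {unsynced:.4f}s")
```

The ratio between the two timings will vary a lot by device and filesystem, so treat the numbers as a qualitative indication only.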

On the redpanda side, when we complete a self test the API returns info about the run, including an info field which currently says "write run" (for a disk test). Enhance this to include information about whether dsync was enabled and the total io depth (which is the client-specified parallelism value times the number of shards).
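
As a minimal sketch of what the enriched info field could look like, assuming the total io depth is defined as in this description (parallelism times shard count): the helper names below are hypothetical and are not Redpanda's actual code, and the string layout simply mirrors the "write run (iodepth: ..., dsync: ...)" form that appears in the example output posted later in this thread.

```python
def total_io_depth(parallelism: int, shards: int) -> int:
    """Total io depth as described in this issue: parallelism x shard count."""
    return parallelism * shards


def describe_write_run(io_depth: int, dsync: bool) -> str:
    """Build an info string for a disk write run (hypothetical helper)."""
    return f"write run (iodepth: {io_depth}, dsync: {str(dsync).lower()})"


if __name__ == "__main__":
    # Example values only: a parallelism of 8 on a node with 4 shards.
    print(describe_write_run(total_io_depth(8, 4), dsync=True))
```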

Backports Required

  • [x] none - not a bug fix
  • [ ] none - this is a backport
  • [ ] none - issue does not exist in previous branches
  • [ ] none - papercut/not impactful enough to backport
  • [ ] v24.1.x
  • [ ] v23.3.x
  • [ ] v23.2.x

Release Notes

  • none

travisdowns avatar Jun 27 '24 20:06 travisdowns

This will slow down the test quite a bit but I guess that's not really a problem

If we want more data points and we want to keep the same duration per test, I don't really see an alternative to that. However, we could always reduce the default per-test duration if the overall current duration (2 minutes, at default per-test duration) is a "sweet spot" or something like that.

Note that most of these newly added tests have skipRead=true so they are half the time of the existing tests, so the time expansion is actually half of what you'd guess by looking at it. The increase is 4 tests -> 8 tests, so 2 minutes to 4 minutes at default duration.

travisdowns avatar Jun 28 '24 20:06 travisdowns

Stupid "close with comment" button sitting there looking so pressable.

travisdowns avatar Jun 28 '24 20:06 travisdowns

Updated in push: https://github.com/redpanda-data/redpanda/commit/0c1753babad028bbaf37a50cd116e464d03cc272

  • Removed the io_depth() method and stopped assuming the parallelism was multiplied by the shard count.
  • Changed the "io depth" sequence from 16K to 4K. Except for the iodepth=1 test, only the write test is done. Kept one 16K r/w test at 64 io depth. The no dsync test is at 4K, 64 io depth.
  • Removed ", dsync" from the description of the 512k r/w test since it doesn't make sense for the "read" part.
  • Fixed tests that said r/w when they were actually only write.
  • Aligned --help output with these changes.

Example output after this change:

NODE ID: 0 | STATUS: IDLE
=========================
NAME        512KB sequential r/w
INFO        write run (iodepth: 4, dsync: true)
TYPE        disk
TEST ID     931e192d-2133-4304-b093-3586d18b0c56
TIMEOUTS    0
DURATION    1009ms
IOPS        425 req/sec
THROUGHPUT  212.5MiB/sec
LATENCY     P50     P90      P99      P999     MAX
            9215us  11775us  14847us  21503us  21503us

NAME        512KB sequential r/w
INFO        read run
TYPE        disk
TEST ID     931e192d-2133-4304-b093-3586d18b0c56
TIMEOUTS    0
DURATION    1000ms
IOPS        10147 req/sec
THROUGHPUT  4.955GiB/sec
LATENCY     P50    P90    P99    P999    MAX
            247us  639us  799us  1087us  1215us

NAME        4KB sequential r/w, low io depth
INFO        write run (iodepth: 1, dsync: true)
TYPE        disk
TEST ID     931e192d-2133-4304-b093-3586d18b0c56
TIMEOUTS    0
DURATION    1002ms
IOPS        414 req/sec
THROUGHPUT  1.617MiB/sec
LATENCY     P50     P90     P99     P999    MAX
            2431us  2559us  2687us  5887us  5887us

NAME        4KB sequential r/w, low io depth
INFO        read run
TYPE        disk
TEST ID     931e192d-2133-4304-b093-3586d18b0c56
TIMEOUTS    0
DURATION    1000ms
IOPS        621714 req/sec
THROUGHPUT  2.372GiB/sec
LATENCY     P50   P90   P99   P999  MAX
            1us   1us   2us   23us  543us

NAME        4KB sequential write, medium io depth
INFO        write run (iodepth: 8, dsync: true)
TYPE        disk
TEST ID     931e192d-2133-4304-b093-3586d18b0c56
TIMEOUTS    0
DURATION    1014ms
IOPS        523 req/sec
THROUGHPUT  2.043MiB/sec
LATENCY     P50      P90      P99      P999     MAX
            15871us  16383us  20479us  20479us  21503us

NAME        4KB sequential write, high io depth
INFO        write run (iodepth: 64, dsync: true)
TYPE        disk
TEST ID     931e192d-2133-4304-b093-3586d18b0c56
TIMEOUTS    0
DURATION    1115ms
IOPS        607 req/sec
THROUGHPUT  2.371MiB/sec
LATENCY     P50       P90       P99       P999      MAX
            118783us  126975us  139263us  139263us  180223us

NAME      4KB sequential write, very high io depth
TYPE      disk
TEST ID   931e192d-2133-4304-b093-3586d18b0c56
TIMEOUTS  0
DURATION  0ms
ERROR     IO Queue depth (parallelism) out of range, min is 1, max 256

NAME        4KB sequential write, no dsync
INFO        write run (iodepth: 64, dsync: false)
TYPE        disk
TEST ID     931e192d-2133-4304-b093-3586d18b0c56
TIMEOUTS    0
DURATION    1000ms
IOPS        366771 req/sec
THROUGHPUT  1.399GiB/sec
LATENCY     P50    P90    P99    P999   MAX
            167us  231us  303us  735us  1151us

NAME        16KB sequential r/w, high io depth
INFO        write run (iodepth: 64, dsync: false)
TYPE        disk
TEST ID     931e192d-2133-4304-b093-3586d18b0c56
TIMEOUTS    0
DURATION    1000ms
IOPS        195040 req/sec
THROUGHPUT  2.976GiB/sec
LATENCY     P50    P90    P99    P999   MAX
            319us  367us  431us  479us  543us

NAME        16KB sequential r/w, high io depth
INFO        read run
TYPE        disk
TEST ID     931e192d-2133-4304-b093-3586d18b0c56
TIMEOUTS    0
DURATION    1000ms
IOPS        197272 req/sec
THROUGHPUT  3.01GiB/sec
LATENCY     P50    P90    P99    P999   MAX
            335us  367us  463us  639us  1023us
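
The THROUGHPUT figures above can be cross-checked against IOPS, since throughput should be roughly IOPS times the request size. The sketch below does that for a few of the runs, under the assumption that "KB" in the test names means 1024 bytes (which is consistent with the reported MiB/GiB values); the helper name is made up for illustration.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def throughput_bytes_per_sec(iops: int, block_size_bytes: int) -> int:
    """Throughput implied by a request rate and a fixed request size."""
    return iops * block_size_bytes


# (test, IOPS copied from the output above, assumed request size in bytes)
samples = [
    ("512KB write", 425, 512 * KIB),
    ("4KB read, low io depth", 621714, 4 * KIB),
    ("16KB write, high io depth", 195040, 16 * KIB),
]

if __name__ == "__main__":
    for name, iops, size in samples:
        bps = throughput_bytes_per_sec(iops, size)
        print(f"{name}: {bps / MIB:.1f} MiB/sec ({bps / GIB:.3f} GiB/sec)")
```

For these three runs the computed values agree with the reported 212.5MiB/sec, 2.372GiB/sec and 2.976GiB/sec.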

The help output:

Starts one or more benchmark tests on one or more nodes
of the cluster. Available tests to run:

* Disk tests:
  * Throughput test: 512 KB messages, sequential read/write
    * Uses a larger request message sizes and deeper I/O queue depth to write/read more bytes in a shorter amount of time, at the cost of IOPS/latency.
  * Latency and io depth tests: 4 KB messages, sequential read/write, varying io depth
    * Uses small IO sizes and varying levels of parallelism to determine the relationship between io depth and IOPS
        * Includes one test without using dsync (fdatasync) on each write to establish the cost of dsync
  * 16 KB test
    * One high io depth test at 16 KB to reflect performance at Redpanda's default chunk size
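
The relationship between io depth and IOPS mentioned in the help text can be sanity-checked with Little's law: if the queue is kept full, average latency is approximately io depth divided by IOPS. The sketch below applies that to the dsync write runs from the example output. It assumes the reported iodepth is the number of requests actually in flight, and it compares against P50, which is a median rather than a mean, so only rough agreement should be expected.

```python
def expected_latency_us(io_depth: int, iops: float) -> float:
    """Little's law estimate of mean latency in microseconds."""
    return io_depth / iops * 1_000_000


# (iodepth, IOPS, reported P50 in us), copied from the example output
runs = [
    (1, 414, 2431),
    (4, 425, 9215),
    (8, 523, 15871),
    (64, 607, 118783),
]

if __name__ == "__main__":
    for depth, iops, p50 in runs:
        estimate = expected_latency_us(depth, iops)
        print(f"iodepth={depth:>2}: estimated {estimate:,.0f}us, reported P50 {p50}us")
```

On this device IOPS barely moves as io depth grows from 1 to 64 with dsync on, so the extra parallelism shows up almost entirely as added latency.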

travisdowns avatar Jun 29 '24 03:06 travisdowns

/dt

travisdowns avatar Jul 01 '24 02:07 travisdowns

/dt

travisdowns avatar Jul 02 '24 02:07 travisdowns

/ci-repeat 1

travisdowns avatar Jul 02 '24 19:07 travisdowns

/ci-repeat 1

travisdowns avatar Jul 05 '24 03:07 travisdowns

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51128#0190812c-f3ce-4e04-840f-426fdcd3fac9: pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51128#0190812c-f3cf-4a1d-97e8-aec9c71db760: pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51128#0190812c-f3d1-4683-8a5f-77831d2deecd: pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51128#0190812c-f3cc-455c-b0d2-212dcdab44f1: pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51128#0190812e-d95c-486b-9d4d-89b31bda8c5b: pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51128#0190812e-d95e-4eb2-b6b1-7dd3881feba2: pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51128#0190812e-d957-4a96-b015-473add4dc93b: pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51128#0190812e-d959-46e9-b67c-0e9571b17b33: pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51441#0190a8a4-3b07-4103-86be-2e71180e4479: pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51441#0190a8a4-3b05-4dc7-9d4b-4878bd5eb84b: pandatriage cache was not found

skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51441#0190a8bc-745d-4058-9d93-e84bc9906f5d: pandatriage cache was not found

vbotbuildovich avatar Jul 05 '24 05:07 vbotbuildovich

OK these remaining errors seem legit, looking.

travisdowns avatar Jul 08 '24 15:07 travisdowns

https://github.com/redpanda-data/redpanda/commit/365233c588acc7de032b236eea01eb6d315f6a35 is a pure rebase.

https://github.com/redpanda-data/redpanda/pull/20590/commits/f01195aaed0ed0637aff5fdda06b78041a197810 changes the max iodepth in the new tests to 256 from 512, as RP has a hardcoded limit of 256 in the self test code. I also considered increasing this limit from 256 to 512 on the RP side but then we'd have issues running self-test in cases where the RPK version was newer than the Redpanda version, which is a supported and I think fairly common scenario, so I decided to change RPK instead.

This should fix the test failures.
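
For illustration, a client-side guard for this limit could look like the sketch below. It is hypothetical and is neither rpk's nor Redpanda's actual code; the bounds are taken from the "min is 1, max 256" error shown in the example output earlier in this thread.

```python
MIN_PARALLELISM = 1
MAX_PARALLELISM = 256  # limit reported by the self test error message


def check_parallelism(parallelism: int) -> int:
    """Return `parallelism` unchanged if it is within the accepted range.

    Raises ValueError otherwise, so an out-of-range test is rejected
    before it is submitted rather than failing on the broker.
    """
    if not MIN_PARALLELISM <= parallelism <= MAX_PARALLELISM:
        raise ValueError(
            f"IO queue depth (parallelism) {parallelism} out of range, "
            f"min is {MIN_PARALLELISM}, max {MAX_PARALLELISM}"
        )
    return parallelism


if __name__ == "__main__":
    for value in (1, 256, 512):
        try:
            print(value, "ok:", check_parallelism(value))
        except ValueError as err:
            print(value, "rejected:", err)
```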

travisdowns avatar Jul 11 '24 17:07 travisdowns

new failures in https://buildkite.com/redpanda/redpanda/builds/51367#0190a306-08a6-4d35-b300-695b00ac2af8:

"rptest.tests.self_test_test.SelfTestTest.test_self_test.remote_read=False.remote_write=False"

new failures in https://buildkite.com/redpanda/redpanda/builds/51367#0190a306-08a8-4562-a22d-e76610db099a:

"rptest.tests.self_test_test.SelfTestTest.test_self_test.remote_read=False.remote_write=True"
"rptest.tests.self_test_test.SelfTestTest.test_self_test_node_crash"

new failures in https://buildkite.com/redpanda/redpanda/builds/51367#0190a306-08aa-43eb-8b38-c266ab6f395f:

"rptest.tests.self_test_test.SelfTestTest.test_self_test.remote_read=True.remote_write=False"

new failures in https://buildkite.com/redpanda/redpanda/builds/51367#0190a306-08ac-4d5b-81cd-1ebe01054b59:

"rptest.tests.self_test_test.SelfTestTest.test_self_test.remote_read=True.remote_write=True"

new failures in https://buildkite.com/redpanda/redpanda/builds/51367#0190a307-bacc-47d3-ab43-fa4842e939fd:

"rptest.tests.self_test_test.SelfTestTest.test_self_test.remote_read=False.remote_write=True"
"rptest.tests.self_test_test.SelfTestTest.test_self_test_node_crash"

new failures in https://buildkite.com/redpanda/redpanda/builds/51367#0190a307-baca-43b9-b035-d624e18befca:

"rptest.tests.self_test_test.SelfTestTest.test_self_test.remote_read=False.remote_write=False"

new failures in https://buildkite.com/redpanda/redpanda/builds/51367#0190a307-bac8-4f3f-b1d7-0a027f3a1d46:

"rptest.tests.self_test_test.SelfTestTest.test_self_test.remote_read=True.remote_write=True"

new failures in https://buildkite.com/redpanda/redpanda/builds/51367#0190a307-bace-4ce7-865c-f95725cb07e4:

"rptest.tests.self_test_test.SelfTestTest.test_self_test.remote_read=True.remote_write=False"

vbotbuildovich avatar Jul 11 '24 19:07 vbotbuildovich

Hopefully this last push fixes all the failures. All the tests were passing for me locally, but it turned out that was just because of https://github.com/redpanda-data/vtools/pull/2950 not rebuilding my RPK.

travisdowns avatar Jul 12 '24 19:07 travisdowns

All spurious failures, retrying.

travisdowns avatar Jul 15 '24 14:07 travisdowns

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/51441#0190b6b2-8b1b-47d6-ac1c-278a444551c1

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/51532#0190b7dd-5b51-42bf-a583-11a55136488d

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/51669#0190c1d0-6076-404d-a63a-326257de411d

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/51730#0190c736-8f0e-4639-8b72-755d2feabab0

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/51755#0190c8f2-4424-407b-80f1-45c68a49dd59

vbotbuildovich avatar Jul 15 '24 15:07 vbotbuildovich

/ci-repeat 1

travisdowns avatar Jul 15 '24 16:07 travisdowns

/ci-repeat 1

travisdowns avatar Jul 15 '24 18:07 travisdowns

Spurious GH download failure in last run.

travisdowns avatar Jul 15 '24 18:07 travisdowns

/ci-repeat 1

travisdowns avatar Jul 17 '24 16:07 travisdowns

Last failure was a merge conflict, fixed. Hopefully this CI run is the one.

travisdowns avatar Jul 18 '24 14:07 travisdowns

https://github.com/redpanda-data/redpanda/commit/bd8b94da1e89bff11aa845455270a9dd32508198 is to fix yet another merge conflict (what's up with my luck on this change?).

travisdowns avatar Jul 19 '24 01:07 travisdowns