redpanda icon indicating copy to clipboard operation
redpanda copied to clipboard

bytes: replacement bytes implementation for libc++18

Open dotnwat opened this issue 1 year ago • 2 comments

libc++ >= v18 have deprecated (to be removed in v19) char_traits<T> for T other than char (and some other types, like wchar). our bytes implementation is uses T=uint8_t and because seastar::sstring interoperates with std::string/std::string_view, we encounter the deprecation.

this PR introduces a new bytes implementation that wraps a seastar::sstring, and casts back and forth between pointers as needed.

Backports Required

  • [x] none - not a bug fix
  • [ ] none - this is a backport
  • [ ] none - issue does not exist in previous branches
  • [ ] none - papercut/not impactful enough to backport
  • [ ] v24.2.x
  • [ ] v24.1.x
  • [ ] v23.3.x

Release Notes

  • none

dotnwat avatar Aug 27 '24 04:08 dotnwat

~I will probably switch this over to use abseil inlinedvector after examining it i think we can achieve a similar uninitialized allocation optimization.~

dotnwat avatar Aug 27 '24 19:08 dotnwat

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/53648#01919687-a202-4ac3-8e9b-332abfcecaef

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54467#0191ecb4-4b10-4abb-95f9-108199563778

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54467#0191edb6-6c7f-4622-ae75-732df0996cf2

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54499#0191f739-8cf9-417c-8997-287c31d17996

ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/54905#01921b57-4157-4d81-9700-0e4b9b9330bb

vbotbuildovich avatar Aug 28 '24 02:08 vbotbuildovich

@travisdowns @StephanDollberg @rockwotj in the interest of avoiding the UB concern entirely (but like travis mentioned, perhaps its UB that we are ok with), i added a new commit that implements bytes in terms of absl::inlined_vector, and created a benchmark for a few cases:

  1. initialized_later
  2. zero initialization
  3. append

bytes.*: abseil version sstring.*: sstring version

as expected, initialized_later is faster with sstring because abseil interface doesn't offer the option to skip initialization (in this benchmark it is zero initialization).

roughly, up to 128K sizes, we'd pay ~microsecond of overhead. presumably we could also go hunting around the code base for usages of bytes() that are in a hot path and change how the bytes type is used if they show up in a profile?

14: test                              iterations      median         mad         min         max      allocs       tasks        inst
14: bytes.initialized_later_0          459500000     1.373ns     0.000ns     1.373ns     1.375ns       0.000       0.000        17.3
14: sstring.initialized_later_0        503104000     1.189ns     0.000ns     1.187ns     1.190ns       0.000       0.000        10.3
14: bytes.initialized_later_10         290141000     2.604ns     0.001ns     2.603ns     2.604ns       0.000       0.000        36.3
14: sstring.initialized_later_10       489875000     1.206ns     0.002ns     1.205ns     1.208ns       0.000       0.000        10.3
14: bytes.initialized_later_100        106797000     8.209ns     0.000ns     8.208ns     8.210ns       1.000       0.000       143.3
14: sstring.initialized_later_100      118622000     7.262ns     0.002ns     7.257ns     7.266ns       1.000       0.000       123.3
14: bytes.initialized_later_1000        74471000     9.398ns     0.013ns     9.385ns     9.428ns       1.000       0.000       172.3
14: sstring.initialized_later_1000      89729000     7.098ns     0.001ns     7.097ns     7.100ns       1.000       0.000       123.3
14: bytes.initialized_later_10000       10988000    60.754ns     0.007ns    60.747ns    60.777ns       1.000       0.000       424.3
14: sstring.initialized_later_10000     26059000     7.074ns     0.002ns     7.072ns     7.077ns       1.000       0.000       123.3
14: bytes.initialized_later_100000       1095000   615.564ns     0.075ns   615.475ns   618.526ns       1.000       0.000      3389.3
14: sstring.initialized_later_100000     3005000    28.151ns     0.082ns    28.070ns    28.300ns       1.000       0.000       630.3
14: bytes.initialized_zero_0           463474000     1.361ns     0.000ns     1.361ns     1.362ns       0.000       0.000        17.3
14: sstring.initialized_zero_0         170729000     5.054ns     0.000ns     5.053ns     5.056ns       0.000       0.000        29.3
14: bytes.initialized_zero_10          306899000     2.421ns     0.000ns     2.420ns     2.421ns       0.000       0.000        36.3
14: sstring.initialized_zero_10        169572000     5.056ns     0.001ns     5.055ns     5.059ns       0.000       0.000        29.3
14: bytes.initialized_zero_100         102871000     8.575ns     0.000ns     8.572ns     8.576ns       1.000       0.000       143.3
14: sstring.initialized_zero_100       104785000     8.371ns     0.004ns     8.364ns     8.375ns       1.000       0.000       138.3
14: bytes.initialized_zero_1000         74622000     9.385ns     0.001ns     9.383ns     9.387ns       1.000       0.000       172.3
14: sstring.initialized_zero_1000       74918000     9.383ns     0.001ns     9.374ns     9.388ns       1.000       0.000       167.3
14: bytes.initialized_zero_10000        10982000    60.763ns     0.003ns    60.756ns    60.772ns       1.000       0.000       424.3
14: sstring.initialized_zero_10000      10971000    60.759ns     0.002ns    60.748ns    60.766ns       1.000       0.000       419.3
14: bytes.initialized_zero_100000        1095000   615.436ns     0.028ns   615.382ns   615.481ns       1.000       0.000      3389.3
14: sstring.initialized_zero_100000      1095000   616.402ns     0.088ns   616.314ns   616.564ns       1.000       0.000      3383.3
14: bytes.append_0                     276549000     2.812ns     0.000ns     2.812ns     2.814ns       0.000       0.000        57.3
14: sstring.append_0                    83581000    11.155ns     0.000ns    11.155ns    11.166ns       0.000       0.000        87.3
14: bytes.append_10                     94539000     9.884ns     0.001ns     9.879ns     9.885ns       0.000       0.000       127.3
14: sstring.append_10                   71173000    13.316ns     0.001ns    13.310ns    13.316ns       0.000       0.000       121.3
14: bytes.append_100                    48183000    19.572ns     0.007ns    19.564ns    19.585ns       2.000       0.000       410.3
14: sstring.append_100                  49985000    18.837ns     0.002ns    18.830ns    18.842ns       2.000       0.000       336.3
14: bytes.append_1000                   18448000    51.085ns     0.064ns    50.851ns    51.148ns       2.000       0.000       724.3
14: sstring.append_1000                 21044000    43.598ns     0.051ns    43.546ns    43.729ns       2.000       0.000       492.3
14: bytes.append_10000                   2714000   337.452ns     0.048ns   337.404ns   337.749ns       2.000       0.000      3898.3
14: sstring.append_10000                 3532000   253.486ns     0.004ns   253.457ns   253.490ns       2.000       0.000      1868.3
14: bytes.append_100000                   261000     3.533us     0.165ns     3.533us     3.533us       2.000       0.000     33147.3
14: sstring.append_100000                 316000     2.874us     0.108ns     2.873us     2.874us       2.000       0.000     13213.3
14: Test Exit code 0
1/1 Test #14: bytes_bench_rpbench ..............   Passed  216.32 sec

dotnwat avatar Sep 15 '24 18:09 dotnwat

As a side note, it looks like std::string now (as of C++23) has the ability to be created but uninitialized with resize_and_overwrite. It's been around in libc++ for a while.

rockwotj avatar Sep 16 '24 00:09 rockwotj

As a side note, it looks like std::string now (as of C++23) has the ability to be created but uninitialized with resize_and_overwrite. It's been around in libc++ for a while.

yeh. that interface was rejected for std::vector and i think also std::inplace_vector for some reason.

dotnwat avatar Sep 16 '24 00:09 dotnwat

Yeah bypassing default constructors and such can be tricky. It's easier to explain with raw bytes.

rockwotj avatar Sep 16 '24 00:09 rockwotj

/microbench

StephanDollberg avatar Sep 16 '24 08:09 StephanDollberg

Instruction and alloc count diffs from other microbenches:

Performance changes detected in 9 tests
storage_rpbench_reducer_bench.compaction_key_reducer_test: inst -> +2.39%
heartbeat_bench_rpbench_fixture.test_old_hb_reply: inst -> +11.18%
heartbeat_bench_rpbench_fixture.test_old_hb_reply: allocs -> +0.00%
heartbeat_bench_rpbench_fixture.test_old_hb_request: inst -> +11.10%
crypto_bench_rpbench_openssl_perf_test.md5_1k: inst -> +0.14%
crypto_bench_rpbench_openssl_perf_test.sha256_1k: inst -> +0.04%
crypto_bench_rpbench_openssl_perf_test.sha512_1k: inst -> +0.06%
crypto_bench_fips_rpbench_openssl_perf_test.md5_1k: inst -> +0.14%
crypto_bench_fips_rpbench_openssl_perf_test.sha256_1k: inst -> +0.04%
crypto_bench_fips_rpbench_openssl_perf_test.sha512_1k: inst -> +0.06%

StephanDollberg avatar Sep 16 '24 12:09 StephanDollberg

Instruction and alloc count diffs from other microbenches:

that seems unsurprising with abseil's inlined vector. what's the threshold for concern?

dotnwat avatar Sep 16 '24 15:09 dotnwat

There is no general policy, will always have to go on a case by case basis.

This looks fine to me as well given the only major change is in the old heartbeats which shouldn't have any major usage anymore.

StephanDollberg avatar Sep 16 '24 15:09 StephanDollberg

@rockwotj wrote:

As a side note, it looks like std::string now (as of C++23) has the ability to be created but uninitialized with resize_and_overwrite. It's been around in libc++ for a while.

Well it doesn't really allow you to do the "created but uninitialized" thing, I don't think:

If any of the following conditions is satisfied, the behavior is undefined:

...

  • Any character in range [p, p + r) has an indeterminate value.

So it's saying it's UB to (for example), pass in an op which simply returns count but does not write the corresponding chars, which is how you'd emulate allocate-but-not-init.

Still, even if you adhere to this rule this interface can replace some of the reasons you want uninit storage in the first place.

Of course, breaking this rule seems quite unlikely to be punished in practice.

travisdowns avatar Sep 16 '24 19:09 travisdowns

Overall the code LGTM, just one thing that the new struct is 8 bytes bigger and IDK if that's OK.

yeh, also noticed this. i don't think it matters, and i'm not sure where the existing bytes_inline_size comes from. it was probably an educated guess.

tbh i'd like to teach iobuf SSO and get rid of bytes type entirely.

dotnwat avatar Sep 20 '24 18:09 dotnwat

thanks for the review @rockwotj. i have a merge conflict, and a few things to cleanup. should be able to get this posted again next week.

dotnwat avatar Sep 20 '24 18:09 dotnwat

I agree that the 8 bytes increase is probably OK.

travisdowns avatar Sep 21 '24 12:09 travisdowns

⚠️ The sha of the head commit of this PR conflicts with #23422. Mergify cannot evaluate rules on this PR. ⚠️

mergify[bot] avatar Sep 22 '24 18:09 mergify[bot]