neon icon indicating copy to clipboard operation
neon copied to clipboard

async walredo & remove `wal_redo_timeout`

Open problame opened this issue 1 year ago • 3 comments

Before this PR, the nix::poll::poll call would stall the executor.

While at it, remove the wal_redo_timeout. Rationale:

  1. It's completely untested (e.g., we don't even kill & restart the walredo process on restart (we should also review other error handling code paths for this))
  2. Given that we're now using async, we no longer need to worry about stalling the executor. Before, we had to, because otherwise a malicious walredo process would have otherwise brought down the executor thread completely.

TODO:

  • [ ] review for cancellation safety, request_redo promises it is cancel safe and the bench_walredo relies on it as of #6743

fixes #6628 refs https://github.com/neondatabase/neon/issues/2975

problame avatar Jan 31 '24 11:01 problame

2760 tests run: 2642 passed, 0 failed, 118 skipped (full report)


Code coverage* (full report)

  • functions: 28.0% (6437 of 23015 functions)
  • lines: 46.5% (45063 of 96823 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
cecc9bc8e7a8aeb5914bc3cf662711323827e1af at 2024-04-15T19:58:53.464Z :recycle:

github-actions[bot] avatar Jan 31 '24 13:01 github-actions[bot]

Good catch with the cancellation problem.

Pushed changes, please have another look.

problame avatar Jan 31 '24 16:01 problame

Joonas has found performance of this PR to be much worse, DM : https://neondb.slack.com/archives/D049K7HJ9JM/p1707009731704869

short/short/1           time:   [11.417 µs 11.635 µs 11.846 µs]
Found 18 outliers among 100 measurements (18.00%)
  4 (4.00%) low severe
  5 (5.00%) low mild
  9 (9.00%) high mild
short/short/2           time:   [33.945 µs 34.371 µs 34.795 µs]
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  4 (4.00%) low mild
  2 (2.00%) high mild
short/short/4           time:   [55.483 µs 55.901 µs 56.294 µs]
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) low severe
  1 (1.00%) low mild
  5 (5.00%) high mild
short/short/8           time:   [126.11 µs 126.67 µs 127.22 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
short/short/16          time:   [257.16 µs 258.27 µs 259.40 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

medium/medium/1         time:   [76.750 µs 78.309 µs 79.861 µs]
medium/medium/2         time:   [169.51 µs 170.02 µs 170.61 µs]
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  7 (7.00%) high severe
medium/medium/4         time:   [305.85 µs 308.94 µs 312.44 µs]
Found 10 outliers among 100 measurements (10.00%)
  8 (8.00%) high mild
  2 (2.00%) high severe
medium/medium/8         time:   [555.67 µs 560.14 µs 564.97 µs]
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
medium/medium/16        time:   [1.1525 ms 1.1644 ms 1.1773 ms]
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe

these are very much inline with what I remember, but yeah perhaps the single thread version used to be below 10us, maybe 7us. my shmempipe got to 2-3us for the short, which I think is the limit I guess. that was with probably way too much spinning :) (edited)

problame avatar Feb 05 '24 11:02 problame

I pushed https://github.com/neondatabase/neon/pull/6548/commits/cca66e5e82ba8657a03e1a05d38c180fd5c3d7e7 plus https://github.com/neondatabase/neon/pull/6548/commits/67a7abc7cf4a377bbe0d3283a498f6851538b1f3 which make the walredo process implementation configurable.

There's too much code duplication for my taste, but, it was the easiest way to have the implementations side by side.

The runtime-override is a nice bonus & makes benchmarking simpler, so, I would want to keep it until we commit to clearly one vs the other walredo variant.

Then again, I'm not sure whether we actually need the side-by-side implementations. My recent benchmarking suggests that async walredo is better at throughput and marginally worse at tail latencies.

So, I'll possibly revert these two commits.

problame avatar Apr 03 '24 19:04 problame

@koivunej the PR changed significantly since your last review, re-requesting it. I think we have established context in various meetings / calls in the last couple of weeks.

problame avatar Apr 12 '24 20:04 problame

Addressed your review comments, see latest batch of pushes.

I wonder if you explicitly reviewed the diff of process_sync.rs vs process_async.rs.

I just did, and noticed one forgotten difference that might impact the performance evaluation: the process_async.rs has #[instrument(skip_all, level = tracing::Level::DEBUG), the process_sync.rs doesn't.

I'll do a quick test with bench_walredo.rs to see whether there's a major difference, but, I won't do a full evaluation run again.

problame avatar Apr 15 '24 13:04 problame

Ack, going through the diff now.

koivunej avatar Apr 15 '24 13:04 koivunej