async walredo & remove `wal_redo_timeout`
Before this PR, the `nix::poll::poll` call would stall the executor.

While at it, remove `wal_redo_timeout`. Rationale:
- It's completely untested (e.g., we don't even kill & restart the walredo process when the timeout is hit; we should also review other error handling code paths for this).
- Now that we're using `async`, we no longer need to worry about stalling the executor. Before, we had to, because a malicious walredo process could otherwise have brought down the executor thread completely.
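For illustration, a minimal sketch of the shape this buys us, with hypothetical names and a placeholder process invocation (not the actual pageserver code), assuming tokio's process and I/O types:

```rust
use std::process::Stdio;

use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::process::Command;

// Sketch only: both the write and the read are awaited, so a slow or stuck
// walredo child parks just this task at the .await points instead of tying up
// an executor thread the way the blocking nix::poll::poll call did.
async fn redo_roundtrip(request: &[u8], response: &mut [u8]) -> std::io::Result<()> {
    let mut child = Command::new("postgres") // placeholder for the walredo child
        .arg("--wal-redo")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    child.stdin.as_mut().unwrap().write_all(request).await?;
    child.stdout.as_mut().unwrap().read_exact(response).await?;
    Ok(())
}
```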
TODO:
- [ ] review for cancellation safety: `request_redo` promises it is cancel-safe, and `bench_walredo` relies on that as of #6743 (see the sketch below)
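To make the cancellation concern concrete, a hypothetical sketch; the pipe type, buffer size, and signatures are stand-ins, not the real `request_redo`:

```rust
use std::time::Duration;

use tokio::io::{AsyncReadExt, AsyncWriteExt, DuplexStream};
use tokio::sync::Mutex;

// Stand-in for request_redo: write the request, then read the reply from a
// shared pipe that models the walredo process's stdin/stdout.
async fn request_redo(pipe: &Mutex<DuplexStream>, req: &[u8]) -> std::io::Result<[u8; 8192]> {
    let mut pipe = pipe.lock().await;
    pipe.write_all(req).await?;
    // If the caller drops this future right here (e.g. it lost a
    // tokio::select! against a timeout or shutdown), the request has been
    // written but the reply is never drained. Being cancel-safe means either
    // tolerating that (the next caller must not see this call's reply) or
    // tearing down and restarting the walredo process.
    let mut page = [0u8; 8192];
    pipe.read_exact(&mut page).await?;
    Ok(page)
}

// A caller that can cancel the redo future mid-flight:
async fn redo_with_deadline(pipe: &Mutex<DuplexStream>, req: &[u8]) -> Option<[u8; 8192]> {
    tokio::time::timeout(Duration::from_secs(10), request_redo(pipe, req))
        .await
        .ok()? // timed out: the future above was dropped wherever it was parked
        .ok()
}
```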
fixes #6628
refs https://github.com/neondatabase/neon/issues/2975
2760 tests run: 2642 passed, 0 failed, 118 skipped (full report)
Code coverage* (full report)
- functions: 28.0% (6437 of 23015 functions)
- lines: 46.5% (45063 of 96823 lines)
* collected from Rust tests only
cecc9bc8e7a8aeb5914bc3cf662711323827e1af at 2024-04-15T19:58:53.464Z :recycle:
Good catch with the cancellation problem.
Pushed changes, please have another look.
Joonas found the performance of this PR to be much worse; see DM: https://neondb.slack.com/archives/D049K7HJ9JM/p1707009731704869
```
short/short/1           time:   [11.417 µs 11.635 µs 11.846 µs]
Found 18 outliers among 100 measurements (18.00%)
  4 (4.00%) low severe
  5 (5.00%) low mild
  9 (9.00%) high mild
short/short/2           time:   [33.945 µs 34.371 µs 34.795 µs]
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  4 (4.00%) low mild
  2 (2.00%) high mild
short/short/4           time:   [55.483 µs 55.901 µs 56.294 µs]
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) low severe
  1 (1.00%) low mild
  5 (5.00%) high mild
short/short/8           time:   [126.11 µs 126.67 µs 127.22 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
short/short/16          time:   [257.16 µs 258.27 µs 259.40 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
medium/medium/1         time:   [76.750 µs 78.309 µs 79.861 µs]
medium/medium/2         time:   [169.51 µs 170.02 µs 170.61 µs]
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low mild
  7 (7.00%) high severe
medium/medium/4         time:   [305.85 µs 308.94 µs 312.44 µs]
Found 10 outliers among 100 measurements (10.00%)
  8 (8.00%) high mild
  2 (2.00%) high severe
medium/medium/8         time:   [555.67 µs 560.14 µs 564.97 µs]
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
medium/medium/16        time:   [1.1525 ms 1.1644 ms 1.1773 ms]
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe
```
These are very much in line with what I remember, but yeah, perhaps the single-thread version used to be below 10µs, maybe 7µs. My shmempipe got to 2-3µs for the short case, which I think is about the limit. That was with probably way too much spinning :)
I pushed https://github.com/neondatabase/neon/pull/6548/commits/cca66e5e82ba8657a03e1a05d38c180fd5c3d7e7 plus https://github.com/neondatabase/neon/pull/6548/commits/67a7abc7cf4a377bbe0d3283a498f6851538b1f3 which make the walredo process implementation configurable.
There's too much code duplication for my taste, but it was the easiest way to have the implementations side by side.
The runtime override is a nice bonus and makes benchmarking simpler, so I'd like to keep it until we clearly commit to one walredo variant over the other.
Then again, I'm not sure whether we actually need the side-by-side implementations. My recent benchmarking suggests that async walredo is better at throughput and marginally worse at tail latencies.
So, I'll possibly revert these two commits.
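For context, the "configurable" part is roughly this shape; a sketch with illustrative names, not the actual config keys:

```rust
// Sketch of a runtime-selectable process kind; the enum and string values are
// illustrative, not the actual pageserver config surface. A value like this,
// parsed from config (or a runtime override), picks between the process_sync
// and process_async implementations when a walredo process is launched.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum WalRedoProcessKind {
    Sync,  // existing blocking implementation (process_sync.rs)
    Async, // tokio-based implementation from this PR (process_async.rs)
}

impl std::str::FromStr for WalRedoProcessKind {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "sync" => Ok(Self::Sync),
            "async" => Ok(Self::Async),
            other => Err(format!("invalid walredo process kind: {other}")),
        }
    }
}
```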
@koivunej the PR has changed significantly since your last review, so I'm re-requesting it. I think we have established context in various meetings/calls over the last couple of weeks.
Addressed your review comments, see latest batch of pushes.
I wonder if you explicitly reviewed the diff of `process_sync.rs` vs `process_async.rs`.
I just did, and noticed one forgotten difference that might impact the performance evaluation: `process_async.rs` has `#[instrument(skip_all, level = tracing::Level::DEBUG)]`, while `process_sync.rs` doesn't.
I'll do a quick test with `bench_walredo.rs` to see whether there's a major difference, but I won't do a full evaluation run again.
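What that difference looks like, with illustrative function names (only the attribute placement mirrors the real code):

```rust
use tracing::instrument;

// The async variant carries the attribute quoted above, so every call goes
// through tracing's span machinery (at minimum a level/interest check), while
// the sync variant does not. At the microsecond scale bench_walredo measures,
// that alone could skew a comparison.
#[instrument(skip_all, level = tracing::Level::DEBUG)]
async fn apply_wal_records_async(/* records, page */) { /* ... */ }

fn apply_wal_records_sync(/* records, page */) { /* ... */ }
```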
Ack, going through the diff now.