envoy subset lb: added multiple keys or multiple selectors support for single host per subset mode

Signed-off-by: wbpcode [email protected]

Commit Message: subset lb: added multiple keys or multiple selectors support for single host per subset mode Additional Description:

Subset lb is great and powerful lb. But in the previous implementation of subset lb, the single host per subset mode is handled separately and can only be used for configurations that contain single selector and single key. This PR does some optimizations to make the single host mode can be used in more scenes.

Added multiple keys or multiple selectors support for single host per subset mode.
Use almost completely same codes for single host mode and normal mode to make the code simpler and cleaner.
Evaluate all hosts for every update to make sure that we can add hosts to different subsets correctly in all cases (metadata only update, hosts added/removed, hybrid update).
Eliminate unnecessary metadata match (In the previous implementation, all the hosts will match against to metadata for every subset update and an update of endpoint set will result in all the subsets being updated.)

Risk Level: Mid. Core lb update. Testing: Unit tests ware added. Docs Changes: n/a. Release Notes: Added. Platform Specific Features: n/a.

Aug 10 '22 15:08 wbpcode

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/). envoyproxy/api-shepherds assignee is @adisuissa CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

:cat:

Caused by: https://github.com/envoyproxy/envoy/pull/22644 was opened by wbpcode.

see: more, trace.

Aug 10 '22 15:08 repokitteh-read-only[bot]

Benchmark result after this PR:

-----------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
-----------------------------------------------------------------------------------
benchmarkSubsetLoadBalancerCreate/0/50         1.88 ms         1.88 ms          352
benchmarkSubsetLoadBalancerCreate/1/50         1.38 ms         1.38 ms          496
benchmarkSubsetLoadBalancerCreate/0/64         2.16 ms         2.16 ms          327
benchmarkSubsetLoadBalancerCreate/1/64         1.45 ms         1.45 ms          479
benchmarkSubsetLoadBalancerCreate/0/512        14.7 ms         14.7 ms           48
benchmarkSubsetLoadBalancerCreate/1/512        4.15 ms         4.15 ms          162
benchmarkSubsetLoadBalancerCreate/0/2500        156 ms          156 ms            4
benchmarkSubsetLoadBalancerCreate/1/2500       17.5 ms         17.5 ms           43
benchmarkSubsetLoadBalancerUpdate/0/50        0.662 ms        0.662 ms         1150
benchmarkSubsetLoadBalancerUpdate/1/50        0.184 ms        0.184 ms         3694
benchmarkSubsetLoadBalancerUpdate/0/64        0.758 ms        0.758 ms          902
benchmarkSubsetLoadBalancerUpdate/1/64        0.218 ms        0.218 ms         2933
benchmarkSubsetLoadBalancerUpdate/0/512        13.5 ms         13.5 ms           51
benchmarkSubsetLoadBalancerUpdate/1/512        1.90 ms         1.90 ms          376
benchmarkSubsetLoadBalancerUpdate/0/2500        224 ms          224 ms            3
benchmarkSubsetLoadBalancerUpdate/1/2500      10.00 ms        10.00 ms           73

Benchmark result before this PR:

-----------------------------------------------------------------------------------
Benchmark                                         Time             CPU   Iterations
-----------------------------------------------------------------------------------
benchmarkSubsetLoadBalancerCreate/0/50         2.09 ms         2.09 ms          344
benchmarkSubsetLoadBalancerCreate/1/50         1.34 ms         1.34 ms          511
benchmarkSubsetLoadBalancerCreate/0/64         2.52 ms         2.51 ms          290
benchmarkSubsetLoadBalancerCreate/1/64         1.44 ms         1.44 ms          499
benchmarkSubsetLoadBalancerCreate/0/512        45.8 ms         45.8 ms           17
benchmarkSubsetLoadBalancerCreate/1/512        3.86 ms         3.86 ms          183
benchmarkSubsetLoadBalancerCreate/0/2500        871 ms          871 ms            1
benchmarkSubsetLoadBalancerCreate/1/2500       14.9 ms         14.9 ms           47
benchmarkSubsetLoadBalancerUpdate/0/50         1.13 ms         1.13 ms          628
benchmarkSubsetLoadBalancerUpdate/1/50        0.135 ms        0.135 ms         5385
benchmarkSubsetLoadBalancerUpdate/0/64         1.70 ms         1.70 ms          379
benchmarkSubsetLoadBalancerUpdate/1/64        0.166 ms        0.166 ms         4265
benchmarkSubsetLoadBalancerUpdate/0/512        78.8 ms         78.8 ms            9
benchmarkSubsetLoadBalancerUpdate/1/512        1.48 ms         1.48 ms          470
benchmarkSubsetLoadBalancerUpdate/0/2500       1716 ms         1716 ms            1
benchmarkSubsetLoadBalancerUpdate/1/2500       7.61 ms         7.61 ms           92

This performance got a minor worse in the single host mode and got a big improvement in normal mode.

Considering that for the multiple priorities host set, the new implementation needn't evaluate all priorities' hosts for every update and may get a better performance in single host mode, so I think this result is just ok.

Aug 10 '22 15:08 wbpcode

cc @ggreenway

Aug 11 '22 03:08 wbpcode

/retest

Aug 11 '22 09:08 wbpcode

Retrying Azure Pipelines: Check envoy-presubmit didn't fail.

:cat:

Caused by: a https://github.com/envoyproxy/envoy/pull/22644#issuecomment-1211723962 was created by @wbpcode.

see: more, trace.

Aug 11 '22 09:08 repokitteh-read-only[bot]

I'm going to be out the rest of this week so re-assigning to @zuercher for review.

I think the performance degradation for the single mode is probably ok, but I'm curious what the benchmark shows if you expand it to higher numbers of hosts (I mostly want to see what the algorithmic complexity of the old and new looks like).

Aug 16 '22 14:08 ggreenway

I'm going to be out the rest of this week so re-assigning to @zuercher for review.

I think the performance degradation for the single mode is probably ok, but I'm curious what the benchmark shows if you expand it to higher numbers of hosts (I mostly want to see what the algorithmic complexity of the old and new looks like).

I can run some more test to ensure it.

The algorithmic complexity should no big change except in some corner cases (e.g., many priorities and few hosts for every priority). 🤔

Aug 16 '22 15:08 wbpcode

friendly ping @zuercher

Aug 23 '22 08:08 wbpcode

More hosts benchmark test result. cc @ggreenway

Benchmark result after this PR:

------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
------------------------------------------------------------------------------------
benchmarkSubsetLoadBalancerCreate/0/50          1.29 ms         1.29 ms          502
benchmarkSubsetLoadBalancerCreate/1/50         0.967 ms        0.967 ms          752
benchmarkSubsetLoadBalancerCreate/0/64          1.44 ms         1.44 ms          492
benchmarkSubsetLoadBalancerCreate/1/64          1.04 ms         1.04 ms          695
benchmarkSubsetLoadBalancerCreate/0/512         10.9 ms         10.9 ms           64
benchmarkSubsetLoadBalancerCreate/1/512         3.08 ms         3.08 ms          225
benchmarkSubsetLoadBalancerCreate/0/4096         374 ms          374 ms            2
benchmarkSubsetLoadBalancerCreate/1/4096        20.1 ms         20.1 ms           35
benchmarkSubsetLoadBalancerCreate/0/10000       2373 ms         2373 ms            1
benchmarkSubsetLoadBalancerCreate/1/10000       50.0 ms         50.0 ms           12
benchmarkSubsetLoadBalancerUpdate/0/50         0.419 ms        0.419 ms         1614
benchmarkSubsetLoadBalancerUpdate/1/50         0.138 ms        0.138 ms         4984
benchmarkSubsetLoadBalancerUpdate/0/64         0.567 ms        0.567 ms         1207
benchmarkSubsetLoadBalancerUpdate/1/64         0.179 ms        0.179 ms         3935
benchmarkSubsetLoadBalancerUpdate/0/512         12.2 ms         12.2 ms           58
benchmarkSubsetLoadBalancerUpdate/1/512         1.60 ms         1.60 ms          451
benchmarkSubsetLoadBalancerUpdate/0/4096         583 ms          583 ms            1
benchmarkSubsetLoadBalancerUpdate/1/4096        13.5 ms         13.5 ms           52
benchmarkSubsetLoadBalancerUpdate/0/10000       3330 ms         3330 ms            1
benchmarkSubsetLoadBalancerUpdate/1/10000       34.1 ms         34.1 ms           21

Benchmark result before this PR:

------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
------------------------------------------------------------------------------------
benchmarkSubsetLoadBalancerCreate/0/50          1.54 ms         1.54 ms          460
benchmarkSubsetLoadBalancerCreate/1/50         0.952 ms        0.952 ms          714
benchmarkSubsetLoadBalancerCreate/0/64          1.86 ms         1.86 ms          362
benchmarkSubsetLoadBalancerCreate/1/64          1.01 ms         1.01 ms          692
benchmarkSubsetLoadBalancerCreate/0/512         46.6 ms         46.6 ms           16
benchmarkSubsetLoadBalancerCreate/1/512         3.14 ms         3.14 ms          235
benchmarkSubsetLoadBalancerCreate/0/4096        2070 ms         2070 ms            1
benchmarkSubsetLoadBalancerCreate/1/4096        19.9 ms         19.9 ms           36
benchmarkSubsetLoadBalancerCreate/0/10000      11388 ms        11388 ms            1
benchmarkSubsetLoadBalancerCreate/1/10000       49.0 ms         49.0 ms           13
benchmarkSubsetLoadBalancerUpdate/0/50         0.961 ms        0.961 ms          728
benchmarkSubsetLoadBalancerUpdate/1/50         0.106 ms        0.106 ms         6478
benchmarkSubsetLoadBalancerUpdate/0/64          1.43 ms         1.43 ms          506
benchmarkSubsetLoadBalancerUpdate/1/64         0.135 ms        0.135 ms         4246
benchmarkSubsetLoadBalancerUpdate/0/512         74.5 ms         74.5 ms           11
benchmarkSubsetLoadBalancerUpdate/1/512         1.26 ms         1.26 ms          558
benchmarkSubsetLoadBalancerUpdate/0/4096        3826 ms         3826 ms            1
benchmarkSubsetLoadBalancerUpdate/1/4096        11.2 ms         11.2 ms           66
benchmarkSubsetLoadBalancerUpdate/0/10000      21114 ms        21114 ms            1
benchmarkSubsetLoadBalancerUpdate/1/10000       27.4 ms         27.4 ms           26

Aug 25 '22 03:08 wbpcode

main merged and conflict resolved. cc @zuercher

Aug 25 '22 13:08 wbpcode

friendly ping @zuercher could you give a LGTM again. Then we can merge this PR. 😸

Aug 29 '22 10:08 wbpcode

envoy envoy copied to clipboard

subset lb: added multiple keys or multiple selectors support for single host per subset mode

envoy
envoy copied to clipboard