Add multiple sources for inhibition rules

Open coleenquadros opened this issue 2 months ago • 9 comments

https://github.com/prometheus/alertmanager/issues/4504

benchmark

goos: darwin
goarch: arm64
pkg: github.com/prometheus/alertmanager/inhibit
cpu: Apple M1 Pro
                                                                  │ benchmark-inhibit-main.txt │               ms.txt                │
                                                                  │           sec/op           │    sec/op     vs base               │
Mutes/1_inhibition_rule,_1_inhibiting_alert-10                                    871.5n ±  4%   887.4n ± 26%        ~ (p=0.073 n=7)
Mutes/10_inhibition_rules,_1_inhibiting_alert-10                                  844.1n ±  2%   849.4n ±  5%        ~ (p=0.097 n=7)
Mutes/100_inhibition_rules,_1_inhibiting_alert-10                                 888.2n ±  1%   894.4n ±  2%   +0.70% (p=0.019 n=7)
Mutes/1000_inhibition_rules,_1_inhibiting_alert-10                                1.006µ ±  3%   1.012µ ±  1%        ~ (p=0.223 n=7)
Mutes/10000_inhibition_rules,_1_inhibiting_alert-10                              1229.0n ± 24%   989.0n ±  5%  -19.53% (p=0.002 n=7)
Mutes/1_inhibition_rule,_10_inhibiting_alerts-10                                  929.0n ±  3%   922.3n ±  2%        ~ (p=0.053 n=7)
Mutes/1_inhibition_rule,_100_inhibiting_alerts-10                                 926.1n ±  2%   943.5n ±  1%   +1.88% (p=0.001 n=7)
Mutes/1_inhibition_rule,_1000_inhibiting_alerts-10                                932.5n ±  2%   948.2n ±  3%   +1.68% (p=0.007 n=7)
Mutes/1_inhibition_rule,_10000_inhibiting_alerts-10                               919.9n ±  2%   917.3n ±  4%        ~ (p=0.535 n=7)
Mutes/100_inhibition_rules,_1000_inhibiting_alerts-10                             888.1n ±  1%   892.4n ±  3%        ~ (p=0.620 n=7)
Mutes/10_inhibition_rules,_last_rule_matches-10                                   2.502µ ±  1%   2.592µ ±  5%   +3.60% (p=0.001 n=7)
Mutes/100_inhibition_rules,_last_rule_matches-10                                  18.32µ ±  2%   19.05µ ±  1%   +3.98% (p=0.001 n=7)
Mutes/1000_inhibition_rules,_last_rule_matches-10                                 175.9µ ±  1%   182.3µ ±  5%   +3.67% (p=0.001 n=7)
Mutes/10000_inhibition_rules,_last_rule_matches-10                                1.793m ±  2%   1.929m ±  4%   +7.58% (p=0.001 n=7)
Mutes/10_inhibition_rules,_5_sources,_100_inhibiting_alerts-10                                   2.278µ ±  1%
Mutes/100_inhibition_rules,_10_sources,_1000_inhibiting_alerts-10                                3.593µ ±  2%
Mutes/1000_inhibition_rules,_20_sources,_100_inhibiting_alerts-10                                7.477µ ± 14%
geomean                                                                           3.103µ         3.243µ         +0.24%

                                                                  │ benchmark-inhibit-main.txt │                ms.txt                │
                                                                  │            B/op            │     B/op      vs base                │
Mutes/1_inhibition_rule,_1_inhibiting_alert-10                                      488.0 ± 0%     488.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/10_inhibition_rules,_1_inhibiting_alert-10                                    488.0 ± 0%     488.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/100_inhibition_rules,_1_inhibiting_alert-10                                   488.0 ± 0%     488.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1000_inhibition_rules,_1_inhibiting_alert-10                                  488.0 ± 0%     488.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/10000_inhibition_rules,_1_inhibiting_alert-10                                 489.0 ± 0%     488.0 ± 0%  -0.20% (p=0.001 n=7)
Mutes/1_inhibition_rule,_10_inhibiting_alerts-10                                    488.0 ± 0%     488.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1_inhibition_rule,_100_inhibiting_alerts-10                                   488.0 ± 0%     488.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1_inhibition_rule,_1000_inhibiting_alerts-10                                  488.0 ± 0%     488.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1_inhibition_rule,_10000_inhibiting_alerts-10                                 488.0 ± 0%     488.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/100_inhibition_rules,_1000_inhibiting_alerts-10                               488.0 ± 0%     488.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/10_inhibition_rules,_last_rule_matches-10                                     472.0 ± 0%     472.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/100_inhibition_rules,_last_rule_matches-10                                    472.0 ± 0%     472.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1000_inhibition_rules,_last_rule_matches-10                                   472.0 ± 0%     472.0 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/10000_inhibition_rules,_last_rule_matches-10                                  475.0 ± 0%     476.0 ± 0%       ~ (p=0.103 n=7)
Mutes/10_inhibition_rules,_5_sources,_100_inhibiting_alerts-10                                   1.070Ki ± 0%
Mutes/100_inhibition_rules,_10_sources,_1000_inhibiting_alerts-10                                1.789Ki ± 0%
Mutes/1000_inhibition_rules,_20_sources,_100_inhibiting_alerts-10                                4.845Ki ± 1%

coleenquadros avatar Nov 07 '25 16:11 coleenquadros

This sounds like it may need some documentation changes as well, so people know it exists and how to use it?

ultrotter avatar Nov 07 '25 16:11 ultrotter

I closed #4504 since the same result can be achieved using regex matching. Do you think this would provide any advantage over regex matching?
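To make the regex alternative concrete, here is a minimal sketch of how one alternation pattern can select any of several source alerts by name. It uses only Go's standard `regexp` package, not Alertmanager's matcher types. The helper `matchesSource` and the alert names are hypothetical, and the pattern is anchored at both ends on the assumption that regex matchers require a full match. This only covers "any one of these values on a single label"; whether the multiple-sources proposal needs more than that is not stated in this thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesSource reports whether value fully matches pattern.
// Hypothetical helper: the pattern is wrapped in ^(?:...)$ on the
// assumption that regex matchers must match the whole label value.
func matchesSource(pattern, value string) bool {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	return re.MatchString(value)
}

func main() {
	// Hypothetical alert names used purely for illustration.
	pattern := "NodeDown|ClusterDown"
	for _, name := range []string{"NodeDown", "ClusterDown", "NodeDownSoon", "DiskFull"} {
		fmt.Printf("%s: %v\n", name, matchesSource(pattern, name))
	}
}
```

Because of the anchoring, `NodeDownSoon` does not match even though it starts with `NodeDown`.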

siavashs avatar Nov 10 '25 17:11 siavashs

Could you run the benchmarks and also add new benchmarks with multiple source matchers?

siavashs avatar Nov 10 '25 17:11 siavashs

goos: darwin
goarch: arm64
pkg: github.com/prometheus/alertmanager/inhibit
cpu: Apple M1 Pro
BenchmarkMutes
BenchmarkMutes/1_inhibition_rule,_1_inhibiting_alert
BenchmarkMutes/1_inhibition_rule,_1_inhibiting_alert-10         	 1399873	       854.7 ns/op
BenchmarkMutes/10_inhibition_rules,_1_inhibiting_alert
BenchmarkMutes/10_inhibition_rules,_1_inhibiting_alert-10       	 1375345	       864.3 ns/op
BenchmarkMutes/100_inhibition_rules,_1_inhibiting_alert
BenchmarkMutes/100_inhibition_rules,_1_inhibiting_alert-10      	 1303854	       898.8 ns/op
BenchmarkMutes/1000_inhibition_rules,_1_inhibiting_alert
BenchmarkMutes/1000_inhibition_rules,_1_inhibiting_alert-10     	 1160338	      1011 ns/op
BenchmarkMutes/10000_inhibition_rules,_1_inhibiting_alert
BenchmarkMutes/10000_inhibition_rules,_1_inhibiting_alert-10    	 1140762	      1034 ns/op
BenchmarkMutes/1_inhibition_rule,_10_inhibiting_alerts
BenchmarkMutes/1_inhibition_rule,_10_inhibiting_alerts-10       	 1267029	       965.7 ns/op
BenchmarkMutes/1_inhibition_rule,_100_inhibiting_alerts
BenchmarkMutes/1_inhibition_rule,_100_inhibiting_alerts-10      	 1208812	       980.3 ns/op
BenchmarkMutes/1_inhibition_rule,_1000_inhibiting_alerts
BenchmarkMutes/1_inhibition_rule,_1000_inhibiting_alerts-10     	 1252640	       946.8 ns/op
BenchmarkMutes/1_inhibition_rule,_10000_inhibiting_alerts
BenchmarkMutes/1_inhibition_rule,_10000_inhibiting_alerts-10    	 1275914	       929.5 ns/op
BenchmarkMutes/100_inhibition_rules,_1000_inhibiting_alerts
BenchmarkMutes/100_inhibition_rules,_1000_inhibiting_alerts-10  	 1381504	       866.2 ns/op
BenchmarkMutes/10_inhibition_rules,_last_rule_matches
BenchmarkMutes/10_inhibition_rules,_last_rule_matches-10        	  435414	      2673 ns/op
BenchmarkMutes/100_inhibition_rules,_last_rule_matches
BenchmarkMutes/100_inhibition_rules,_last_rule_matches-10       	   63754	     19681 ns/op
BenchmarkMutes/1000_inhibition_rules,_last_rule_matches
BenchmarkMutes/1000_inhibition_rules,_last_rule_matches-10      	    6380	    183355 ns/op
BenchmarkMutes/10000_inhibition_rules,_last_rule_matches
BenchmarkMutes/10000_inhibition_rules,_last_rule_matches-10     	     570	   1886689 ns/op
BenchmarkMutes/10_inhibition_rules,_5_sources,_100_inhibiting_alerts
BenchmarkMutes/10_inhibition_rules,_5_sources,_100_inhibiting_alerts-10         	  520171	      2277 ns/op
BenchmarkMutes/100_inhibition_rules,_10_sources,_1000_inhibiting_alerts
BenchmarkMutes/100_inhibition_rules,_10_sources,_1000_inhibiting_alerts-10      	  335876	      3696 ns/op
BenchmarkMutes/1000_inhibition_rules,_20_sources,_100_inhibiting_alerts
BenchmarkMutes/1000_inhibition_rules,_20_sources,_100_inhibiting_alerts-10      	  193414	      7105 ns/op
PASS

coleenquadros avatar Nov 11 '25 11:11 coleenquadros

Could you run the benchmarks before and after, and then submit the output of benchstat instead, please? Otherwise it's a bit hard to compare...

ultrotter avatar Nov 11 '25 12:11 ultrotter

@ultrotter can you help me understand what you mean by before and after? The multiple sources feature is being implemented for the first time, so I am not sure what we need to compare.

coleenquadros avatar Nov 11 '25 12:11 coleenquadros

@coleenquadros Please follow these steps:

  1. Run benchmarks on your branch: go test -bench=. -run='^$' -count 7 -benchmem ./inhibit/ | tee benchmark-inhibit-multiple-source-matchers.txt
  2. Run benchmarks on main branch: go test -bench=. -run='^$' -count 7 -benchmem ./inhibit/ | tee benchmark-inhibit-main.txt
  3. Finally compare the benchmarks: benchstat benchmark-inhibit-main.txt benchmark-inhibit-multiple-source-matchers.txt
  4. Post the output in a code block in PR description.

siavashs avatar Nov 11 '25 12:11 siavashs

The idea is to compare the benchmarks run without the patch against the benchmarks run with it applied, to make sure the existing use cases and code paths are not negatively affected by the feature as implemented.

So something like:

git checkout main
git pull
go test -bench=. -benchmem -run='^$' -benchtime=2s -count=5 ./inhibit/... | tee 0.txt
git checkout YOURBRANCH
go test -bench=. -benchmem -run='^$' -benchtime=2s -count=5 ./inhibit/... | tee 1.txt

benchstat 0.txt 1.txt > 0-1.txt # can be installed with go install golang.org/x/perf/cmd/benchstat@latest

Thanks!
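For reading the resulting `vs base` column, a rough sketch of the arithmetic may help: the delta is the relative change of the new value against the base, and a `~` means the difference was not judged statistically significant. The function names below are hypothetical, and the 0.05 threshold is an assumption about the significance level in use. This is not benchstat's implementation, which works on the raw samples rather than the rounded values shown in the table.

```go
package main

import "fmt"

// relativeChange returns the change from base to next as a percentage of base.
func relativeChange(base, next float64) float64 {
	return (next - base) / base * 100
}

// deltaColumn is a hypothetical helper that renders a "vs base" style cell:
// "~" when p is not below alpha, otherwise the signed percentage change.
func deltaColumn(base, next, p, alpha float64) string {
	if p >= alpha {
		return "~"
	}
	return fmt.Sprintf("%+.2f%%", relativeChange(base, next))
}

func main() {
	const alpha = 0.05 // assumed significance threshold

	// Values taken from the sec/op table posted later in this thread.
	fmt.Println(deltaColumn(2.486, 2.547, 0.001, alpha)) // 10 rules, last rule matches
	fmt.Println(deltaColumn(897.7, 914.7, 0.007, alpha)) // 1 rule, 10000 inhibiting alerts
	fmt.Println(deltaColumn(837.3, 844.8, 0.383, alpha)) // 1 rule, 1 inhibiting alert
}
```

Recomputing from the rounded table values can differ from the reported delta in the last digit. For example, 18.44µ to 18.77µ works out to about +1.79% here, while the table reports +1.80%.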

ultrotter avatar Nov 11 '25 12:11 ultrotter

goos: darwin
goarch: arm64
pkg: github.com/prometheus/alertmanager/inhibit
cpu: Apple M1 Pro
                                                                  │ benchmark-inhibit-main.txt │ benchmark-inhibit-multiple-source-matchers.txt │
                                                                  │           sec/op           │          sec/op           vs base              │
Mutes/1_inhibition_rule,_1_inhibiting_alert-10                                    837.3n ±  2%               844.8n ±  2%       ~ (p=0.383 n=7)
Mutes/10_inhibition_rules,_1_inhibiting_alert-10                                  843.7n ±  4%               840.3n ±  2%       ~ (p=0.073 n=7)
Mutes/100_inhibition_rules,_1_inhibiting_alert-10                                 886.9n ±  1%               903.8n ±  7%       ~ (p=0.165 n=7)
Mutes/1000_inhibition_rules,_1_inhibiting_alert-10                                1.013µ ±  1%               1.008µ ±  1%       ~ (p=0.415 n=7)
Mutes/10000_inhibition_rules,_1_inhibiting_alert-10                              1030.0n ± 25%               987.1n ±  2%  -4.17% (p=0.001 n=7)
Mutes/1_inhibition_rule,_10_inhibiting_alerts-10                                  922.1n ±  1%               920.4n ±  2%       ~ (p=0.620 n=7)
Mutes/1_inhibition_rule,_100_inhibiting_alerts-10                                 922.7n ±  5%               918.8n ±  2%  -0.42% (p=0.016 n=7)
Mutes/1_inhibition_rule,_1000_inhibiting_alerts-10                                948.9n ±  2%               929.4n ±  2%       ~ (p=0.364 n=7)
Mutes/1_inhibition_rule,_10000_inhibiting_alerts-10                               897.7n ±  1%               914.7n ±  2%  +1.89% (p=0.007 n=7)
Mutes/100_inhibition_rules,_1000_inhibiting_alerts-10                             880.0n ±  5%               891.3n ±  2%       ~ (p=0.128 n=7)
Mutes/10_inhibition_rules,_last_rule_matches-10                                   2.486µ ±  1%               2.547µ ±  2%  +2.45% (p=0.001 n=7)
Mutes/100_inhibition_rules,_last_rule_matches-10                                  18.44µ ±  1%               18.77µ ±  1%  +1.80% (p=0.001 n=7)
Mutes/1000_inhibition_rules,_last_rule_matches-10                                 176.9µ ±  1%               178.9µ ±  1%  +1.11% (p=0.007 n=7)
Mutes/10000_inhibition_rules,_last_rule_matches-10                                1.826m ±  3%               1.846m ±  4%       ~ (p=0.128 n=7)
Mutes/10_inhibition_rules,_5_sources,_100_inhibiting_alerts-10                                               2.237µ ±  1%
Mutes/100_inhibition_rules,_10_sources,_1000_inhibiting_alerts-10                                            3.543µ ±  1%
Mutes/1000_inhibition_rules,_20_sources,_100_inhibiting_alerts-10                                            7.054µ ± 65%
geomean                                                                           3.055µ                     3.187µ        +0.32%

                                                                  │ benchmark-inhibit-main.txt │ benchmark-inhibit-multiple-source-matchers.txt │
                                                                  │            B/op            │          B/op           vs base                │
Mutes/1_inhibition_rule,_1_inhibiting_alert-10                                      488.0 ± 0%              488.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/10_inhibition_rules,_1_inhibiting_alert-10                                    488.0 ± 0%              488.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/100_inhibition_rules,_1_inhibiting_alert-10                                   488.0 ± 0%              488.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/1000_inhibition_rules,_1_inhibiting_alert-10                                  488.0 ± 0%              488.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/10000_inhibition_rules,_1_inhibiting_alert-10                                 489.0 ± 0%              489.0 ±  0%       ~ (p=1.000 n=7)
Mutes/1_inhibition_rule,_10_inhibiting_alerts-10                                    488.0 ± 0%              488.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/1_inhibition_rule,_100_inhibiting_alerts-10                                   488.0 ± 0%              488.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/1_inhibition_rule,_1000_inhibiting_alerts-10                                  488.0 ± 0%              488.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/1_inhibition_rule,_10000_inhibiting_alerts-10                                 488.0 ± 0%              488.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/100_inhibition_rules,_1000_inhibiting_alerts-10                               488.0 ± 0%              488.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/10_inhibition_rules,_last_rule_matches-10                                     472.0 ± 0%              472.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/100_inhibition_rules,_last_rule_matches-10                                    472.0 ± 0%              472.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/1000_inhibition_rules,_last_rule_matches-10                                   472.0 ± 0%              472.0 ±  0%       ~ (p=1.000 n=7) ¹
Mutes/10000_inhibition_rules,_last_rule_matches-10                                  475.0 ± 1%              475.0 ±  0%       ~ (p=0.706 n=7)
Mutes/10_inhibition_rules,_5_sources,_100_inhibiting_alerts-10                                            1.070Ki ±  0%
Mutes/100_inhibition_rules,_10_sources,_1000_inhibiting_alerts-10                                         1.789Ki ±  0%
Mutes/1000_inhibition_rules,_20_sources,_100_inhibiting_alerts-10                                         4.793Ki ± 11%
geomean                                                                             483.7                   629.0        +0.00%
¹ all samples are equal

                                                                  │ benchmark-inhibit-main.txt │ benchmark-inhibit-multiple-source-matchers.txt │
                                                                  │         allocs/op          │       allocs/op         vs base                │
Mutes/1_inhibition_rule,_1_inhibiting_alert-10                                      10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/10_inhibition_rules,_1_inhibiting_alert-10                                    10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/100_inhibition_rules,_1_inhibiting_alert-10                                   10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1000_inhibition_rules,_1_inhibiting_alert-10                                  10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/10000_inhibition_rules,_1_inhibiting_alert-10                                 10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1_inhibition_rule,_10_inhibiting_alerts-10                                    10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1_inhibition_rule,_100_inhibiting_alerts-10                                   10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1_inhibition_rule,_1000_inhibiting_alerts-10                                  10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1_inhibition_rule,_10000_inhibiting_alerts-10                                 10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/100_inhibition_rules,_1000_inhibiting_alerts-10                               10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/10_inhibition_rules,_last_rule_matches-10                                     10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/100_inhibition_rules,_last_rule_matches-10                                    10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/1000_inhibition_rules,_last_rule_matches-10                                   10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/10000_inhibition_rules,_last_rule_matches-10                                  10.00 ± 0%               10.00 ± 0%       ~ (p=1.000 n=7) ¹
Mutes/10_inhibition_rules,_5_sources,_100_inhibiting_alerts-10                                               29.00 ± 0%
Mutes/100_inhibition_rules,_10_sources,_1000_inhibiting_alerts-10                                            50.00 ± 0%
Mutes/1000_inhibition_rules,_20_sources,_100_inhibiting_alerts-10                                            127.0 ± 2%
geomean                                                                             10.00                    13.59       +0.00%
¹ all samples are equal

coleenquadros avatar Nov 28 '25 16:11 coleenquadros