promql-engine icon indicating copy to clipboard operation
promql-engine copied to clipboard

Fuzz: Distributed query test case failure

Open yeya24 opened this issue 1 year ago • 5 comments

https://github.com/thanos-io/promql-engine/actions/runs/7394293573/job/20115438270

--- FAIL: FuzzDistributedEnginePromQLSmithRangeQuery (0.70s)
    --- FAIL: FuzzDistributedEnginePromQLSmithRangeQuery/seed#0 (0.70s)
        enginefuzz_test.go:443: load 30s
            			http_requests_total{pod="nginx-1", route="/"} 1.00+1.00x4
            			http_requests_total{pod="nginx-2", route="/"}  1+2.00x4
        enginefuzz_test.go:444: ceil(
              -(
                  {__name__="http_requests_total"} @ 77.320 offset 1m10s
                -
                  {__name__="http_requests_total"} offset -2m48s
              )
            )
        enginefuzz_test.go:446: case 24 error mismatch.
            new result: {pod="nginx-1", route="/"} =>
            -6 @[0]
            -6 @[30000]
            -6 @[60000]
            -6 @[90000]
            -6 @[[120](https://github.com/thanos-io/promql-engine/actions/runs/7394293573/job/20115438270#step:4:121)000]
            {pod="nginx-2", route="/"} =>
            -10 @[0]
            -10 @[30000]
            -10 @[60000]
            -10 @[90000]
            -10 @[120000]
            old result: {pod="nginx-1", route="/"} =>
            4 @[0]
            4 @[30000]
            4 @[60000]
            4 @[90000]
            4 @[120000]
            {pod="nginx-2", route="/"} =>
            8 @[0]
            8 @[30000]
            8 @[60000]
            8 @[90000]
            8 @[120000]
        enginefuzz_test.go:451: failed 1 test cases

Not sure if it is really a bug but the mismatched results seems concerning

yeya24 avatar Jan 03 '24 07:01 yeya24

Distributed Engine should be fuzzed without offset or @ i think

MichaHoffmann avatar Jan 03 '24 07:01 MichaHoffmann

@MichaHoffmann Is @ and offset not supported for distributed query engine?

If that's the case then thanos engine should return error and fallback to the old engine then

yeya24 avatar Jan 03 '24 07:01 yeya24

They are but iirc they are truncated ( here https://github.com/thanos-io/promql-engine/blob/662ae7d6e27dac1c18ab4a7deb19add136f51367/engine/distributed.go#L73 )

MichaHoffmann avatar Jan 03 '24 07:01 MichaHoffmann

It looks like this shouldn't affect @? As @ is at an absolute timestamp. At least in this example, Idk if it is the cause of this failed test case.

yeya24 avatar Jan 03 '24 07:01 yeya24

The engines are created with "zone" external labels but the actual series in them do not have the external label, i think that might be the issue.

MichaHoffmann avatar Jan 03 '24 07:01 MichaHoffmann