execution: partial responses in distributed engine

Open MichaHoffmann opened this issue 1 year ago • 2 comments

For very distributed setups we need to be able to deal with partial failures. This commit adds an option to continue evaluation if we encounter an error in a remote engine but don't want to fail the whole query.
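To make the idea concrete, here is a minimal sketch of what such an option could look like. It is not the code from this change: `remoteEngine`, `evalOptions`, `allowPartialResponse`, and `evaluate` are hypothetical names used only for illustration, and the real engine evaluates remote queries through its own operator tree rather than a simple loop.

```go
package main

import (
	"errors"
	"fmt"
)

// remoteEngine is a hypothetical stand-in for a remote query engine.
type remoteEngine struct {
	name string
	eval func() ([]float64, error)
}

// evalOptions is a hypothetical option set; allowPartialResponse mirrors
// the "continue evaluation on remote error" behaviour described above.
type evalOptions struct {
	allowPartialResponse bool
}

// result carries the merged samples plus any errors downgraded to warnings.
type result struct {
	samples  []float64
	warnings []error
}

// evaluate queries every engine in turn. With allowPartialResponse set,
// a failing engine is recorded as a warning and skipped; otherwise the
// first error fails the whole query.
func evaluate(engines []remoteEngine, opts evalOptions) (result, error) {
	var res result
	for _, e := range engines {
		samples, err := e.eval()
		if err != nil {
			wrapped := fmt.Errorf("remote engine %s: %w", e.name, err)
			if !opts.allowPartialResponse {
				return result{}, wrapped
			}
			res.warnings = append(res.warnings, wrapped)
			continue
		}
		res.samples = append(res.samples, samples...)
	}
	return res, nil
}

// exampleEngines returns one healthy and one failing engine.
func exampleEngines() []remoteEngine {
	return []remoteEngine{
		{name: "a", eval: func() ([]float64, error) { return []float64{1, 2}, nil }},
		{name: "b", eval: func() ([]float64, error) { return nil, errors.New("unavailable") }},
	}
}

func main() {
	res, err := evaluate(exampleEngines(), evalOptions{allowPartialResponse: true})
	fmt.Println(len(res.samples), len(res.warnings), err)
}
```

With the option off, the same call returns the wrapped error from engine `b` and no samples, which matches the current all-or-nothing behaviour.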

MichaHoffmann avatar Aug 12 '24 12:08 MichaHoffmann

I am not sure if this belongs in the engine to be fair. We already have this feature in Thanos so we could convert error responses to warnings when executing remote queries: https://github.com/thanos-io/thanos/blob/main/pkg/query/remote_engine.go#L231

fpetkovski avatar Aug 15 '24 06:08 fpetkovski

> I am not sure if this belongs in the engine to be fair. We already have this feature in Thanos so we could convert error responses to warnings when executing remote queries: https://github.com/thanos-io/thanos/blob/main/pkg/query/remote_engine.go#L231

I'm kind of going back and forth on this, but currently I feel that because the query engine orchestrates all the requests, it is the right place to make decisions about partial responses too. This is mostly vibe-based though; it just feels better to me right now.

MichaHoffmann avatar Sep 01 '24 15:09 MichaHoffmann