thanos
thanos copied to clipboard
Query Frontend: add query instant tripperware for query sharding
Signed-off-by: Ben Ye [email protected]
- [ ] I added CHANGELOG entry for this change.
- [ ] Change is not relevant to the end user.
Changes
This pr adds a simple tripperware for sharding instant queries.
Verification
One question, do you think it makes sense to have separate sharding number for query range and query instant? It would mean we change --query-frontend.vertical-shards
to
--query-range.vertical-shards
--query-instant.vertical-shards
which is a "breaking change", but we also haven't released the feature yet.
One question, do you think it makes sense to have separate sharding number for query range and query instant? It would mean we change
--query-frontend.vertical-shards
to--query-range.vertical-shards --query-instant.vertical-shards
which is a "breaking change", but we also haven't released the feature yet.
What about using the same NumShards
flag?
type Config struct {
QueryRangeConfig
LabelsConfig
DownstreamTripperConfig
CortexHandlerConfig *transport.HandlerConfig
CompressResponses bool
CacheCompression string
RequestLoggingDecision string
DownstreamURL string
ForwardHeaders []string
NumShards int
}
Now the NumShards
is defined in the root level instead of inside QueryRangeConfig
so I think it is totally okay to use it in instant queries.
Maintaining two flags are also okay but I cannot think of a usecase for setting vertical shards differently for different query types.
Got it. Let's start with a single flag and we can revisit if a use case comes up.
After we vendor the cortex package internally, is there a way to import the cortex proto file we vendored? Tried multiple times but didn't find a way to get it working with existing genproto.sh
script.
After we vendor the cortex package internally, is there a way to import the cortex proto file we vendored? Tried multiple times but didn't find a way to get it working with existing
genproto.sh
script.
Yeah, I think it hasn't been updated. You probably need to add -I
option that points to the Cortex directory.
Still cannot get it working by adding the Cortex dir to the proto dependency. Wondering what's the future plan on this. Are we allowed to modify the vendored Cortex code for some use cases? We may need to build the Cortex proto ourselves.
Could you share the command that fails for you currently?
Could you share the command that fails for you currently?
Thanks. I figured out a way to compile the Cortex protobuf instead and make changes on it directly. Not sure if it is a acceptable way but I got it working finally. Seems query response parsing and instant query sharding are working properly on my local tests.
I will add more tests soon.
This is ready for another round of review. @fpetkovski @GiedriusS
This looks good, my only concern is that we need to modify the vendored Cortex files. I am not sure though what's the strategy for keeping up to date with upstream Cortex.
Yeah, I see your concern. I am not sure about this as well. If needed I can try to avoid modifying the Cortex code. But this causes more code duplication. For example, we need to copy Cortex unexposed functions like merging query stats, etc.
Even though it's not ideal for syncing up with the upstream Cortex, we'll need to modify the vendored Cortex files. If I'm not mistaken, @GiedriusS also working on a PR that modifies the vendored files. It'll be a half-automated process to upgrade Cortex in Thanos, but I don't see any other options. If someone has a brilliant idea to do this, please let us know :)
Amazing stuff, one of our instant queries went from 36s to 8s.
Whoops, I didn't expect the PR got merged so quickly. I haven't added some unit tests and changelog, too. I will add them in follow up prs.
One important thing I want to make sure is to have PromQL compatibility test for Thanos Query behind the query frontend to make sure no regression. Since the pr is already large I will do it in another pr, too.