cortex-jsonnet
Bump gRPC max tx/rx to 100MB for ingester and distributor
This just changes the send and receive limits to match what was configured in the query-frontend (and changes the definition of 100MB from `100 << 20` to `1024 * 1024 * 100`). My logs are full of `ResourceExhausted desc = trying to send message larger than max`, and I'm guessing it's just an oversight that the ingester and distributor didn't get their default limits bumped to match.
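For context, a rough sketch of what the proposed change looks like, assuming the `ingester_args`/`distributor_args` maps and the `server.grpc-max-*-msg-size-bytes` flag names already used for the query-frontend in cortex-jsonnet (exact names may differ in your checkout):

```jsonnet
{
  // Match the 100MB limits already set on the query-frontend. Both spellings
  // are the same value: 100 << 20 == 1024 * 1024 * 100 == 104857600 bytes.
  ingester_args+:: {
    'server.grpc-max-recv-msg-size-bytes': 1024 * 1024 * 100,
    'server.grpc-max-send-msg-size-bytes': 1024 * 1024 * 100,
  },

  distributor_args+:: {
    'server.grpc-max-recv-msg-size-bytes': 1024 * 1024 * 100,
    'server.grpc-max-send-msg-size-bytes': 1024 * 1024 * 100,
  },
}
```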
Thanks @amckinley for opening this PR and raising the discussion around the gRPC message size limit. The current config is not an oversight, but we're aware of cases where the limit can be hit. Let me take a step back.
Cortex internally uses gRPC to communicate between different services. Different services use gRPC to transfer different types of data; for some communication we use gRPC streaming (which suffers less from this issue) and for others we don't.
In your setup you can increase the limits as a quick workaround, but I don't think it's wise to raise all the limits to 100MB by default. On the contrary, we should understand which channel hits the limit and why. If this happens between the ingester and querier when running the blocks storage, then it's a known issue we want to work on (https://github.com/cortexproject/cortex/issues/2945), so my suggestion would be to override it in your setup but not change the default here. If it happens anywhere else, please let us know where so we can investigate it further.
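If you go the per-setup override route suggested above, it might look roughly like this in your own environment; the import path and the `ingester_args` field name are assumptions based on the usual cortex-jsonnet layout, not something prescribed by this thread:

```jsonnet
// Hypothetical per-environment overlay (e.g. your main.jsonnet).
local cortex = import 'cortex/cortex.libsonnet';

cortex {
  // Raise the limit only for the service that actually hits it,
  // leaving the library defaults untouched.
  ingester_args+:: {
    'server.grpc-max-send-msg-size-bytes': 100 << 20,  // 100MB
    'server.grpc-max-recv-msg-size-bytes': 100 << 20,
  },
}
```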
Hi @pracucci, what is the purpose of these limits? It doesn't look like Cortex is capable of "chunking" any of the data it returns, so hitting these limits just causes hard failures. In my deployment, I've been forced to keep increasing these limits every time I find a new Grafana dashboard that refuses to render because of the max gRPC size. Most recently we hit `grpc: trying to send message larger than max (219597294 vs. 104857600)` when trying to render a dashboard that parameterizes on k8s namespace, in a cluster where we have ~1000 unique namespaces. Wouldn't it be better to leave these limits uncapped everywhere?
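For scale, the numbers in that error line up with the 100MB cap discussed above; a quick back-of-the-envelope check (plain arithmetic, nothing Cortex-specific):

```jsonnet
{
  limit_bytes: 100 << 20,               // 104857600 -- the "104857600" in the error
  attempted_bytes: 219597294,           // the rejected message size from the error
  attempted_mib: 219597294 / (1 << 20), // ~209.4 MiB, more than double the cap
}
```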