Cross-tenant queries produce cardinality estimation cache keys that are too long
Describe the bug
When running cross-tenant queries with the query-frontend results cache enabled, the query caching middleware attempts to concatenate all tenant IDs into the cache key (code). Memcached enforces a 250-byte limit on keys (code), so with enough tenants (or long enough tenant names) the key can exceed that limit, at which point cache gets fail.
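For illustration, here's a minimal sketch of the failure mode; the function and constant names are hypothetical, not Mimir's actual code, but it shows how joining tenant IDs into a single key blows past memcached's 250-byte limit:

```go
package main

import (
	"fmt"
	"strings"
)

// memcachedMaxKeyLen is the memcached protocol limit on key length.
const memcachedMaxKeyLen = 250

// buildCacheKey mimics (hypothetically) concatenating all tenant IDs into a
// single cache key, the pattern described in this issue.
func buildCacheKey(prefix string, tenantIDs []string) string {
	return prefix + strings.Join(tenantIDs, "|")
}

func main() {
	// With enough tenants (or long enough names), the key exceeds the limit.
	tenants := make([]string, 30)
	for i := range tenants {
		tenants[i] = fmt.Sprintf("long-tenant-name-%02d", i) // ~19 bytes each
	}
	key := buildCacheKey("QS:", tenants)
	fmt.Printf("key length = %d, exceeds limit = %v\n", len(key), len(key) > memcachedMaxKeyLen)
}
```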
To Reproduce
Steps to reproduce the behavior:
- Create multiple tenants whose names, concatenated, exceed 250 bytes in length.
- Query across all tenants using the wildcard tenant.
- See malformed key logs on memcached fetches.
Expected behavior
Keys should be short enough that they don't exceed the 250-byte limit. Perhaps that means that cache keys should be created and looked up on a per-tenant basis?
One potential solution to this is to sort the tenant IDs (the parsing logic might already do this) and hash them and use the hash in the cache key instead of the tenant IDs directly.
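As a rough sketch of that idea (assuming a SHA-256 hash; this is illustrative, not Mimir's implementation), sorting the tenant IDs first makes the key order-independent, and hashing keeps the memcached key at a fixed length regardless of how many tenants are involved:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// tenantHashKey returns a fixed-length cache key component derived from the
// tenant IDs: sort them so ordering doesn't matter, then hash the joined string.
func tenantHashKey(tenantIDs []string) string {
	ids := append([]string(nil), tenantIDs...)
	sort.Strings(ids)
	sum := sha256.Sum256([]byte(strings.Join(ids, "|")))
	return hex.EncodeToString(sum[:])
}

func main() {
	// The key component is always 64 hex characters, regardless of tenant count.
	fmt.Println(tenantHashKey([]string{"tenant-b", "tenant-a", "tenant-c"}))
}
```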
The cache key is then hashed, and the hash is what's used to key memcached. The full key is stored as part of the cached item to check for hash collisions, but it isn't used verbatim as the memcached key.
https://github.com/grafana/mimir/blob/efa26e743027bf495c340a893939b2671ba05959/pkg/frontend/querymiddleware/split_and_cache.go#L458
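For reference, a hedged sketch of the pattern described above (hypothetical names, not the linked Mimir code): the logical key is hashed for use as the memcached key, and the full key is stored inside the item so a hash collision can be detected on read:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

type cachedItem struct {
	Key  string // full, un-hashed key, kept to detect hash collisions
	Data []byte
}

// hashKey produces the fixed-length key actually sent to memcached.
func hashKey(key string) string {
	sum := sha256.Sum256([]byte(key))
	return hex.EncodeToString(sum[:])
}

// lookup returns the cached data only if the stored full key matches the
// requested one, guarding against hash collisions.
func lookup(cache map[string]cachedItem, fullKey string) ([]byte, bool) {
	item, ok := cache[hashKey(fullKey)]
	if !ok || item.Key != fullKey {
		return nil, false
	}
	return item.Data, true
}

func main() {
	cache := map[string]cachedItem{}
	full := "QS:tenant-a|tenant-b:up:3600" // hypothetical logical key
	cache[hashKey(full)] = cachedItem{Key: full, Data: []byte("cached response")}

	if data, ok := lookup(cache, full); ok {
		fmt.Println(string(data))
	}
}
```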
Ah, you're right, and the hashed key gets read back out here; I missed that. So this isn't a problem for the results cache, but it does seem to be a problem for a cache somewhere, most likely the cardinality estimation cache, since I do see the QS prefix in the logs.
It seems we have been hitting the same issue since adding new tenants two days ago.
We have the following in the query-frontend logs:
{"time":"2023-12-19T17:27:04.952051693Z","message":"ts=2023-12-19T17:27:04.950734513Z caller=memcached_client.go:335 level=warn name=frontend-cache msg="failed to fetch items from memcached" numKeys=1 firstKey=1@QS:
When I check the combined length of all the tenant IDs, it exceeds 250 characters. @chencs Are you seeing similar log entries?
@rnerkar yes, except my logs don't include that tenant key. My understanding is that the cardinality estimation cache is used to limit the number of shards a query can be broken into for parallelization (code).
From reading this issue describing the goals for implementing cardinality estimation for query sharding, I believe this means that the failure mode (not being able to fetch cardinality estimation cache keys) potentially results in unnecessary over-sharding (a nuisance with respect to overhead), but not unnecessary under-sharding (OOMs resulting in query failure).
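To make that failure mode concrete, here's a purely illustrative sketch (hypothetical constants and logic, not Mimir's actual sharding code) of how a cardinality estimate could cap the shard count, and why a failed cache fetch only leads to over-sharding:

```go
package main

import "fmt"

const (
	maxShards            = 64    // hypothetical upper bound on query shards
	targetSeriesPerShard = 10000 // hypothetical target load per shard
)

// shardCount picks a shard count from an estimated series cardinality.
// When the estimate is unavailable (e.g. the cache fetch failed), it falls
// back to the maximum, so the query is over-sharded rather than under-sharded.
func shardCount(estimatedSeries int64, haveEstimate bool) int {
	if !haveEstimate {
		return maxShards
	}
	n := int(estimatedSeries)/targetSeriesPerShard + 1
	if n > maxShards {
		return maxShards
	}
	return n
}

func main() {
	fmt.Println(shardCount(25000, true)) // small query: few shards
	fmt.Println(shardCount(0, false))    // cache miss: max shards (over-sharding)
}
```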
Thanks for explaining, @chencs. So, with the understanding that these errors have no adverse effects on Mimir's performance, other than small queries getting over-sharded: do we have a fix coming in, or is there something to be done from our end? We have had talks about using Redis instead of Memcached to get around this issue, but that's still not finalized. Will wait till I hear back from you guys. Thanks in advance!
Do we have a fix coming in, or something to be done from our end?
We apply a hash to keys to make them a consistent length elsewhere; we'll do the same here. No ETA, but it should be a simple fix.
We have had talks about using Redis instead of Memcached to get around this issue, but that's still not finalized. Will wait till I hear back from you guys. Thanks in advance!
We don't recommend using Redis since we (Grafana) don't use it in production and it's not part of the Helm chart or Jsonnet.
We're hitting this issue as well.
Note that it also seems to increment thanos_memcached_operation_failures_total, so we actually get alerted on this.