
McRouter not forwarding gets / sets / deletes for largish values

Open danielbeardsley opened this issue 3 years ago • 2 comments

Mcrouter has been performing very well for us for years, but recently we've noticed a problem that's undermining our faith in it. Sometimes, for several minutes at a time, mcrouter just stops forwarding sets and deletes (and some gets) for a particularly large value to both machines in our pool. During those windows, the set and delete counts diverge sharply and the commands only make it to one of the two machines. At other times, and for other slabs, the counts are even.

Here's an example of the metadata about one of the keys in question in memcache: key={the key name in question} exp=1631856704 la=1631856415 cas=1382190321 fetch=yes cls=39 size=831500

The size is ~831 KB (under the 1 MB memcached limit, and we don't have value splitting turned on) and the expiry is ~5 minutes.

Here's a graph of command counts between our two memcache machines in the pool. At other times, and for other slabs, the values are nearly identical. But occasionally (maybe once a day, though it's becoming more frequent) we see these imbalances, which lead to almost constant cache misses (because we use AllFastestRoute for gets).

[image: graph of diverging command counts between the two machines]

A reduced view of our mcrouter config: [image attachment; full config is pasted below]

Something of note: each time we see one of these anomalies, mcrouter stops sending the commands to one of the two machines, seemingly at random (and not the same machine each time).

CC @djmetzle @andyg0808 @sctice-ifixit

danielbeardsley avatar Oct 07 '21 17:10 danielbeardsley

Hi there! My initial instinct is that it could be the server-timeout setting: the value may be too large to transfer within the time window. Can you check this setting first?

https://github.com/facebook/mcrouter/blob/main/mcrouter/mcrouter_options_list.h#L617

Additionally, 831 KB is not so large that mcrouter should be unable to handle it. However, I have seen others set a smaller value threshold via the big-value route (see https://github.com/facebook/mcrouter/blob/main/mcrouter/mcrouter_options_list.h#L119). This comes with a tradeoff, though: pieces of the data get distributed across more than one machine.

Also, could you share more of the options and the routing configuration (with proprietary info redacted, please)? E.g., command-line parameters. That would help us understand the problem better. An easy way to capture the routing side is the preprocessed-config dump:

https://github.com/facebook/mcrouter/wiki/Admin-requests#get-__mcrouter__preprocessed_config

Thanks!

djvaporize avatar Oct 08 '21 17:10 djvaporize

My initial instinct is that it could be the server-timeout setting where the value is too large to transfer in the time window. Can you first check this setting?

Man, that sounds like just the thing. But we're at the default of 1000ms; still, I may experiment.
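For what it's worth, a quick back-of-envelope calculation suggests 1000ms should be ample for an 831,500-byte value on any healthy link (the link speeds below are assumptions for illustration, not measurements from our network):

```python
# Rough transfer-time estimate for the 831,500-byte value from the issue.
# Link speeds are hypothetical; real networks add latency and contention.
VALUE_BYTES = 831_500

def transfer_ms(link_mbps: float) -> float:
    """Milliseconds to move VALUE_BYTES over a link of link_mbps megabits/s."""
    bits = VALUE_BYTES * 8
    return bits / (link_mbps * 1_000_000) * 1000

print(f"1 Gbps:   {transfer_ms(1000):.1f} ms")  # ~6.7 ms
print(f"100 Mbps: {transfer_ms(100):.1f} ms")   # ~66.5 ms
print(f"10 Mbps:  {transfer_ms(10):.1f} ms")    # ~665 ms, nearing the 1000 ms budget
```

So only a badly congested or degraded link would blow the default timeout on a value this size, which fits the observation that the problem is intermittent.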

Here's the cli options:

/usr/local/bin/mcrouter -f /var/run/mcrouter/mcrouter.conf -a /var/spool/mcrouter --port 11222 --probe-timeout-initial 100 --big-value-split-threshold 500000 --timeouts-until-tko 3 --use-asynclog-version2

And confirming that default value:

$> echo "get __mcrouter__.options" | nc 127.0.0.1 11222 | grep server_timeout
server_timeout_ms 1000

Note: to work around our issue, we added --big-value-split-threshold. The underlying problem is still there; we've just sidestepped it (831 KB values weren't being propagated to both pools).
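For context, the idea behind big-value splitting is to break an over-threshold value into sub-threshold chunks plus a small index record, so no single memcached item exceeds the limit. This is an illustrative sketch of the general technique only; the chunk-key naming (`key:chunk:N`) and index format here are hypothetical, not mcrouter's actual on-wire scheme:

```python
# Illustrative big-value splitting: chunk keys and index layout are
# hypothetical, for explanation only (not mcrouter's real format).
THRESHOLD = 500_000  # matches --big-value-split-threshold 500000 above

def split_value(key: str, value: bytes, threshold: int = THRESHOLD) -> dict:
    """Return the records to store: the value itself if small enough,
    otherwise chunk records plus an index record holding the chunk count."""
    if len(value) <= threshold:
        return {key: value}
    chunks = [value[i:i + threshold] for i in range(0, len(value), threshold)]
    records = {f"{key}:chunk:{i}": c for i, c in enumerate(chunks)}
    records[key] = str(len(chunks)).encode()  # index record
    return records

def join_value(key: str, records: dict) -> bytes:
    """Reassemble a split value from its chunk records."""
    n = int(records[key])
    return b"".join(records[f"{key}:chunk:{i}"] for i in range(n))

# The 831,500-byte value from the issue splits into 2 chunks + 1 index record.
records = split_value("big", b"x" * 831_500)
assert len(records) == 3
assert join_value("big", records) == b"x" * 831_500
```

This also makes the tradeoff djvaporize mentioned concrete: one logical value now depends on several physical items, possibly on different machines.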

Here's our config:

{
  "pools": {
    "A": {
      "servers": [
        "10.0.1.X:11211"
      ]
    },
    "B": {
      "servers": [
        "10.0.1.Y:11211"
      ]
    }
  },
  "route": {
    "type": "OperationSelectorRoute",
    "operation_policies": {
      "get": {
        "type": "AllFastestRoute",
        "children": [
          "PoolRoute|A",
          "PoolRoute|B"
        ]
      },
      "set": {
        "type": "AllFastestRoute",
        "children": [
          "PoolRoute|A",
          "PoolRoute|B"
        ]
      },
      "add": {
        "type": "AllSyncRoute",
        "children": [
          "PoolRoute|A",
          "PoolRoute|B"
        ]
      },
      "delete": {
        "type": "AllAsyncRoute",
        "children": [
          "PoolRoute|A",
          "PoolRoute|B"
        ]
      },
      "incr": {
        "type": "AllSyncRoute",
        "children": [
          "PoolRoute|A",
          "PoolRoute|B"
        ]
      },
      "decr": {
        "type": "AllSyncRoute",
        "children": [
          "PoolRoute|A",
          "PoolRoute|B"
        ]
      }
    }
  }
}

danielbeardsley avatar Oct 17 '21 16:10 danielbeardsley