codex-blackboard icon indicating copy to clipboard operation
codex-blackboard copied to clipboard

Blackboard doesn't handle restarts well

Open cscott opened this issue 11 years ago • 5 comments

Perhaps clients are somehow doing a really expensive subscription temporarily on restart? Or is this just a bug.

Filed as https://github.com/meteor/meteor/issues/1770

cscott avatar Jan 19 '14 18:01 cscott

This is really our biggest biggest problem with using meteor at the moment. :(

cscott avatar Jan 17 '15 05:01 cscott

I assume one (if not the only) problem at startup in particular is the observeChanges call used for the Hubot integration has no filter, so the server has to load everything that hasn't been archived to OldMessages just to ignore it. If you added an index on just message timestamp and made that observe filter on timestamp > now, the initial load would be instant because there would be no messages matching the filter, but new messages would match and get processed. That might even make it so you don't need OldMessages anymore.

Torgen avatar Dec 09 '18 19:12 Torgen

I think I found your expensive subscription: messages-in-range-to-me. I did a query explain on messages.find({room_name: 'general/0', timestamp: { $gt: 45}, $or: [ {nick: 'torgen'}, {to: 'torgen'} ]} and the winning plan was:

"stage" : "FETCH",
"filter" : {
  "$and" : [
    {
      "room_name" : {
        "$eq" : "general/0"
      }
    },
    {
      "timestamp" : {
        "$gt" : 45
      }
    }
  ]
},
"inputStage" : {
  "stage" : "OR",
  "inputStages" : [
    {
      "stage" : "IXSCAN",
      "keyPattern" : {
        "to" : 1,
        "room_name" : 1,
        "timestamp" : -1
      },
      "indexName" : "to_1_room_name_1_timestamp_-1",
      "isMultiKey" : false,
      "multiKeyPaths" : {
        "to" : [ ],
        "room_name" : [ ],
        "timestamp" : [ ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "to" : [
          "[\"torgen\", \"torgen\"]"
        ],
        "room_name" : [
          "[MinKey, MaxKey]"
        ],
        "timestamp" : [
          "[MaxKey, MinKey]"
        ]
      }
    },
    {
      "stage" : "IXSCAN",
      "keyPattern" : {
        "nick" : 1,
        "room_name" : 1,
        "timestamp" : -1
      },
      "indexName" : "nick_1_room_name_1_timestamp_-1",
      "isMultiKey" : false,
      "multiKeyPaths" : {
        "nick" : [ ],
        "room_name" : [ ],
        "timestamp" : [ ]
      },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : {
        "nick" : [
                "[\"torgen\", \"torgen\"]"
        ],
        "room_name" : [
                "[MinKey, MaxKey]"
        ],
        "timestamp" : [
                "[MaxKey, MinKey]"
        ]
      }
    }
  ]
}

Notice the index bounds for both room_name and timestamp. Here's the winning query plan of messages.find({room_name: 'general/0', timestamp: {$gt: 45}, nick: 'torgen'}):

"stage" : "FETCH",
"inputStage" : {
  "stage" : "IXSCAN",
  "keyPattern" : {
    "nick" : 1,
    "room_name" : 1,
    "timestamp" : -1
  },
  "indexName" : "nick_1_room_name_1_timestamp_-1",
  "isMultiKey" : false,
  "multiKeyPaths" : {
    "nick" : [ ],
    "room_name" : [ ],
    "timestamp" : [ ]
  },
  "isUnique" : false,
  "isSparse" : false,
  "isPartial" : false,
  "indexVersion" : 2,
  "direction" : "forward",
  "indexBounds" : {
    "nick" : [
      "[\"torgen\", \"torgen\"]"
    ],
    "room_name" : [
      "[\"general/0\", \"general/0\"]"
    ],
    "timestamp" : [
      "[inf.0, 45.0)"
    ]
  }
}

Much better index bounds for room_name and timestamp. The plan for { to: 'torgen' } is just as good. In fact if you make it to: $in: [null, 'torgen'] it's still good.

Torgen avatar Jan 09 '19 07:01 Torgen

If you distribute the common query parameters into the $or, you get a good plan:

"stage" : "SUBPLAN",
"inputStage" : {
  "stage" : "FETCH",
  "inputStage" : {
    "stage" : "OR",
    "inputStages" : [
      {
        "stage" : "IXSCAN",
        "keyPattern" : {
          "to" : 1,
          "room_name" : 1,
          "timestamp" : -1
        },
        "indexName" : "to_1_room_name_1_timestamp_-1",
        "isMultiKey" : false,
        "multiKeyPaths" : {
          "to" : [ ],
          "room_name" : [ ],
          "timestamp" : [ ]
        },
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 2,
        "direction" : "forward",
        "indexBounds" : {
          "to" : [
            "[\"torgen\", \"torgen\"]"
          ],
          "room_name" : [
            "[\"general/0\", \"general/0\"]"
          ],
          "timestamp" : [
            "[inf.0, 45.0)"
          ]
        }
      },
      {
        "stage" : "IXSCAN",
        "keyPattern" : {
          "nick" : 1,
          "room_name" : 1,
          "timestamp" : -1
        },
        "indexName" : "nick_1_room_name_1_timestamp_-1",
        "isMultiKey" : false,
        "multiKeyPaths" : {
          "nick" : [ ],
          "room_name" : [ ],
          "timestamp" : [ ]
        },
        "isUnique" : false,
        "isSparse" : false,
        "isPartial" : false,
        "indexVersion" : 2,
        "direction" : "forward",
        "indexBounds" : {
          "nick" : [
                  "[\"torgen\", \"torgen\"]"
          ],
          "room_name" : [
                  "[\"general/0\", \"general/0\"]"
          ],
          "timestamp" : [
                  "[inf.0, 45.0)"
          ]
        }
      }
    ]
  }
}

So the problem seems to be that mongodb's query planner doesn't understand the distributive property.

Torgen avatar Jan 09 '19 15:01 Torgen

Tried explain('executionStats') on both versions with some actual data. Possibly because there's some overlap between from someone and to nobody, the defactored version was slightly slower, but it was like 2ms to 1ms, so maybe MongoDB knows what it's doing, but that does mean that the nick + room_name + timestamp index is useless.

Torgen avatar Jan 12 '19 06:01 Torgen