Improve route merging for queries that have conditions on different vindexes, but can be merged via join predicates.

Open arthurschreiber opened this issue 2 years ago • 1 comments

Description

This pull request improves query route merging for queries that consist of multiple joins (via vindex columns), where some of the joined tables also have additional vindex conditions defined.

An example can be found in https://github.com/vitessio/vitess/issues/10819, and in the test case that was added to go/vt/vtgate/planbuilder/testdata/select_cases.txt.

Previously, the Gen4 planner would break the query up into multiple pieces, one for each "direct" vindex condition defined, even though the query could be sent to a single shard based on the join conditions between all tables.

For the query from the added test case, the plan generated by Gen4 would look like this:

{
  "QueryType": "SELECT",
  "Original": "SELECT user.id FROM user INNER JOIN music_extra ON user.id = music_extra.user_id INNER JOIN music ON music_extra.user_id = music.user_id WHERE user.id = 123 and music.id = 456",
  "Instructions": {
    "OperatorType": "Join",
    "Variant": "Join",
    "JoinColumnIndexes": "R:0",
    "JoinVars": {
      "music_user_id": 0
    },
    "TableName": "music_`user`, music_extra",
    "Inputs": [
      {
        "OperatorType": "Route",
        "Variant": "EqualUnique",
        "Keyspace": {
          "Name": "user",
          "Sharded": true
        },
        "FieldQuery": "select music.user_id from music where 1 != 1",
        "Query": "select music.user_id from music where music.id = 456",
        "Table": "music",
        "Values": [
          "INT64(456)"
        ],
        "Vindex": "music_user_map"
      },
      {
        "OperatorType": "Route",
        "Variant": "EqualUnique",
        "Keyspace": {
          "Name": "user",
          "Sharded": true
        },
        "FieldQuery": "select `user`.id from `user`, music_extra where 1 != 1",
        "Query": "select `user`.id from `user`, music_extra where `user`.id = 123 and music_extra.user_id = :music_user_id and `user`.id = music_extra.user_id",
        "Table": "`user`, music_extra",
        "Values": [
          ":music_user_id"
        ],
        "Vindex": "user_index"
      }
    ]
  },
  "TablesUsed": [
    "user.music",
    "user.music_extra",
    "user.user"
  ]
}

The Gen4 planner would only consider merging EqualUnique routes if the routes matched completely, and would immediately give up on merging them if they didn't. With this change, when the routes do not match completely, the planner also considers the join predicates between them and merges the routes based on those predicates instead.
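
The merge decision described above can be illustrated with a small, self-contained sketch. This is not the planner's actual code: the `route`, `joinPredicate` and `canMerge` names are hypothetical, and the `shardingVindex` map assumes that `user.id`, `music_extra.user_id` and `music.user_id` all shard on `user_index`, which the plan above suggests but does not state outright. The sketch only models the idea that two EqualUnique routes on different vindexes may still be merged when a join predicate links columns that shard on the same vindex.

```go
package main

import "fmt"

// column names a table column such as music.user_id.
type column struct{ table, name string }

// route is a toy stand-in for an EqualUnique route (hypothetical type,
// not the Vitess planner's own data structure).
type route struct {
	tables map[string]bool // tables served by this route
	vindex string          // vindex chosen from the route's own WHERE condition
	value  string          // value that vindex is looked up with
}

// joinPredicate models an equality such as music_extra.user_id = music.user_id.
type joinPredicate struct{ left, right column }

// canMerge reports whether two routes can be sent to a single shard.
// shardingVindex is an assumed schema description: column -> vindex it shards on.
func canMerge(a, b route, preds []joinPredicate, shardingVindex map[column]string) bool {
	// Earlier behaviour: merge only when both routes match completely.
	if a.vindex == b.vindex && a.value == b.value {
		return true
	}
	// Additional check: look for a join predicate that spans both routes
	// and whose two columns shard on the same vindex.
	for _, p := range preds {
		spans := (a.tables[p.left.table] && b.tables[p.right.table]) ||
			(a.tables[p.right.table] && b.tables[p.left.table])
		if !spans {
			continue
		}
		lv, lok := shardingVindex[p.left]
		rv, rok := shardingVindex[p.right]
		if lok && rok && lv == rv {
			return true
		}
	}
	return false
}

func main() {
	music := route{tables: map[string]bool{"music": true}, vindex: "music_user_map", value: "456"}
	userExtra := route{tables: map[string]bool{"user": true, "music_extra": true}, vindex: "user_index", value: "123"}
	preds := []joinPredicate{
		{column{"user", "id"}, column{"music_extra", "user_id"}},
		{column{"music_extra", "user_id"}, column{"music", "user_id"}},
	}
	// Assumed vindex layout, for illustration only.
	shardingVindex := map[column]string{
		{"user", "id"}:             "user_index",
		{"music_extra", "user_id"}: "user_index",
		{"music", "user_id"}:       "user_index",
	}
	fmt.Println(canMerge(music, userExtra, preds, shardingVindex))
}
```

With the two routes from the plan above, the first check fails because the vindexes differ (`music_user_map` vs. `user_index`), but the predicate `music_extra.user_id = music.user_id` spans both routes and both columns shard on `user_index` under the assumed schema, so the sketch reports that the routes can be merged.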

Related Issue(s)

Fixes the incorrect routing identified in https://github.com/vitessio/vitess/issues/10819, but does not fix the incorrect query generation (which is a separate issue).

Checklist

  • [x] "Backport me!" label has been added if this change should be backported
  • [x] Tests were added or are not required
  • [x] Documentation was added or is not required

arthurschreiber, Aug 04 '22 15:08

Review Checklist

Hello reviewers! :wave: Please follow this checklist when reviewing this Pull Request.

General

  • [x] Ensure that the Pull Request has a descriptive title.
  • [x] If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • [x] If a new flag is being introduced, review whether it is really needed. The flag names should be clear and intuitive (as far as possible), and the flag's help should be descriptive.
  • [x] If a workflow is added or modified, each item in Jobs should be named in order to mark it as required. If the workflow should be required, the GitHub Admin should be notified.

Bug fixes

  • [x] There should be at least one unit or end-to-end test.
  • [x] The Pull Request description should either include a link to an issue that describes the bug OR an actual description of the bug and how to reproduce, along with a description of the fix.

Non-trivial changes

  • [x] There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • [x] Should be documented, either by modifying the existing documentation or creating new documentation.
  • [x] New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • [x] Protobuf changes should be wire-compatible.
  • [x] Changes to _vt tables and RPCs need to be backward compatible.
  • [x] vtctl command output order should be stable and awk-able.

vitess-bot[bot], Aug 04 '22 15:08

vtgate_tablet_healthcheck_cache was fixed in #10961. Totally unrelated, so we can merge this.

deepthi, Aug 16 '22 21:08