vitess icon indicating copy to clipboard operation
vitess copied to clipboard

Force PK index usage in copy table vstreamer and vdiff queries

Open mattlord opened this issue 3 years ago • 3 comments

Description

In the vreplication vstreamer and vdiff components we rely on the primary key for uniqueness and ordering. It's possible, however, that the optimizer may choose a secondary index for the queries used to read rows from the source (invalid index stats, etc).

This PK index will always be the best option — when available — given the current ORDER BY PK clauses combined with the fact that we typically select all of the columns. Given that InnoDB uses a clustered index for the PK (index organized tables) this means that we can do a covering index scan and avoid the need for a filesort when using the PK.

To avoid the potential for using a secondary index for any reason when an explicit PK is available, we add a force index (PRIMARY) clause to the SELECT queries used on the source for row streaming (copy_table) vstreams and on the source and target for VDiffs.

⚠️ NOTE: this work excludes vstreams unrelated to RowStreamer (for copying tables) and should have no impact on materialization and messaging as the optimal key will vary for those streams.

Related Issue(s)

  • ???

Checklist

  • [x] Should this PR be backported? NO
  • [ ] Tests were added or are not required
  • [ ] Documentation was added or is not required

mattlord avatar Mar 08 '22 19:03 mattlord

Messaging uses vstream under the hood, and we specifically want it to use a secondary index. Would this change impact how this executes?

Query setup https://github.com/vitessio/vitess/blob/main/go/vt/vttablet/tabletserver/messager/message_manager.go#L254-L256

Vstream execution https://github.com/vitessio/vitess/blob/main/go/vt/vttablet/tabletserver/messager/message_manager.go#L910-L921

derekperkins avatar Mar 08 '22 20:03 derekperkins

Hi @derekperkins,

Messaging uses vstream under the hood, and we specifically want it to use a secondary index. Would this change impact how this executes?

Query setup https://github.com/vitessio/vitess/blob/main/go/vt/vttablet/tabletserver/messager/message_manager.go#L254-L256

Vstream execution https://github.com/vitessio/vitess/blob/main/go/vt/vttablet/tabletserver/messager/message_manager.go#L910-L921

Good question! Thank you for that. I will have to be sure that we are not forcing the PK index for messaging and for materialization.

mattlord avatar Mar 08 '22 21:03 mattlord

This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:

  • Push additional commits to the associated branch.
  • Remove the stale label.
  • Add a comment indicating why it is not stale.

If no action is taken within 7 days, this PR will be closed.

github-actions[bot] avatar Aug 07 '22 01:08 github-actions[bot]

This PR was closed because it has been stale for 7 days with no activity.

github-actions[bot] avatar Aug 14 '22 01:08 github-actions[bot]