VReplication: pooling vstreamers on source tablets

Open rohit-nayak-ps opened this issue 4 years ago • 0 comments

Motivation

In VReplication workflows, streams run on target shards and pull data from source shards. When the targets are heavily sharded, there can be several concurrent streams across different target shards sourcing from the same source tablets.

Each target stream is associated with a single vstreamer running on the source tablet. Each vstreamer streams all the binlogs by impersonating a replica. This can put a heavy load on the underlying MySQL server, both in the number of open replica connections and in CPU usage.

Goal

The aim is for multiple vstreamers to share a single binlog stream where possible, using a single-producer, multiple-consumer design. This significantly reduces the number of replicas seen by the MySQL server. Each binlog event pulled by the producer is broadcast to all vstreamers subscribed to that binlog stream.
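The single-producer, multiple-consumer idea could be sketched roughly as below. This is only an illustration: `BinlogEvent`, `Producer`, `Subscribe`, `Unsubscribe` and `Broadcast` are hypothetical names, not existing Vitess APIs, and positions are simplified to integers rather than GTIDs.

```go
package main

import (
	"fmt"
	"sync"
)

// BinlogEvent is a simplified, hypothetical stand-in for a binlog event.
type BinlogEvent struct {
	Pos     int64 // simplified position; real streams track GTIDs
	Payload string
}

// Producer owns the single binlog stream and fans events out to consumers.
type Producer struct {
	mu     sync.Mutex
	nextID int
	subs   map[int]chan BinlogEvent
}

func NewProducer() *Producer {
	return &Producer{subs: make(map[int]chan BinlogEvent)}
}

// Subscribe registers a consumer with a buffered channel of size buf.
func (p *Producer) Subscribe(buf int) (int, <-chan BinlogEvent) {
	p.mu.Lock()
	defer p.mu.Unlock()
	id := p.nextID
	p.nextID++
	ch := make(chan BinlogEvent, buf)
	p.subs[id] = ch
	return id, ch
}

// Unsubscribe removes a consumer (for example a straggler) and closes its channel.
func (p *Producer) Unsubscribe(id int) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if ch, ok := p.subs[id]; ok {
		close(ch)
		delete(p.subs, id)
	}
}

// Broadcast delivers ev to every subscriber. It blocks when a subscriber's
// buffer is full, so one slow consumer stalls all others in this sketch.
func (p *Producer) Broadcast(ev BinlogEvent) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for _, ch := range p.subs {
		ch <- ev
	}
}

// Close closes every remaining subscriber channel.
func (p *Producer) Close() {
	p.mu.Lock()
	defer p.mu.Unlock()
	for id, ch := range p.subs {
		close(ch)
		delete(p.subs, id)
	}
}

func main() {
	p := NewProducer()
	_, a := p.Subscribe(4)
	_, b := p.Subscribe(4)
	for i := int64(1); i <= 3; i++ {
		p.Broadcast(BinlogEvent{Pos: i, Payload: fmt.Sprintf("event-%d", i)})
	}
	p.Close()
	for ev := range a {
		fmt.Println("consumer a:", ev.Pos, ev.Payload)
	}
	for ev := range b {
		fmt.Println("consumer b:", ev.Pos, ev.Payload)
	}
}
```

The blocking `Broadcast` is deliberate here: it makes visible why a slow target would hold back every other stream on the same producer, which is the straggler problem a real design would need to handle.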

Design

Considerations

  • Load on the source tablets can affect query serving: it reduces QPS if the source tablets are primaries and increases replication lag if they are replicas.
  • Each source vstreamer needs to process every binlog event and filter out events that are not meant for the target it is streaming to (using in_keyrange filters).
  • Each event (with exceptions) sent to the target gets converted to a DML query and applied on the server. The next event is sent to the target only after the current one has been processed. Thus, if a target is slow in applying events, it can lag behind other targets.
  • For most deployments, streams on a single source tablet should have low lags (on the order of seconds). Hence their positions should be reasonably close to one another.
  • MoveTables and Resharding workflows need to run to completion, meaning all targets need to sync up to a common position before cutting traffic over.
  • MoveTables and Resharding workflows are not in the query serving path, hence it should be fine to trade off time-to-completion of these workflows with a reduced load on the source tablets.
  • Certain Materialize streams may be expected to run in near real time (like aggregations used for reporting).
  • It is also possible that the source is not overloaded, in which case a naive pooling strategy might unnecessarily slow down the workflows.
  • A target may go down temporarily, causing its streams to stall.

Requirements

  • Allow streams with comparable lags to join the same producer
  • If a consumer starts straggling, move it to another (or its own) producer
  • Allow pooling to be disabled for certain workflows
  • Disable or minimize pooling where the source is not under load and/or we want VReplication to be prioritized
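One way these requirements might fit together is sketched below. Everything here is an assumption for illustration: `Pool`, `Producer`, `Join` and `EvictStragglers` are hypothetical names, positions are plain integers standing in for GTID distance, and how a consumer actually catches up to a producer's position is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// Producer is a hypothetical shared binlog stream at a simplified position.
type Producer struct {
	Pos       int64
	Dedicated bool             // true when pooling is disabled for its consumer
	Consumers map[string]int64 // consumer name -> last applied position
}

// Pool assigns consumers to producers. MaxLag stands in for the allowed
// GTID distance between a consumer and the producer it joins.
type Pool struct {
	MaxLag    int64
	Producers []*Producer
}

func abs(x int64) int64 {
	if x < 0 {
		return -x
	}
	return x
}

// add creates a new producer serving a single consumer and returns its index.
func (p *Pool) add(name string, pos int64, dedicated bool) int {
	p.Producers = append(p.Producers, &Producer{
		Pos:       pos,
		Dedicated: dedicated,
		Consumers: map[string]int64{name: pos},
	})
	return len(p.Producers) - 1
}

// Join returns the index of the producer that will serve the consumer.
// Pooled consumers join the first shared producer within MaxLag; others,
// and consumers with pooling disabled, get a producer of their own.
func (p *Pool) Join(name string, pos int64, pooled bool) int {
	if pooled {
		for i, pr := range p.Producers {
			if !pr.Dedicated && abs(pr.Pos-pos) <= p.MaxLag {
				pr.Consumers[name] = pos
				return i
			}
		}
	}
	return p.add(name, pos, !pooled)
}

// EvictStragglers moves each consumer trailing its producer by more than
// MaxLag onto its own producer and returns the moved names, sorted.
func (p *Pool) EvictStragglers() []string {
	var moved []string
	existing := p.Producers // producers added below are not rescanned
	for _, pr := range existing {
		for name, pos := range pr.Consumers {
			if pr.Pos-pos > p.MaxLag {
				delete(pr.Consumers, name)
				p.add(name, pos, false)
				moved = append(moved, name)
			}
		}
	}
	sort.Strings(moved)
	return moved
}

func main() {
	pool := &Pool{MaxLag: 10}
	fmt.Println(pool.Join("t1", 100, true))
	fmt.Println(pool.Join("t2", 105, true))
	fmt.Println(pool.Join("reporting", 101, false))
	pool.Producers[0].Pos = 200
	pool.Producers[0].Consumers["t1"] = 195
	fmt.Println(pool.EvictStragglers())
}
```

A producer pool size limit is not modelled here; a fuller version would need a policy for what happens when a straggler needs its own producer and the pool is already full.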

Visibility

The following metrics should be available for monitoring, debugging and configuring this pool:

  • number of producers
  • max/min/average number of consumers per producer
  • min/max/average wait times for consumers
  • current positions of various producers
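The min/max/average figures above could be derived from per-producer samples along these lines. This is a minimal sketch; `Summary` and `Summarize` are hypothetical helpers, and how the samples are collected and exported is not covered.

```go
package main

import (
	"fmt"
	"time"
)

// Summary holds the min, max and average of one metric.
type Summary[T ~int | ~int64] struct {
	Min, Max T
	Avg      float64
}

// Summarize computes min/max/average over vals; an empty input yields zeros.
func Summarize[T ~int | ~int64](vals []T) Summary[T] {
	var s Summary[T]
	if len(vals) == 0 {
		return s
	}
	s.Min, s.Max = vals[0], vals[0]
	var sum float64
	for _, v := range vals {
		if v < s.Min {
			s.Min = v
		}
		if v > s.Max {
			s.Max = v
		}
		sum += float64(v)
	}
	s.Avg = sum / float64(len(vals))
	return s
}

func main() {
	// One entry per producer: how many consumers it currently serves.
	consumers := []int{3, 1, 2}
	// One entry per consumer: how long it last waited for an event.
	waits := []time.Duration{20 * time.Millisecond, 5 * time.Millisecond, 35 * time.Millisecond}

	c := Summarize(consumers)
	w := Summarize(waits)
	fmt.Printf("producers=%d consumers min=%d max=%d avg=%.1f\n", len(consumers), c.Min, c.Max, c.Avg)
	fmt.Printf("wait min=%v max=%v avg=%v\n", w.Min, w.Max, time.Duration(w.Avg))
}
```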

Configuration parameters

  • allowed lag, in terms of GTID distance, within which a consumer can join a producer
  • producer pool size

Notes

  • This is orthogonal to the new throttling mechanism in VReplication. Throttling slows down source streams based on the replication lag of the sources. Pooling reduces the load caused by multiple streams, and thus the amount of throttling experienced.

Implementation

<TODO>

rohit-nayak-ps, Feb 03 '21 13:02