pd
Alleviate the latency impact when reloading PD
Enhancement Task
Right now, reloading PD requires switching the PD leader at least twice. When that happens, PD's region cache becomes invalid, and the regions must be loaded from persistent storage. Leader information is not persisted, so it can only be rebuilt from region heartbeats. If the region synchronization process has not finished, the leader information is lost, and a GetRegion request may return an unexpected result until the heartbeats arrive. This eventually triggers the backoff mechanism, which increases latency.
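To make the failure mode concrete, here is a minimal Go sketch of the sequence described above. It is not PD's actual code: `Region`, `RegionCache`, `LoadFromStorage`, `HandleHeartbeat`, `GetRegionWithBackoff`, and `ErrNoLeader` are all hypothetical names, and modeling the "unexpected result" as an error is an assumption made for illustration.

```go
package main

import (
	"errors"
	"time"
)

// Region is a hypothetical, simplified region record.
// LeaderStoreID is zero while the leader is unknown.
type Region struct {
	ID            uint64
	LeaderStoreID uint64 // not persisted; learned only from heartbeats
}

// ErrNoLeader is how this sketch models the "unexpected result" (an assumption).
var ErrNoLeader = errors.New("region leader unknown")

// RegionCache models the in-memory region cache of a newly elected PD leader.
type RegionCache struct {
	regions map[uint64]*Region
}

// LoadFromStorage rebuilds the cache from persisted region metadata.
// Leaders are absent because they were never written to storage.
func LoadFromStorage(ids []uint64) *RegionCache {
	c := &RegionCache{regions: make(map[uint64]*Region)}
	for _, id := range ids {
		c.regions[id] = &Region{ID: id}
	}
	return c
}

// HandleHeartbeat records the leader reported by a region heartbeat.
func (c *RegionCache) HandleHeartbeat(regionID, leaderStoreID uint64) {
	if r, ok := c.regions[regionID]; ok {
		r.LeaderStoreID = leaderStoreID
	}
}

// GetRegion fails until a heartbeat has filled in the leader.
func (c *RegionCache) GetRegion(id uint64) (*Region, error) {
	r, ok := c.regions[id]
	if !ok || r.LeaderStoreID == 0 {
		return nil, ErrNoLeader
	}
	return r, nil
}

// GetRegionWithBackoff retries with exponential backoff, as a caller might.
// It returns the number of attempts made; each failed attempt adds a sleep,
// which is where the extra latency comes from. onRetry is a test hook that
// runs after every failed attempt.
func GetRegionWithBackoff(c *RegionCache, id uint64, maxAttempts int,
	base time.Duration, onRetry func(attempt int)) (*Region, int, error) {
	delay := base
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if r, err := c.GetRegion(id); err == nil {
			return r, attempt, nil
		}
		if onRetry != nil {
			onRetry(attempt)
		}
		time.Sleep(delay)
		delay *= 2 // double the wait before the next attempt
	}
	return nil, maxAttempts, ErrNoLeader
}
```

In this sketch, a lookup issued right after `LoadFromStorage` keeps failing and backing off until `HandleHeartbeat` has been called for that region. Under these assumptions, shortening the window between reload and the first heartbeat, or carrying leader information across the switch, would reduce the number of retries.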