br backup/restore retries for too long when there is an S3 outage

Open fubinzh opened this issue 3 years ago • 0 comments

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a recipe for reproducing the error.
  • Run br restore from S3.
  • While the restore is in progress, simulate an S3 outage.
  • Check how long br keeps retrying before it fails.
  2. What did you expect to see?
  • Retries should not last too long; 1 to 3 minutes of retrying should be enough, per the PM's suggestion.
  3. What did you see instead?
  • Currently br may keep retrying for about 8 to 9 minutes.
  4. What version of BR and TiDB/TiKV/PD are you using?

br: v5.1.0

  5. Operation logs

    • Please upload br.log for BR if possible
    • Please upload tidb-lightning.log for TiDB-Lightning if possible
    • Please upload tikv-importer.log from TiKV-Importer if possible
    • Other interesting logs
  6. Configuration of the cluster and the task

    • tidb-lightning.toml for TiDB-Lightning if possible
    • tikv-importer.toml for TiKV-Importer if possible
    • topology.yml if deployed by TiUP
  7. Screenshot/exported-PDF of Grafana dashboard or metrics' graph in Prometheus if possible
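
To make the expected behavior above concrete (retries capped at roughly 1 to 3 minutes in total), here is a minimal sketch of a retry loop with a total time budget. This is not BR's actual retry code; the names `retry_with_budget`, `RetryBudgetExceeded`, `budget_s`, `base_delay_s`, and `max_delay_s` are hypothetical, and the 180-second default simply reflects the upper end of the range suggested above.

```python
import time


class RetryBudgetExceeded(Exception):
    """Raised when the total retry budget is used up (hypothetical name)."""


def retry_with_budget(op, *, budget_s=180.0, base_delay_s=1.0, max_delay_s=30.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Call op() until it succeeds or budget_s seconds have elapsed.

    Assumption: transient storage failures surface as OSError (which
    includes ConnectionError and TimeoutError). Other exceptions propagate.
    """
    deadline = clock() + budget_s
    delay = base_delay_s
    attempts = 0
    while True:
        attempts += 1
        try:
            return op()
        except OSError as err:
            remaining = deadline - clock()
            if remaining <= 0:
                # The overall budget is spent: fail instead of retrying further.
                raise RetryBudgetExceeded(
                    f"gave up after {attempts} attempts"
                ) from err
            # Never sleep past the deadline, so total time stays within budget.
            sleep(min(delay, remaining))
            # Exponential backoff, capped so a single wait cannot grow unbounded.
            delay = min(delay * 2, max_delay_s)
```

The point of the sketch is that the bound is on total elapsed time rather than on the number of attempts, so the worst-case wait stays predictable regardless of how the per-attempt backoff is tuned.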

fubinzh, Jul 01 '21 07:07