dm
Support syncing when disk is very slow
Feature Request
Is your feature request related to a problem? Please describe:
A user deployed DM in an environment with a very slow disk. This helped reveal some DM bugs, such as:
- [ ] https://github.com/pingcap/dm/issues/1377
- [ ] a worker received a bound watch event, but failed to read the bound information from etcd and neither retried nor killed itself
- [ ] `query-status` shows nothing, while a task can't be added because of `already exists` (not enough information in the log)
Describe the feature you'd like:
- [ ] expose more etcd errors and metrics (already tracked in https://github.com/pingcap/dm/issues/1219, https://github.com/pingcap/dm/issues/1218), and warn when the disk is bad
- [ ] test DM on a slow disk
Describe alternatives you've considered:
Teachability, Documentation, Adoption, Migration Strategy:
I have used Chaos Mesh to try to imitate a bad-disk environment, but it was not effective at revealing bugs. We might try using failpoint to inject faults into etcd API calls with some percentage probability, and check whether that causes inconsistency in DM.
@zeminzhou
(removed the BUG label because we need to investigate further whether it has been fixed)