monitoring
Adjust the duration threshold for the alerts dependent on disk latency
Alerts that depend on disk latency are too sensitive for public cloud resources and keep triggering on every disk latency fluctuation. The following alerts should be adjusted:
- TiKV_async_request_write_duration_seconds
- TiKV_scheduler_command_duration_seconds
- TiKV_scheduler_latch_wait_duration_seconds
As per discussion with @tennix, we'll change the "for" clause from 1m to 5m for these alerts to prevent excessive false alarms in public cloud environments.
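For illustration, here is a rough sketch of how the change could be applied to a Prometheus rules file in bulk. It assumes the rules are written in the usual `- alert:` / `for:` layout with one key per line. The helper name `bump_for_clause` and the sample rules are hypothetical, and the `expr` values are placeholders rather than the real alert expressions.

```python
import re

# Alerts named in this issue whose "for" clause should change.
TARGET_ALERTS = {
    "TiKV_async_request_write_duration_seconds",
    "TiKV_scheduler_command_duration_seconds",
    "TiKV_scheduler_latch_wait_duration_seconds",
}

# Matches a rule header such as "  - alert: Name" and captures the name.
ALERT_RE = re.compile(r"^\s*-\s*alert:\s*(\S+)\s*$")
# Matches a "for: <duration>" line and captures everything before the value.
FOR_RE = re.compile(r"^(\s*for:\s*)\S+\s*$")


def bump_for_clause(rules_text, new_duration="5m"):
    """Return rules_text with the "for" value replaced for TARGET_ALERTS only."""
    current_alert = None
    out = []
    for line in rules_text.splitlines():
        header = ALERT_RE.match(line)
        if header:
            # Remember which alert the following lines belong to.
            current_alert = header.group(1)
        elif current_alert in TARGET_ALERTS:
            for_line = FOR_RE.match(line)
            if for_line:
                line = for_line.group(1) + new_duration
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sample: the expressions are placeholders, not the real rules.
SAMPLE = """\
groups:
- name: alert.rules
  rules:
  - alert: TiKV_scheduler_command_duration_seconds
    expr: placeholder_metric > 1
    for: 1m
  - alert: Some_other_alert
    expr: placeholder_metric > 2
    for: 1m
"""

if __name__ == "__main__":
    print(bump_for_clause(SAMPLE), end="")
```

Only the three listed alerts are touched, so other rules keep their existing duration. A line-based rewrite like this preserves comments and formatting, but it would miss rules written in a different YAML style, so the resulting diff should still be reviewed by hand.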