
Double check on degradations

Open • Alex-shved opened this issue 1 year ago • 1 comment

Is your feature request related to a problem?

We have the degradation check set to 2 seconds. Over the past few months, the number of false positives from Checkly has been rising. Quite often, the check triggers because the request takes more than 2 seconds due to delays in DNS and TCP. The DNS problem is probably caused by the test request coinciding with a reset of the DNS cache in AWS. If I'm not mistaken, it is their DNS that resolves names for requests from Checkly.

[Screenshot: DNS]

The TCP issue is probably related to dips in network performance. Sometimes a request from the checker to our service takes much longer than usual, about 2-4 seconds. This points to a problem on the network, not in the operation of the service itself.

[Screenshot: TCP]

I have not run into the issue with waiting at the start of the connection, but like the two described above, it belongs to "CONNECTION START".

These three points usually have nothing to do with how the tested service itself is working, and they distort the statistics.
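To make the distinction concrete, here is a minimal sketch of how a slow run could be split into "CONNECTION START" time and service time. The field names, the `service` phase, and the sample values are assumptions made for illustration; they are not taken from Checkly's output format.

```python
from dataclasses import dataclass

DEGRADED_MS = 2000.0  # the 2-second degradation limit from this issue


@dataclass
class Timings:
    """Per-phase timings of one check run in milliseconds (hypothetical field names)."""
    dns: float      # name resolution
    tcp: float      # TCP connection setup
    wait: float     # waiting at the start of the connection
    service: float  # everything after the connection is established (assumed phase)

    @property
    def connection_start(self) -> float:
        # The three phases the issue groups under "CONNECTION START".
        return self.dns + self.tcp + self.wait

    @property
    def total(self) -> float:
        return self.connection_start + self.service


def degraded_by_connection_only(t: Timings, limit_ms: float = DEGRADED_MS) -> bool:
    """True when the run exceeds the limit only because of connection-start time."""
    return t.total > limit_ms and t.service <= limit_ms


# Made-up sample runs: one slowed down by DNS, one by the service itself.
slow_dns = Timings(dns=1800.0, tcp=40.0, wait=10.0, service=300.0)
slow_service = Timings(dns=20.0, tcp=40.0, wait=10.0, service=2500.0)

print(degraded_by_connection_only(slow_dns))
print(degraded_by_connection_only(slow_service))
```

Both runs cross the 2-second limit, but only the second one says anything about the service.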

Describe the solution you'd like

Implement a re-check when a degradation is triggered, similar to the one already implemented for failures.
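A rough sketch of the behavior I have in mind is below. It is not Checkly's implementation: `classify`, `check_with_recheck`, the 5-second failure limit, and the single re-check are all assumptions chosen for the example. The only value from my setup is the 2-second degradation limit.

```python
from typing import Callable


def classify(elapsed_ms: float, degraded_ms: float, failed_ms: float) -> str:
    """Map a response time to a check state (hypothetical helper)."""
    if elapsed_ms > failed_ms:
        return "failed"
    if elapsed_ms > degraded_ms:
        return "degraded"
    return "passed"


def check_with_recheck(
    run_check: Callable[[], float],
    degraded_ms: float = 2000.0,  # degradation limit from this issue
    failed_ms: float = 5000.0,    # assumed failure limit, for illustration only
    max_rechecks: int = 1,        # assumed number of re-checks
) -> str:
    """Run the check and re-run it while the result is 'degraded'.

    Only the state of the last run is reported, so a one-off slow run
    (for example a slow DNS lookup) does not count as a degradation.
    Re-checking failures is outside this sketch.
    """
    state = classify(run_check(), degraded_ms, failed_ms)
    rechecks_left = max_rechecks
    while state == "degraded" and rechecks_left > 0:
        state = classify(run_check(), degraded_ms, failed_ms)
        rechecks_left -= 1
    return state


# Simulated response times in milliseconds instead of real requests.
transient = iter([2600.0, 450.0])     # slow once, fast on the re-check
persistent = iter([2600.0, 2800.0])   # slow both times

transient_result = check_with_recheck(lambda: next(transient))
persistent_result = check_with_recheck(lambda: next(persistent))
print(transient_result, persistent_result)
```

With this logic a single network hiccup is absorbed by the re-check, while a slowdown that persists across both runs is still reported as a degradation.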

Describe alternatives you've considered

As an alternative, I've tried deactivating the degradation alert level by setting the failed check's trigger time lower than the degradation's trigger time. This got rid of the problems associated with DNS and TCP, but now any slowdown in the service is recognized as a failure, which in turn also distorts the statistics and requires additional investigation of each failed check. I would like to have data on both degraded and failed checks.
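The effect of this workaround can be sketched as follows. The sketch assumes the failure limit is evaluated before the degradation limit, and the 1.5-second failure limit is a made-up value; neither is confirmed Checkly behavior.

```python
def classify(elapsed_ms: float, degraded_ms: float, failed_ms: float) -> str:
    """Map a response time to a check state (hypothetical helper).

    Assumption: the failure limit is evaluated first.
    """
    if elapsed_ms > failed_ms:
        return "failed"
    if elapsed_ms > degraded_ms:
        return "degraded"
    return "passed"


DEGRADED_MS = 2000.0  # degradation limit from this issue
FAILED_MS = 1500.0    # hypothetical failure limit set below the degradation limit

# Made-up response times: fast, moderately slow, and slow.
states = {ms: classify(ms, DEGRADED_MS, FAILED_MS) for ms in (1000.0, 1800.0, 2500.0)}
print(states)
```

No response time can land in the degraded state with these limits, which is why all data on degradations is lost.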

Alex-shved commented on May 24 '23 22:05

@Alex-shved sorry for the late reply!

  • We are looking at overhauling how you can alert and retry in the coming quarter, so this feedback is noted.
  • We just moved our public roadmap to https://feedback.checklyhq.com/ and are archiving this repo. If you want, you can resubmit it over at https://feedback.checklyhq.com/

tnolet commented on Jun 02 '23 08:06