nginx_upstream_check_module
Unexpected false negatives
I use the http check type to monitor the health of my upstreams. Sometimes the module wrongly marks an upstream as failed even though it receives 200 OK. Looking at a packet dump, I noticed the following:
- If a check is marked as failed (logged in error.log as "check time out with peer") even though it got 200 OK from the remote host, nginx sends an RST immediately after receiving the reply.
- If a check is marked as passed, the session is closed normally (with FIN/ACK).
- If I lower the number of nginx workers from auto (40) to 5-10, false negatives become very rare.
- If I raise the timeout from 2-3 seconds to 20-30 seconds, false negatives also become very rare.
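For reference, a minimal sketch of the kind of setup described above. The upstream name, server addresses, interval, rise/fall counts, and probe request are placeholder assumptions, not the actual configuration; only the http check type, the 2-second timeout, and the auto worker count come from the observations above.

```nginx
# Sketch only: names and values below are illustrative assumptions.
worker_processes auto;                  # resolves to 40 workers on the host in question

events {}

http {
    upstream backend {                  # hypothetical upstream name
        server 10.0.0.1:8080;           # placeholder addresses
        server 10.0.0.2:8080;

        # interval and timeout are in milliseconds, so timeout=2000 is the
        # 2-second value at which the false negatives show up.
        check interval=3000 rise=2 fall=3 timeout=2000 type=http;
        check_http_send "HEAD / HTTP/1.0\r\n\r\n";
        check_http_expect_alive http_2xx;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
        }
    }
}
```

With a layout like this, the two changes that made the problem rare would correspond to replacing auto with a fixed worker_processes value between 5 and 10, or raising timeout to 20000-30000.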
Does each nginx worker run its own checks for the upstream(s), or is there a single process that manages these checks?
OK, I will check this problem.