sonic-swss
[Mellanox] PFC watchdog long term solution to reduce false alarm
What I did
Adjust PFC watchdog detection algorithm to reduce false alarms.
In the old detection algorithm, the PFC watchdog was triggered if either of the following conditions was satisfied within a detection interval:
- There are packets accumulated in the queue && no packet is sent out of the queue && PFC frames are received on the queue
- There is no packet accumulated in the queue && PFC frames are received and block the queue for more than 80% of the detection interval
The new detection algorithm merges the two conditions into one. The PFC watchdog is triggered only if:
- There are packets accumulated in the queue && no packet is sent out of the queue && PFC frames are received and block the queue for more than 99% of the detection interval
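To make the difference between the two algorithms concrete, here is a minimal Python sketch of both predicates. It is only an illustration of the conditions listed above, not the code changed in this PR: the `QueueSample` type, its field names, and the function names are all hypothetical, and the sketch assumes the per-interval counter deltas are already available.

```python
from dataclasses import dataclass


@dataclass
class QueueSample:
    """Per-queue values for one detection interval (hypothetical names)."""
    occupancy: int       # packets accumulated in the queue
    packets_sent: int    # packets sent out of the queue during the interval
    pfc_frames_rx: int   # PFC frames received during the interval
    pause_time_us: int   # time the queue was blocked by PFC during the interval
    interval_us: int     # length of the detection interval


def paused_fraction(s: QueueSample) -> float:
    """Share of the detection interval during which PFC blocked the queue."""
    return s.pause_time_us / s.interval_us if s.interval_us > 0 else 0.0


def old_detect(s: QueueSample) -> bool:
    """Old algorithm: either of the two conditions triggers the watchdog."""
    stuck_with_backlog = (
        s.occupancy > 0 and s.packets_sent == 0 and s.pfc_frames_rx > 0
    )
    paused_while_empty = (
        s.occupancy == 0 and s.pfc_frames_rx > 0 and paused_fraction(s) > 0.80
    )
    return stuck_with_backlog or paused_while_empty


def new_detect(s: QueueSample) -> bool:
    """New algorithm: a single, stricter condition."""
    return (
        s.occupancy > 0
        and s.packets_sent == 0
        and s.pfc_frames_rx > 0
        and paused_fraction(s) > 0.99
    )


# A brief pause on a queue with backlog: old algorithm fires, new one does not.
brief_pause = QueueSample(occupancy=10, packets_sent=0, pfc_frames_rx=1,
                          pause_time_us=500, interval_us=200_000)
# A queue blocked for almost the whole interval: both algorithms fire.
blocked = QueueSample(occupancy=10, packets_sent=0, pfc_frames_rx=1000,
                      pause_time_us=199_900, interval_us=200_000)
```

With these made-up numbers, `old_detect(brief_pause)` is true while `new_detect(brief_pause)` is false, and both functions return true for `blocked`. The interval length and pause times are illustrative values only.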
Signed-off-by: Stephen Sun [email protected]
Why I did it
There are some rare scenarios in which the PFC watchdog can be falsely triggered:
- No PFC frames are received until the last moment of the polling period
- The polling interval is inaccurate because of an unexpected delay between counter polling in the previous polling interval and counter polling in the next polling interval
- The polling interval is inaccurate because of an unexpected delay between counter polling and the invocation of the Lua script within the same polling interval
The first two scenarios are addressed in this PR.
How I verified it
Ran the PFC watchdog regression test with background traffic.
Details if related