sonic-buildimage
sonic-buildimage copied to clipboard
PFCWD drop+ok counters are not counted correctly.
Description
Steps to reproduce the issue:
- Single ingress and egress.
- Enabled PFCWD config.
- Send pause frames to egress at high rate causing PFC storm.
- Send traffic to ingress.
- DUT should be dropping traffic.
- Ensure that PFCWD stats show that storm is detected.
- Stop the PFC storm.
- Ensure that PFCWD stats show that restored counter increments.
- Check the drop counter on PFCWD stats should increment and correspond to the drop packets.
Describe the results you received:
- The PFCWD drop counter does not increment at times and when it does, it does not correspond to the drop count.
Describe the results you expected:
- PFCWD stats should increment the drop packets counter corresponding to the packets dropped during the storm.
Output of show version:
#### Log output:
Before:
admin@ixre-egl-board72:~$ show pfcwd stats
QUEUE STATUS STORM DETECTED/RESTORED TX OK/DROP RX OK/DROP TX LAST OK/DROP RX LAST OK/DROP
------------ ----------- ------------------------- ------------ ------------ ----------------- -----------------
Ethernet24:3 operational 2/2 0/0 0/0 0/0 0/0
After:
admin@ixre-egl-board72:~$ show pfcwd stats
QUEUE STATUS STORM DETECTED/RESTORED TX OK/DROP RX OK/DROP TX LAST OK/DROP RX LAST OK/DROP
------------ ----------- ------------------------- ------------ ------------ ----------------- -----------------
Ethernet24:3 operational 3/3 0/0 0/0 0/0 0/0
Dropcounter as per queue counter:
admin@ixre-egl-board72:~$ sudo ip netns exec asic0 show queue counter --nonzero
------ curtailed output -------
Port TxQ Counter/pkts Counter/bytes Drop/pkts Drop/bytes
---------- ----- -------------- --------------- ----------- ------------
Ethernet24 UC3 126,789 127,803,312 73,211 73,796,688
Ethernet24 UC7 2 516 0 0
------ curtailed output -------
Tx and Rx packets received on interface:
admin@ixre-egl-board72:~$ show interface counter
Last cached time was 2024-05-23 21:44:22.787829
IFACE STATE RX_OK RX_BPS RX_UTIL RX_ERR RX_DRP RX_OVR TX_OK TX_BPS TX_UTIL TX_ERR TX_DRP TX_OVR
----------- ------- ------- ----------- --------- -------- -------- -------- ------- ----------- --------- -------- -------- --------
Ethernet0 U 100,000 6578.50 B/s 0.00% 0 0 0 3 5.17 B/s 0.00% 0 0 0
Ethernet8 U 100,000 6578.50 B/s 0.00% 0 0 0 3 5.17 B/s 0.00% 0 0 0
Ethernet16 U 0 0.00 B/s 0.00% 0 0 0 3 0.40 B/s 0.00% 0 0 0
Ethernet24 U 0 9.80 B/s 0.00% 0 0 0 126,792 8341.21 B/s 0.00% 0 0 0
(paste your output here or download and attach the file here )
Additional information you deem important (e.g. issue happens only occasionally):
This is nokia chassis. Issue not always reproducible. @amitpawar12 can you please update is this issue is still happening or not
The issue is seen with Nokia VOQ chassis (using MSFT 2205 branch).
The test was done with Snappi framework using IXIA tool.
Further update: Tested with single prio traffic sent from 2 ingresses and pause on the egress interface of the DUT.
- With 2 ingresses and 1 egress, if the configured line-rate for the lossless prio is more than 50%, then packets are dropped on the ingress interface itself and PFCWD counters do not increment at all.
- If the configured line-rate for the lossless prio per interface is less than 50%, then we do see the PFCWD drop counters incrementing. However, at the end of the test, drop counter do not match with drop counter reported by IXIA and by queue counters on DUT.
@amitpawar12 lets discuss this issue in Monday meeting.
@saksarav-nokia is checking more on this.
@vmittal-msft , This is what we found with our debugging:
Please find the attached files for two different runs of the PFC WD test with syslogs (added a syslog in commitCounters function in orchagent pfcwd code to print the hw counters), COUNTERS output from counters_db for egress port's queue which is stormed and also the show pfcwd stats and show queue counter output.
When the PFC storm is detected, the orchagent installs the ingress and egress Acl rules and start the doTask which calls commitCounters every 1 sec (COUNTER_CHECK_POLL_TIMEOUT_SEC). This functions calls getHwCounters which reads the Queue counters (SAI_QUEUE_STAT_DROPPED_BYTES & SAI_QUEUE_STAT_DROPPED_PACKETS) from SAI/HW (pls note, it reads from SAI and not from counter_db) and then in updateWdCounters, PFC_WD_QUEUE_STATS_TX_DROPPED_PACKETS & PFC_WD_QUEUE_STATS_TX_DROPPED_PACKETS_LAST are updated in counters_db with the values read from SAI. As soon as the storm is restored, the orchagent removes ingress and egress Acl rules and sets the handler function to null_ptr, so the commitCounters is not called anymore. But as per the log, the acl rule gets removed right after the last commitCounters is called. So not all the dropped packets are read from SAI/hw by last commitCounters and updated to PFC_WD_QUEUE_STATS_TX_DROPPED_PACKETS in the counters_db.
Remember the SAI_QUEUE_STAT_DROPPED_BYTES & SAI_QUEUE_STAT_DROPPED_PACKETS are also part of FLEX_COUNTER_TABLE:QUEUE_STAT_COUNTER in flex_counter_db with "POLL_INTERVAL": "10000". So the Queue counters get updated by the flex counter polling and hence the show queue counter shows the correct counter values. Also we can see the correct values in COUNTER_DB output for the stormed queue in the attached logs for both the runs.
The queue drop counter values in hardware, SAI_QUEUE_STAT_DROPPED_BYTES & SAI_QUEUE_STAT_DROPPED_PACKETS in counters_db and show queue counter output match to the loss reported by Ixia.
So this is due to orchagent not updating the PFC_WD_QUEUE_STATS_TX_DROPPED_PACKETS for short duration after the storm is restored . Since the show pfcwd stats uses PFC_WD_QUEUE_STATS_TX_DROPPED_PACKETS from counters_db, the drop counter value is less than the actual drop in hardware.
Let's discuss to see how we want to
pfc_wd_cntr_1st_run.txt
pfc_wd_cntr_2nd_run.txt
resolve this
Hi @vmittal, @amitpawar12 is this issue applicable to 202405 as well?
This issue may to seen on all sonic platform not specific to voq chassis. @vmittal-msft to follow-up
@vmittal-msft @arlakshm When PFCWD storm is restored, the function PfcWdSwOrch<DropHandler, ForwardHandler>::startWdActionOnQueue does the following entry->second.handler->commitCounters(); entry->second.handler = nullptr;
it calls (1) commitCounters which reads the counters from SAI and (2) sets the handler function pointer to nullptr which calls the destructor PfcWdAclHandler::~PfcWdAclHandler . The destructor removes the Acl Rules installed. Since the entry->second.handler is destructed, the commitCounters will not be called anymore. So the packets received from the time the last commitCounters was called and till the Acl rule is deleted, are not counted for show pfcwd commands. i don't see an easy fix to address this. Please let me know if you have any other suggestion.
1/8: Nokia updated that issue still there
@prsunny, @kperumalbfn this seems like a generic issue. Can you check
@developfast , can you repro it on T0/T1 topo and share the feedback here?
@developfast , could you update the findings for this issue?
@developfast, can you please lets us know if this issue is seen on T0/T1 as well?
No have not seen this issue on T0/T1 — checked on 7050CX3 and noticed drop counters almost matched (diff of maybe 10 packets) the tgen counters.
@vmittal-msft to follow-up.
@saksarav-nokia @amitpawar12, lets check this on 202405 release and see if we can find a fix for this. lets setup a time to check this.
Discussed this issue with @saksarav-nokia @amitpawar12. Requested some more data from experiments as code review is not pointing to an obvious problem.
- PFCWD stats is supposed to get counters for the duration PFCWD was active and not before or after.
- Queue stats are more comprehensive where in addition to PFCWD duration, it may count before/after packets as well.
So, we need to know if the traffic was continuous in this test or was stopped before PFCWD storm was lifted. This will help in determine if this is real problem or not. Sakthi/Amit will get back by early next week.
@vmittal-msft @arlakshm When PFCWD storm is restored, the function PfcWdSwOrch<DropHandler, ForwardHandler>::startWdActionOnQueue does the following entry->second.handler->commitCounters(); entry->second.handler = nullptr;
it calls (1) commitCounters which reads the counters from SAI and (2) sets the handler function pointer to nullptr which calls the destructor PfcWdAclHandler::~PfcWdAclHandler . The destructor removes the Acl Rules installed. Since the entry->second.handler is destructed, the commitCounters will not be called anymore. So the packets received from the time the last commitCounters was called and till the Acl rule is deleted, are not counted for show pfcwd commands. i don't see an easy fix to address this. Please let me know if you have any other suggestion.
Hi @vmittal-msft , i checked the code again and the reason for seeing the inconsistent drop counter values in "show int counter" and "show pfcwd stats" is same as what i mentioned in the quoted comment. When storm is detected, PfcWdSwOrch<DropHandler, ForwardHandler>::startWdActionOnQueue creates the handler with
entry->second.handler = make_shared<DropHandler> which is nothing but PfcWdAclHandler.
When the storm is restored, the following are done
entry->second.handler->commitCounters(); -----> reads the counters from SAI/hardware and updates pfcwd stats table.
entry->second.handler = nullptr; ---> This invokes the destructor PfcWdAclHandler which removes the Acl rules.
So the Acl rule is removed after the counters are read and updated to PFC WD stats table and also the handler is set to nullptr.
So the stats are not updated to PFCWD stats table anymore after the handler is set to nullptr. However since the Acl rules are removed after that, the packets dropped during that period( the counters are read before handler is set to null and Acl rule is removed ), they are not counted towards the PFWD stats . Since the flex counter manager keeps reading interface counters /queue counters and updates the counter_db, the show int counter shows the drop counters correctly.
This was tested with four possible scenarios:
- Traffic starts, pause starts <-> traffic stops, pause stops.
- Pause starts, traffic starts <-> traffic stops, pause stops.
- Traffic starts, pause starts <-> pause stops, traffic stops.
- Pause starts, traffic starts <-> pause stops, traffic stops.
This issue is SEEN only when pause stops before the traffic stops. The interface and "show queue counters" update the drop counters but "show pfcwd stats" has discrepancy for the drops parameter.
Attached results.
Thanks for help. Please update title from "PFCWD ok+drop" to PFCWD drop as it is impacting drop counters only. Also, will keep it as low priority since we have other commands which can help check the actual drop caused by PFCWD storm time.
Discussed potential solution of this issue with Sakthi. @saksarav-nokia @amitpawar12 will get back.
Discussed potential solution of this issue with Sakthi. @saksarav-nokia @amitpawar12 will get back.
We tried the fix and did not help. We will debug further
@vmittal-msft , we debugged the issue further and here is what we found:
The PfcWdSwOrch polls the PFCWD Queue drop counters every COUNTER_CHECK_POLL_TIMEOUT_SEC ( 1sec) by calling commitCounters which reads from SAI. The SAI reads from SDK software cache and not directly from hardware.
But in BCM DNX SAI, the Egress Queue counters are CRPS counters , not mac counters. So the crps counters are not polled by bcmCNTR thread which is controlled by the soc variable bcm_stat_interval. The crps counters are polled by crps polling and SAI sets this to 250us and since there are 16K crps counters supported, it takes atleast 2 secs to update the Queue counters from hardware to SDK software crps database.
We tried reducing this polling interval to 100us and this updates the software cache with in 1 secs and every time orchagent pfcwd reads the Queue counters via commitCounters, it gets the updated counter value. Reducing this to 100us might have impact on cpu usage.
However even with your new suggested fix (calling commitCounters in PfcWdAclHandler destructor), orchagent pfcwd reads the counters from SAI immediately after Acl is removed, but the crps polling did not update the counters from hardware to software cache yet. So unless we read the counters atleast sec or two after the Acl rule is removed, we will not get the correct counters in the PFC WD stats table.
"show int counter" shows the correct value and this is because the flex counter manager polls these queue counter every 10sec and the next flex counter polling after the Acl removed will get the correct values.
Let me know if you want to discuss this
Thanks for your investigation @saksarav-nokia . As difference in interface counter Vs pfcwd counter is not too high, it seems there is not much value in fixing this issue. Also, the fix is complicated as well which can cause other issues. And we have an alternate way of checking drop counts due to PFCWD event via interface counter. I suggest we document this behavior and close this issue for now.
Closing this issue based on agreement with @vmittal-msft .
Thanks.
