sonic-buildimage icon indicating copy to clipboard operation
sonic-buildimage copied to clipboard

PFCWD drop+ok counters are not counted correctly.

Open amitpawar12 opened this issue 1 year ago • 5 comments

Description

Steps to reproduce the issue:

  • Single ingress and egress.
  • Enabled PFCWD config.
  • Send pause frames to egress at high rate causing PFC storm.
  • Send traffic to ingress.
  • DUT should be dropping traffic.
  • Ensure that PFCWD stats show that storm is detected.
  • Stop the PFC storm.
  • Ensure that PFCWD stats show that restored counter increments.
  • Check the drop counter on PFCWD stats should increment and correspond to the drop packets.

Describe the results you received:

  • The PFCWD drop counter does not increment at times and when it does, it does not correspond to the drop count.

Describe the results you expected:

  • PFCWD stats should increment the drop packets counter corresponding to the packets dropped during the storm.

Output of show version:

#### Log output:
Before:
admin@ixre-egl-board72:~$ show pfcwd stats
       QUEUE       STATUS    STORM DETECTED/RESTORED    TX OK/DROP    RX OK/DROP    TX LAST OK/DROP    RX LAST OK/DROP
------------  -----------  -------------------------  ------------  ------------  -----------------  -----------------
Ethernet24:3  operational                        2/2           0/0           0/0                0/0                0/0

After:
admin@ixre-egl-board72:~$ show pfcwd stats
       QUEUE       STATUS    STORM DETECTED/RESTORED    TX OK/DROP    RX OK/DROP    TX LAST OK/DROP    RX LAST OK/DROP
------------  -----------  -------------------------  ------------  ------------  -----------------  -----------------
Ethernet24:3  operational                        3/3           0/0           0/0                0/0                0/0

Dropcounter as per queue counter:
admin@ixre-egl-board72:~$ sudo ip netns exec asic0 show queue counter --nonzero
------ curtailed output -------
      Port    TxQ    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes
----------  -----  --------------  ---------------  -----------  ------------
Ethernet24    UC3         126,789      127,803,312       73,211    73,796,688
Ethernet24    UC7               2              516            0             0
------ curtailed output -------

Tx and Rx packets received on interface:

admin@ixre-egl-board72:~$ show interface counter
Last cached time was 2024-05-23 21:44:22.787829
      IFACE    STATE    RX_OK       RX_BPS    RX_UTIL    RX_ERR    RX_DRP    RX_OVR    TX_OK       TX_BPS    TX_UTIL    TX_ERR    TX_DRP    TX_OVR
-----------  -------  -------  -----------  ---------  --------  --------  --------  -------  -----------  ---------  --------  --------  --------
  Ethernet0        U  100,000  6578.50 B/s      0.00%         0         0         0        3     5.17 B/s      0.00%         0         0         0
  Ethernet8        U  100,000  6578.50 B/s      0.00%         0         0         0        3     5.17 B/s      0.00%         0         0         0
 Ethernet16        U        0     0.00 B/s      0.00%         0         0         0        3     0.40 B/s      0.00%         0         0         0
 Ethernet24        U        0     9.80 B/s      0.00%         0         0         0  126,792  8341.21 B/s      0.00%         0         0         0

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

amitpawar12 avatar May 24 '24 13:05 amitpawar12

This is nokia chassis. Issue not always reproducible. @amitpawar12 can you please update is this issue is still happening or not

arlakshm avatar Jun 05 '24 15:06 arlakshm

The issue is seen with Nokia VOQ chassis (using MSFT 2205 branch).

The test was done with Snappi framework using IXIA tool.

Further update: Tested with single prio traffic sent from 2 ingresses and pause on the egress interface of the DUT.

  1. With 2 ingresses and 1 egress, if the configured line-rate for the lossless prio is more than 50%, then packets are dropped on the ingress interface itself and PFCWD counters do not increment at all.
  2. If the configured line-rate for the lossless prio per interface is less than 50%, then we do see the PFCWD drop counters incrementing. However, at the end of the test, drop counter do not match with drop counter reported by IXIA and by queue counters on DUT.

amitpawar12 avatar Jun 05 '24 15:06 amitpawar12

@amitpawar12 lets discuss this issue in Monday meeting.

vmittal-msft avatar Aug 02 '24 17:08 vmittal-msft

@saksarav-nokia is checking more on this.

abdosi avatar Aug 28 '24 17:08 abdosi

@vmittal-msft , This is what we found with our debugging:

Please find the attached files for two different runs of the PFC WD test with syslogs (added a syslog in commitCounters function in orchagent pfcwd code to print the hw counters), COUNTERS output from counters_db for egress port's queue which is stormed and also the show pfcwd stats and show queue counter output.

When the PFC storm is detected, the orchagent installs the ingress and egress Acl rules and start the doTask which calls commitCounters every 1 sec (COUNTER_CHECK_POLL_TIMEOUT_SEC). This functions calls getHwCounters which reads the Queue counters (SAI_QUEUE_STAT_DROPPED_BYTES & SAI_QUEUE_STAT_DROPPED_PACKETS) from SAI/HW (pls note, it reads from SAI and not from counter_db) and then in updateWdCounters, PFC_WD_QUEUE_STATS_TX_DROPPED_PACKETS & PFC_WD_QUEUE_STATS_TX_DROPPED_PACKETS_LAST are updated in counters_db with the values read from SAI. As soon as the storm is restored, the orchagent removes ingress and egress Acl rules and sets the handler function to null_ptr, so the commitCounters is not called anymore. But as per the log, the acl rule gets removed right after the last commitCounters is called. So not all the dropped packets are read from SAI/hw by last commitCounters and updated to PFC_WD_QUEUE_STATS_TX_DROPPED_PACKETS in the counters_db.

Remember the SAI_QUEUE_STAT_DROPPED_BYTES & SAI_QUEUE_STAT_DROPPED_PACKETS are also part of FLEX_COUNTER_TABLE:QUEUE_STAT_COUNTER in flex_counter_db with "POLL_INTERVAL": "10000". So the Queue counters get updated by the flex counter polling and hence the show queue counter shows the correct counter values. Also we can see the correct values in COUNTER_DB output for the stormed queue in the attached logs for both the runs. The queue drop counter values in hardware, SAI_QUEUE_STAT_DROPPED_BYTES & SAI_QUEUE_STAT_DROPPED_PACKETS in counters_db and show queue counter output match to the loss reported by Ixia.
So this is due to orchagent not updating the PFC_WD_QUEUE_STATS_TX_DROPPED_PACKETS for short duration after the storm is restored . Since the show pfcwd stats uses PFC_WD_QUEUE_STATS_TX_DROPPED_PACKETS from counters_db, the drop counter value is less than the actual drop in hardware. Let's discuss to see how we want to pfc_wd_cntr_1st_run.txt pfc_wd_cntr_2nd_run.txt resolve this

saksarav-nokia avatar Sep 03 '24 16:09 saksarav-nokia

Hi @vmittal, @amitpawar12 is this issue applicable to 202405 as well?

arlakshm avatar Oct 25 '24 21:10 arlakshm

This issue may to seen on all sonic platform not specific to voq chassis. @vmittal-msft to follow-up

arlakshm avatar Nov 13 '24 18:11 arlakshm

@vmittal-msft @arlakshm When PFCWD storm is restored, the function PfcWdSwOrch<DropHandler, ForwardHandler>::startWdActionOnQueue does the following entry->second.handler->commitCounters(); entry->second.handler = nullptr;

it calls (1) commitCounters which reads the counters from SAI and (2) sets the handler function pointer to nullptr which calls the destructor PfcWdAclHandler::~PfcWdAclHandler . The destructor removes the Acl Rules installed. Since the entry->second.handler is destructed, the commitCounters will not be called anymore. So the packets received from the time the last commitCounters was called and till the Acl rule is deleted, are not counted for show pfcwd commands. i don't see an easy fix to address this. Please let me know if you have any other suggestion.

saksarav-nokia avatar Dec 10 '24 20:12 saksarav-nokia

1/8: Nokia updated that issue still there

rlhui avatar Jan 08 '25 18:01 rlhui

@prsunny, @kperumalbfn this seems like a generic issue. Can you check

arlakshm avatar Mar 26 '25 17:03 arlakshm

@developfast , can you repro it on T0/T1 topo and share the feedback here?

prsunny avatar Mar 27 '25 20:03 prsunny

@developfast , could you update the findings for this issue?

prsunny avatar Apr 03 '25 20:04 prsunny

@developfast, can you please lets us know if this issue is seen on T0/T1 as well?

arlakshm avatar Apr 23 '25 17:04 arlakshm

No have not seen this issue on T0/T1 — checked on 7050CX3 and noticed drop counters almost matched (diff of maybe 10 packets) the tgen counters.

developfast avatar Apr 23 '25 17:04 developfast

@vmittal-msft to follow-up.

abdosi avatar May 21 '25 17:05 abdosi

@saksarav-nokia @amitpawar12, lets check this on 202405 release and see if we can find a fix for this. lets setup a time to check this.

vmittal-msft avatar Jun 17 '25 17:06 vmittal-msft

Discussed this issue with @saksarav-nokia @amitpawar12. Requested some more data from experiments as code review is not pointing to an obvious problem.

  1. PFCWD stats is supposed to get counters for the duration PFCWD was active and not before or after.
  2. Queue stats are more comprehensive where in addition to PFCWD duration, it may count before/after packets as well.

So, we need to know if the traffic was continuous in this test or was stopped before PFCWD storm was lifted. This will help in determine if this is real problem or not. Sakthi/Amit will get back by early next week.

vmittal-msft avatar Jul 10 '25 19:07 vmittal-msft

@vmittal-msft @arlakshm When PFCWD storm is restored, the function PfcWdSwOrch<DropHandler, ForwardHandler>::startWdActionOnQueue does the following entry->second.handler->commitCounters(); entry->second.handler = nullptr;

it calls (1) commitCounters which reads the counters from SAI and (2) sets the handler function pointer to nullptr which calls the destructor PfcWdAclHandler::~PfcWdAclHandler . The destructor removes the Acl Rules installed. Since the entry->second.handler is destructed, the commitCounters will not be called anymore. So the packets received from the time the last commitCounters was called and till the Acl rule is deleted, are not counted for show pfcwd commands. i don't see an easy fix to address this. Please let me know if you have any other suggestion.

Hi @vmittal-msft , i checked the code again and the reason for seeing the inconsistent drop counter values in "show int counter" and "show pfcwd stats" is same as what i mentioned in the quoted comment. When storm is detected, PfcWdSwOrch<DropHandler, ForwardHandler>::startWdActionOnQueue creates the handler with entry->second.handler = make_shared<DropHandler> which is nothing but PfcWdAclHandler. When the storm is restored, the following are done
entry->second.handler->commitCounters(); -----> reads the counters from SAI/hardware and updates pfcwd stats table. entry->second.handler = nullptr; ---> This invokes the destructor PfcWdAclHandler which removes the Acl rules.
So the Acl rule is removed after the counters are read and updated to PFC WD stats table and also the handler is set to nullptr. So the stats are not updated to PFCWD stats table anymore after the handler is set to nullptr. However since the Acl rules are removed after that, the packets dropped during that period( the counters are read before handler is set to null and Acl rule is removed ), they are not counted towards the PFWD stats . Since the flex counter manager keeps reading interface counters /queue counters and updates the counter_db, the show int counter shows the drop counters correctly.

saksarav-nokia avatar Jul 14 '25 20:07 saksarav-nokia

This was tested with four possible scenarios:

  • Traffic starts, pause starts <-> traffic stops, pause stops.
  • Pause starts, traffic starts <-> traffic stops, pause stops.
  • Traffic starts, pause starts <-> pause stops, traffic stops.
  • Pause starts, traffic starts <-> pause stops, traffic stops.

This issue is SEEN only when pause stops before the traffic stops. The interface and "show queue counters" update the drop counters but "show pfcwd stats" has discrepancy for the drops parameter.

Attached results.

PFCWD_stats.docx

amitpawar12 avatar Jul 14 '25 20:07 amitpawar12

Thanks for help. Please update title from "PFCWD ok+drop" to PFCWD drop as it is impacting drop counters only. Also, will keep it as low priority since we have other commands which can help check the actual drop caused by PFCWD storm time.

vmittal-msft avatar Jul 14 '25 22:07 vmittal-msft

Discussed potential solution of this issue with Sakthi. @saksarav-nokia @amitpawar12 will get back. Image

vmittal-msft avatar Jul 15 '25 17:07 vmittal-msft

Discussed potential solution of this issue with Sakthi. @saksarav-nokia @amitpawar12 will get back. Image

We tried the fix and did not help. We will debug further

saksarav-nokia avatar Jul 21 '25 13:07 saksarav-nokia

@vmittal-msft , we debugged the issue further and here is what we found:

The PfcWdSwOrch polls the PFCWD Queue drop counters every COUNTER_CHECK_POLL_TIMEOUT_SEC ( 1sec) by calling commitCounters which reads from SAI. The SAI reads from SDK software cache and not directly from hardware.

But in BCM DNX SAI, the Egress Queue counters are CRPS counters , not mac counters. So the crps counters are not polled by bcmCNTR thread which is controlled by the soc variable bcm_stat_interval. The crps counters are polled by crps polling and SAI sets this to 250us and since there are 16K crps counters supported, it takes atleast 2 secs to update the Queue counters from hardware to SDK software crps database.

We tried reducing this polling interval to 100us and this updates the software cache with in 1 secs and every time orchagent pfcwd reads the Queue counters via commitCounters, it gets the updated counter value. Reducing this to 100us might have impact on cpu usage.

However even with your new suggested fix (calling commitCounters in PfcWdAclHandler destructor), orchagent pfcwd reads the counters from SAI immediately after Acl is removed, but the crps polling did not update the counters from hardware to software cache yet. So unless we read the counters atleast sec or two after the Acl rule is removed, we will not get the correct counters in the PFC WD stats table.

"show int counter" shows the correct value and this is because the flex counter manager polls these queue counter every 10sec and the next flex counter polling after the Acl removed will get the correct values.

Let me know if you want to discuss this

saksarav-nokia avatar Jul 23 '25 20:07 saksarav-nokia

Thanks for your investigation @saksarav-nokia . As difference in interface counter Vs pfcwd counter is not too high, it seems there is not much value in fixing this issue. Also, the fix is complicated as well which can cause other issues. And we have an alternate way of checking drop counts due to PFCWD event via interface counter. I suggest we document this behavior and close this issue for now.

vmittal-msft avatar Jul 24 '25 21:07 vmittal-msft

Closing this issue based on agreement with @vmittal-msft .

Thanks.

amitpawar12 avatar Jul 30 '25 17:07 amitpawar12