sonic-swss
sonic-swss copied to clipboard
[PFCWD + Asym PFC]: Allow PFCWD to detect PFC storm on all priorities when Asym PFC is enabled
Signed-off-by: Volodymyr Samotiy [email protected]
What I did Add possibility for PFCWD to detect PFC storm on all priorities when asymmetric PFC is enabled.
According to requirements PFCWD should detect PFC storm on the lossy priorities for the port on which asymmetric PFC is enabled. (when asymmetric PFC is enabled on some port that means such port is enabled for receiving PFC frames on all priorities)
Why I did it To fix issue https://github.com/Azure/sonic-swss/issues/1128.
How I verified it
- Enable asymmetric PFC on test port.
pfc config asymmetric on <port>
- Enable PFCWD on the test port.
pfcwd start --action drop --restoration-time 3000 <port> 1000
pfcwd interval 500
- Run
pause
frames to the test port for all priorities. - Verify that PFCWD detected storm for all priorities.
pfcwd show stats
Details if related N/A
retest vs please
retest this please
retest this please
retest this please
retest this please
retest this please
retest this please
@volodymyrsamotiy please handle conflicts
@volodymyrsamotiy please handle conflicts
handled
@neethajohn kindly reminder. this PR should go to 202012 as well.
@neethajohn Kindly Reminder, we are waiting for the merge of this PR. Thanks.
@neethajohn - Who can merge this PR ? Thanks.
This pull request fixes 1 alert when merging 53cf04ee65bfee395c876fca3459a6f066a961ce into c37cc1c56336d242fdcc3e728f8e86be992dd382 - view on LGTM.com
fixed alerts:
- 1 for Unused import
Please add unit tests
@neethajohn done. Please review
This pull request fixes 1 alert when merging a946c21d481797b23683acfc84e969cef6e57841 into 15a014bf666916a253053f9419522e6b8e8e20f3 - view on LGTM.com
fixed alerts:
- 1 for Unused import
@neethajohn kindly reminder to approve this PR. once approved we will issue a new one against 202012 as it is not cleanly cherry picked.
@volodymyrsamotiy , I think these unit tests are not sufficient to ensure that pfcwd feature is working correctly for asym mode and in combined mode. I am concerned that there might be scenarios we miss and end up causing regression. Can we add unit tests for the following flow in both combined and asym mode?
- start pfcwd
- verify the correct set of counters are added in flex counter for polling
- in asic db, verify that the pfc mask is getting disabled for the right queue/queues in storm
- stop pfcwd, in asic db, verify that pfc mask is enabled for the right queue
Also are the current pfcwd and asym pfc testcases in sonic-mgmt passing with this change?
@neethajohn, I agree that unit test checks only simple logic but that is because we are pretty limited on virtual switch. Currently we don't have any VS test for PFCWD and it was never checked on VS so maybe even some additional VS infra will required. A while ago I already tried it on virtual switch and PFCWD logic is not fully functional on it, so test checks everything we currently can check. In any case I will think about it more and check what else we can do. Regarding "sonic-mgmt" test, I will first fix latest comments for the "tx mask", then verify and let you know.
@volodymyrsamotiy, No vs test for pfcwd is making it difficult to ensure that there are no regressions for any changes/features that are getting added in this area. As far as I can think of, to verify the flow that I mentioned earlier, we need to set the 'pfc_enable' in PORT_QOS_MAP, enable flex counter for pfcwd, create pfcwd table for the port under test with the detection/restoration intervals. For mocking the storm, there is 'DEBUG_STORM' attr which can be set for the queue that needs to be in storm in COUNTERS_DB.
This pull request fixes 1 alert when merging 48375340849e8fb97bdb0389f0965cc47a2a8d8c into 8cf355db09fc2ffdadcea936c349301193a760da - view on LGTM.com
fixed alerts:
- 1 for Unused import
This pull request fixes 1 alert when merging 9586b6f8395ae1391d05362db9830b67b2f6525e into 8cf355db09fc2ffdadcea936c349301193a760da - view on LGTM.com
fixed alerts:
- 1 for Unused import
@neethajohn do you have additional comments or you can sign off?
@neethajohn do you have additional comments or you can sign off?
@liat-grozovik , I don't see my comments regarding vs tests addressed. In the last review I did, there were issues in the logic that could have been caught by the vs test flow I had suggested. Also I didn't hear back on whether the current pfcwd and asym pfc testcases in sonic-mgmt are passing with this change.
@neethajohn, please find summary below, just to have it in one place since now it is hard to find all the details:
- Current tests in sonic-mgmt are passing with these changes.
- As I mentioned previously PFCWD is not fully functional on the VS, so what you suggested currently is not working. I am aware of a possibility to start DEBUG storm via Redis DB but it doesn't help. And I tried the same approach as was suggested long time ago and double checked recently but storm cannot be simulated on VS like that. The reason for that is that even if you set DEBUG storm to true via Redis DB it is still handled by LUA script which then notifies orhcagent about the storm: e.g. https://github.com/Azure/sonic-swss/blob/master/orchagent/pfc_detect_mellanox.lua#L58. And as we know all the LUA scripts are vendor specific and started by orchagent based on the platform type. For the VS platform we don't have neither script nor handler in orchagent, so LUA is not started and no one handles DEBUG storm. I also had an idea to use "fake_platform" option in a test in order to run VS as faked real platform so script can be started, but it didn't help as well. I believe it is not simple to do, having PFCWD tests on VS will require more investigation, designing test infra and then implementing some additional changes. Which I think should be taken as a separate task. Please let me know what you think about it.
@volodymyrsamotiy , vs test infra support added in #2077. I believe that should make it possible to add unit tests in this area
@neethajohn any further comments or we can close this one?
@neethajohn following an offline discussion regarding tests, we agree that what we have the tests based on the existing infra and if extending the infra is required for pfc wd it should be discussed regardless of this PR. There is a sonic-mgmt test to cover this issue. As of that, could you please review and let us know if there are any additional outstanding issues that this PR should address?
This pull request fixes 1 alert when merging 1e001f57eaf31c9630095c3d17d5efc35fbec3a1 into a4c80c3d8713d32b73d9428683b8670b46d65fcd - view on LGTM.com
fixed alerts:
- 1 for Unused import
This pull request fixes 1 alert when merging d710235a3a8f51291bc4c1726f3b421ed0ac3a86 into a4c80c3d8713d32b73d9428683b8670b46d65fcd - view on LGTM.com
fixed alerts:
- 1 for Unused import
/azpw run
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).