sonic-mgmt
sonic-mgmt copied to clipboard
Leverage the fanout's ASIC to send PFC storm
Currently, the PFC storm is sent using packet socket on the host CPU, which causes some drawbacks
- There is strict timing requirement, which can not be guaranteed
- The more ports under PFC storm, the higher probability that the timing requirement is not satisfied
We will leverage the fanout's ASIC to send PFC storm
- Use SDK API to enable lossless traffic, set xoff to 0, and setup mappings for the PGs and the corresponding IEEE priorities
After the change, the pfc related test case which is using onxy as fanout will send the pfc frame with ASIC instead of using the CPU. New docker storm docker need to be build with the Docker file provided in this PR.
Description of PR
Summary: Fixes # (issue)
Type of change
- [ ] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [x] Test case(new/improvement)
Back port request
- [ ] 201911
- [ ] 202012
- [ ] 202205
- [x] 202305
- [x] 202311
- [x] 202405
Approach
What is the motivation for this PR?
To make the pfc related test case much more stable.
How did you do it?
Change sending pfc frame with CPU to ASIC on the onxy fanout.
How did you verify/test it?
Run the pfc related test case on the setup which has onxy fanout, test case could pass stablly.
Any platform specific information?
Supported testbed topology if it's a new test case?
Documentation
The pre-commit check detected issues in the files touched by this pull request. The pre-commit check is a mandatory check, please fix detected issues.
Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:13:1: F403 'from python_sdk_api.sx_api import *' used; unable to detect undefined names
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:18:1: E302 expected 2 blank lines, found 1
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:21:1: E302 expected 2 blank lines, found 1
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:32:1: E302 expected 2 blank lines, found 1
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:33:10: F405 'sx_api_port_pfc_enable_set' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:34:14: F405 'SX_STATUS_SUCCESS' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:38:17: F405 'new_sx_cos_priority_t_arr' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:39:15: F405 'new_sx_cos_ieee_prio_t_arr' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:40:18: F405 'new_sx_cos_port_prio_buff_t_p' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:42:1: E302 expected 2 blank lines, found 1
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:43:5: F405 'sx_cos_priority_t_arr_setitem' may be undefined, or defined from star imports: python_sdk_api.sx_api
...
[truncated extra lines, please run pre-commit locally to view full check results]
To run the pre-commit checks locally, you can follow below steps:
- Ensure that default python is python3. In sonic-mgmt docker container, default python is python2. You can run the check by activating the python3 virtual environment in sonic-mgmt docker container or outside of sonic-mgmt docker container.
- Ensure that the
pre-commitpackage is installed:
sudo pip install pre-commit
- Go to repository root folder
- Install the pre-commit hooks:
pre-commit install
- Use pre-commit to check staged file:
pre-commit
- Alternatively, you can check committed files using:
pre-commit run --from-ref <commit_id> --to-ref <commit_id>
The pre-commit check is mandatory. Can you help fix it?
@bingwang-ms This seems to be a big PR. FYI.
/azpw run
/AzurePipelines run
Azure Pipelines successfully started running 1 pipeline(s).
@kperumalbfn Can you please help review this PR?
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
The new PFCWD logic requires queue not empty. https://github.com/sonic-net/sonic-swss/pull/3036 How is that condition satisfied?
The new PFCWD logic requires queue not empty. sonic-net/sonic-swss#3036 How is that condition satisfied?
Hi @bingwang-ms the logic is implemented in https://github.com/sonic-net/sonic-mgmt/pull/12733. If run the test case with the PR #12733 only, and not include this PR, you will have the "queue not empty" requirement met, but due to the pfc frame is sent by the CPU, the traffic is not stable, and espicially for test_pfcwd_timer_accuracy, the test could pass with very low possiblity.
The new PFCWD logic requires queue not empty. sonic-net/sonic-swss#3036 How is that condition satisfied?
Hi @bingwang-ms the logic is implemented in #12733. If run the test case with the PR #12733 only, and not include this PR, you will have the "queue not empty" requirement met, but due to the pfc frame is sent by the CPU, the traffic is not stable, and espicially for test_pfcwd_timer_accuracy, the test could pass with very low possiblity.
I see. Thanks for the clarification. Then it makes sense to me. I missed #12733 as there is no cherry-pick request. I just added the cherry-pick label for it.
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well?
Related PR for Non-Onyx leaf-fanout https://github.com/sonic-net/sonic-mgmt/pull/13768
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well?
I see you found the answer for nononxy one: "Related PR for Non-Onyx leaf-fanout https://github.com/sonic-net/sonic-mgmt/pull/13768"
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well? I see you have found the answer that "Related PR for Non-Onyx leaf-fanout https://github.com/sonic-net/sonic-mgmt/pull/13768"
What do you mean this" I didn't see code for building and importing the new container. Is that done manually?" In the Code, we do have the code to build the container.
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well?
I see you found the answer for nononxy one: "Related PR for Non-Onyx leaf-fanout #13768"
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well? I see you have found the answer that "Related PR for Non-Onyx leaf-fanout #13768"
What do you mean this" I didn't see code for building and importing the new container. Is that done manually?" In the Code, we do have the code to build the container.
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well?
I see you found the answer for nononxy one: "Related PR for Non-Onyx leaf-fanout #13768"
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well? I see you have found the answer that "Related PR for Non-Onyx leaf-fanout #13768"
What do you mean this" I didn't see code for building and importing the new container. Is that done manually?" In the Code, we do have the code to build the container. The container has the same name as prev one. so nonthing changed when using it.
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well?
I see you found the answer for nononxy one: "Related PR for Non-Onyx leaf-fanout #13768"
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well? I see you have found the answer that "Related PR for Non-Onyx leaf-fanout #13768"
What do you mean this" I didn't see code for building and importing the new container. Is that done manually?" In the Code, we do have the code to build the container.
I see. So the code for building the docker is not changed right? Can you please point me the code for building the docker?
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well?
I see you found the answer for nononxy one: "Related PR for Non-Onyx leaf-fanout #13768"
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well? I see you have found the answer that "Related PR for Non-Onyx leaf-fanout #13768"
What do you mean this" I didn't see code for building and importing the new container. Is that done manually?" In the Code, we do have the code to build the container.
I see. So the code for building the docker is not changed right? Can you please point me the code for building the docker?
Hi @bingwang-ms In this PR, you can find ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/Makefile, this is a new file for build the new docker. but the docker name that build is same as prev
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well?
I see you found the answer for nononxy one: "Related PR for Non-Onyx leaf-fanout #13768"
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well? I see you have found the answer that "Related PR for Non-Onyx leaf-fanout #13768"
What do you mean this" I didn't see code for building and importing the new container. Is that done manually?" In the Code, we do have the code to build the container.
I see. So the code for building the docker is not changed right? Can you please point me the code for building the docker?
Hi @bingwang-ms In this PR, you can find ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/Makefile, this is a new file for build the new docker. but the docker name that build is same as prev
Yes, I saw the makefile. But can you point me the code for building the new docker image? I didn't see that code in this PR.
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well?
I see you found the answer for nononxy one: "Related PR for Non-Onyx leaf-fanout #13768"
@nhe-NV Does this change require the leaf-fanout switch running Onyx? If that's the case, the PR may not help us because we are not using Onyx as leaf-fanout OS. BTW, I didn't see code for building and importing the new container. Is that done manually?
Can you please check this question as well? I see you have found the answer that "Related PR for Non-Onyx leaf-fanout #13768"
What do you mean this" I didn't see code for building and importing the new container. Is that done manually?" In the Code, we do have the code to build the container.
I see. So the code for building the docker is not changed right? Can you please point me the code for building the docker?
Hi @bingwang-ms In this PR, you can find ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/Makefile, this is a new file for build the new docker. but the docker name that build is same as prev
Yes, I saw the makefile. But can you point me the code for building the new docker image? I didn't see that code in this PR.
Is this that you are looking for?
Hi @nhe-NV , I understand we have Makefile and Dockerfile. I'm looking for the code to call the make command or the docker build command as I didn't see that code in this change. Can you point it to me?
Hi @nhe-NV , I understand we have Makefile and Dockerfile. I'm looking for the code to call the
makecommand or thedocker buildcommand as I didn't see that code in this change. Can you point it to me?
Hi @bingwang-ms , same as prev pfc_wd docker, it dose not have the build command in the script, user will build their own docker and save it locally on demand. In the relevent test case, it will use it directly
I see. Thanks for the clarification. Please address the conflict.
The pre-commit check detected issues in the files touched by this pull request. The pre-commit check is a mandatory check, please fix detected issues.
Detailed pre-commit check results:
trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check yaml...........................................(no files to check)Skipped
check for added large files..............................................Passed
check python ast.........................................................Passed
flake8...................................................................Failed
- hook id: flake8
- exit code: 1
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:119:18: F405 'new_uint32_t_p' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:120:5: F405 'uint32_t_p_assign' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:121:28: F405 'new_sx_port_attributes_t_arr' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:122:10: F405 'sx_api_port_device_get' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:123:14: F405 'SX_STATUS_SUCCESS' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:126:16: F405 'uint32_t_p_value' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:129:28: F405 'new_sx_port_attributes_t_arr' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:130:10: F405 'sx_api_port_device_get' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:131:15: F405 'SX_STATUS_SUCCESS' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:141:27: F405 'sx_port_attributes_t_arr_getitem' may be undefined, or defined from star imports: python_sdk_api.sx_api
ansible/roles/test/files/mlnx/docker-tests-pfcgen-asic/pfc_gen.py:156:27: F405 'sx_port_attributes_t_arr_getitem' may be undefined, or defined from star imports: python_sdk_api.sx_api
...
[truncated extra lines, please run pre-commit locally to view full check results]
To run the pre-commit checks locally, you can follow below steps:
- Ensure that default python is python3. In sonic-mgmt docker container, default python is python2. You can run the check by activating the python3 virtual environment in sonic-mgmt docker container or outside of sonic-mgmt docker container.
- Ensure that the
pre-commitpackage is installed:
sudo pip install pre-commit
- Go to repository root folder
- Install the pre-commit hooks:
pre-commit install
- Use pre-commit to check staged file:
pre-commit
- Alternatively, you can check committed files using:
pre-commit run --from-ref <commit_id> --to-ref <commit_id>
Cherry-pick PR to 202405: https://github.com/sonic-net/sonic-mgmt/pull/13885