SONiC
High-Level Design to reduce disk IO on SONiC switches
This PR introduces a high-level design to reduce disk writes on SONiC switches.
| Repo | PR Title | State |
|---|---|---|
| sonic-buildimage | Code optimizations to reduce disk writes on SONiC switches | In Review |
Signed-off-by: Ashwin Srinivasan [email protected]
@zbud-msft Hi Zain, can you please review these changes? Thanks.
Can we please add telemetry/test_telemetry.py and telemetry/test_telemetry_cert_rotation.py to multi-asic pr script:
https://github.com/sonic-net/sonic-mgmt/blob/master/.azure-pipelines/pr_test_scripts.yaml#L426
@zbud-msft Currently the telemetry tests have an issue: test_telemetry_queue_buffer_cnt fails and causes all subsequent tests to fail as well. This test is not valid for chassis of type VOQ, so a PR was created to skip it (https://github.com/sonic-net/sonic-mgmt/pull/14694). With single-asic there may still be an issue on the server side (orchagent), which might also require skipping this test. MSFT has verified that all telemetry tests pass once this test is skipped.
I am worried about adding test_telemetry to the PR script at this moment, as it could always fail because of this in the non-VOQ scenario.
Should we wait to add it to the PR script until the issue has been resolved on the server side? Or should we skip this test using https://github.com/sonic-net/sonic-buildimage/issues/19624 so that we can add test_telemetry to the PR scripts? Please advise.
@zbud-msft: Can you please help with sign-off?
/azp run Azure.sonic-mgmt
Azure Pipelines successfully started running 1 pipeline(s).
@wumiaont, the PR test is failing on the telemetry test. Could you please check?
/azp run Azure.sonic-mgmt
Commenter does not have sufficient privileges for PR 14982 in repo sonic-net/sonic-mgmt
Fixed that. But Azure.sonic-mgmt failed after the re-run, and I don't have the privileges to run azp.
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Hi @zbud-msft , the PR test passed, could you please help to review?
Hi @wumiaont, I see that the pr_test_scripts.yaml file does not contain telemetry/test_telemetry.py. https://github.com/sonic-net/sonic-mgmt/pull/14694 is merged; can we add it now?
Added. And the PR pipeline test passed.
@wumiaont telemetry is failing in the multi-asic pr test, could you please check? https://elastictest.org/scheduler/testplan/672951f0a649df99f35b949c?leftSideViewMode=detail&testcase=telemetry%2Ftest_telemetry.py%7C%7C%7C2&type=console
@zbud-msft @yejianquan test_telemetry_queue_buffer_cnt failed. I made a fix to skip it (https://github.com/sonic-net/sonic-mgmt/pull/14694), but it only applies when the type is VOQ (which applies to Nokia chassis). For multi-asic there are still issues with this test case: the mgmt side needs multi-asic support, and the orchagent side could possibly have an issue (https://github.com/sonic-net/sonic-buildimage/issues/19624).
My recommendation is either to not add test_telemetry to the t1 multi-asic test, or to add test_telemetry but skip test_telemetry_queue_buffer_cnt if the device is multi-asic. I could not add multi-asic support on the mgmt side, as I don't have a multi-asic, non-VOQ device on which to test possible fixes. Please advise.
Hi @wumiaont , let's enable test_telemetry and skip test_telemetry_queue_buffer_cnt with issue, sample: https://github.com/sonic-net/sonic-mgmt/blob/0d7e6caa68f98852a44717448c217755897edabd/tests/common/plugins/conditional_mark/tests_mark_conditions.yaml#L381
Hi @yejianquan, I created issue https://github.com/sonic-net/sonic-mgmt/issues/15393 and then skipped test_telemetry_queue_buffer_cnt for multi-asic devices with this issue. All t1 multi-asic tests passed, including the test_telemetry.py tests. Currently Azure.sonic-mgmt has an issue. Can you please re-run Azure.sonic-mgmt? Thx.
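For readers following the thread, the skip described in the comment above might look roughly like the entry below in tests/common/plugins/conditional_mark/tests_mark_conditions.yaml. This is only a sketch and not the merged change: the reason text and the `is_multi_asic==True` condition string are assumptions based on how other entries in that file are commonly written, and only the test name and the issue URL come from this thread.

```yaml
# Hypothetical sketch of a conditional-mark entry; not the merged change.
# The reason text and the exact condition strings are assumptions.
telemetry/test_telemetry.py::test_telemetry_queue_buffer_cnt:
  skip:
    reason: "test_telemetry_queue_buffer_cnt does not yet support multi-asic devices"
    conditions:
      # Assumed condition variable that is true on multi-asic devices.
      - "is_multi_asic==True"
      # Tracking issue named in this thread; the skip is tied to it.
      - https://github.com/sonic-net/sonic-mgmt/issues/15393
```

Tying the skip to the issue URL, as suggested earlier in the thread, keeps the reason for the skip visible next to the condition, so the entry can be revisited when the issue is resolved.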
/azpw run
/azp run
Commenter does not have sufficient privileges for PR 14982 in repo sonic-net/sonic-mgmt
Hi @zbud-msft , kindly approve this PR if it looks good to you
@wumiaont PR conflicts with 202405 branch
Use https://github.com/sonic-net/sonic-mgmt/pull/15463 to resolve the conflict