sriov-network-operator
sriov-network-operator copied to clipboard
feat: implement MlxResetFW to reset the FW on VF changes
Todos:
- [x] implement new method to
mstfwreset - [x] add feature flag to opt-in/opt-out
- [x] add unit test
- [x] test it manually or via e2e test job (preferred)
Manual test shows that the implementation is working:
sriov-network-config-daemon 2024-07-11T14:39:36.793959572Z INFO mellanox/mellanox_plugin.go:190 mellanox-plugin: resetFW() {"cmd-args": ["-d", "0000:03:00.0", "-l", "3", "-y", "reset"]}
foo@bar:~$ sudo mlxconfig -d /dev/mst/mt4119_pciconf0 q | grep -e SRIOV_EN -e NUM_OF_VFS
NUM_OF_VFS 4
SRIOV_EN True(1)
Thanks for your PR, To run vendors CIs use one of:
/test-all: To run all tests for all vendors./test-e2e-all: To run all E2E tests for all vendors./test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.
To skip the vendors CIs use one of:
/skip-all: To skip all tests for all vendors./skip-e2e-all: To skip all E2E tests for all vendors./skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor. Best regards.
Thanks for your PR, To run vendors CIs use one of:
/test-all: To run all tests for all vendors./test-e2e-all: To run all E2E tests for all vendors./test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.
To skip the vendors CIs use one of:
/skip-all: To skip all tests for all vendors./skip-e2e-all: To skip all E2E tests for all vendors./skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor. Best regards.
Thanks for your PR, To run vendors CIs use one of:
/test-all: To run all tests for all vendors./test-e2e-all: To run all E2E tests for all vendors./test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.
To skip the vendors CIs use one of:
/skip-all: To skip all tests for all vendors./skip-e2e-all: To skip all E2E tests for all vendors./skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor. Best regards.
Thanks for your PR, To run vendors CIs use one of:
/test-all: To run all tests for all vendors./test-e2e-all: To run all E2E tests for all vendors./test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.
To skip the vendors CIs use one of:
/skip-all: To skip all tests for all vendors./skip-e2e-all: To skip all E2E tests for all vendors./skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor. Best regards.
Thanks for your PR, To run vendors CIs use one of:
/test-all: To run all tests for all vendors./test-e2e-all: To run all E2E tests for all vendors./test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.
To skip the vendors CIs use one of:
/skip-all: To skip all tests for all vendors./skip-e2e-all: To skip all E2E tests for all vendors./skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor. Best regards.
Pull Request Test Coverage Report for Build 10179580977
Details
- 1 of 63 (1.59%) changed or added relevant lines in 6 files are covered.
- 6 unchanged lines in 2 files lost coverage.
- Overall coverage decreased (-0.2%) to 43.816%
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| pkg/daemon/daemon.go | 1 | 7 | 14.29% |
| pkg/helper/mock/mock_helper.go | 0 | 10 | 0.0% |
| pkg/plugins/mellanox/mellanox_plugin.go | 0 | 10 | 0.0% |
| pkg/vendors/mellanox/mock/mock_mellanox.go | 0 | 10 | 0.0% |
| cmd/sriov-network-config-daemon/start.go | 0 | 13 | 0.0% |
| pkg/vendors/mellanox/mellanox.go | 0 | 13 | 0.0% |
| <!-- | Total: | 1 | 63 |
| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| controllers/drain_controller.go | 1 | 68.06% |
| pkg/plugins/mellanox/mellanox_plugin.go | 5 | 7.64% |
| <!-- | Total: | 6 |
| Totals | |
|---|---|
| Change from base Build 10178846427: | -0.2% |
| Covered Lines: | 6536 |
| Relevant Lines: | 14917 |
💛 - Coveralls
Thanks for your PR, To run vendors CIs use one of:
/test-all: To run all tests for all vendors./test-e2e-all: To run all E2E tests for all vendors./test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.
To skip the vendors CIs use one of:
/skip-all: To skip all tests for all vendors./skip-e2e-all: To skip all E2E tests for all vendors./skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor. Best regards.
/test-all
Thanks for your PR, To run vendors CIs use one of:
/test-all: To run all tests for all vendors./test-e2e-all: To run all E2E tests for all vendors./test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.
To skip the vendors CIs use one of:
/skip-all: To skip all tests for all vendors./skip-e2e-all: To skip all E2E tests for all vendors./skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor. Best regards.
Thanks for your PR, To run vendors CIs use one of:
/test-all: To run all tests for all vendors./test-e2e-all: To run all E2E tests for all vendors./test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.
To skip the vendors CIs use one of:
/skip-all: To skip all tests for all vendors./skip-e2e-all: To skip all E2E tests for all vendors./skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor. Best regards.
Thanks for your PR, To run vendors CIs use one of:
/test-all: To run all tests for all vendors./test-e2e-all: To run all E2E tests for all vendors./test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.
To skip the vendors CIs use one of:
/skip-all: To skip all tests for all vendors./skip-e2e-all: To skip all E2E tests for all vendors./skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor. Best regards.
/test-all
Thanks for your PR, To run vendors CIs use one of:
/test-all: To run all tests for all vendors./test-e2e-all: To run all E2E tests for all vendors./test-e2e-nvidia-all: To run all E2E tests for NVIDIA vendor.
To skip the vendors CIs use one of:
/skip-all: To skip all tests for all vendors./skip-e2e-all: To skip all E2E tests for all vendors./skip-e2e-nvidia-all: To skip all E2E tests for NVIDIA vendor. Best regards.
@zeeke @adrianchiris the GH Action ci-triggers change seems to work. After a rebase and push no new comment has been added. But after closing and reopening there is one. 👍🏻
/test-all
/test-all
/test-all
/test-e2e-nvidia-all
/test-e2e-nvidia-all
/test-e2e-nvidia-all
/test-e2e-nvidia-all
Nvidia e2e test depends on https://github.com/k8snetworkplumbingwg/sriov-network-operator/pull/742
/test-all
/test-e2e-nvidia-all
/test-all
and if we reset the primary nic of the system we will not have an API connection to reconcile and we can leave the system in a bad state...
Usually the reset of the fw should work. Also the code of the mstfwrest binary is tested. Additionally, we have a feature gate top opt-in, which means this is not enabled for the default usage. IMHO we should rely on the robustness of the reset.
Or have you already had experience with the reset and that's why concerns?
WDYT @adrianchiris
Hi can you please rebase this one
LGTM from my sided. just needs rebase
Rebased, thanks for the hint. Didn't noticed the wrong commits
/test-all