SPLAT-2044 - [vsphere] promote multi-nic to default

Open rvanderp3 opened this issue 9 months ago • 6 comments

Associated PRs:

  • [ ] https://github.com/openshift/installer/pull/9493

rvanderp3 avatar Feb 17 '25 15:02 rvanderp3

Hello @rvanderp3! Some important instructions when contributing to openshift/api: API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

openshift-ci[bot] avatar Feb 17 '25 16:02 openshift-ci[bot]

For the most part, the associated PRs you have raised are likely not needed. Enabling the gate in this repo will allow the MAO and CPMS to pick up the gate change using their gate observers. There's nothing I can think of in client-go or library-go that would require vendoring this change for a gate promotion either.

The installer PR is likely needed though
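
As a rough illustration of what promoting a gate to default means, here is a minimal Python sketch. It assumes a gate is simply a name plus the set of feature sets in which it is enabled. The `Gate` class, the `is_enabled` helper, and the gate name `ExampleMultiNic` are hypothetical and are not taken from openshift/api; only the feature set names `Default` and `TechPreviewNoUpgrade` mirror OpenShift's FeatureGate API.

```python
from dataclasses import dataclass, field

# Feature set names as used by OpenShift's FeatureGate API.
DEFAULT = "Default"
TECH_PREVIEW = "TechPreviewNoUpgrade"


@dataclass
class Gate:
    """Hypothetical model of a feature gate: a name and where it is enabled."""
    name: str
    enabled_in: set = field(default_factory=set)

    def promote_to_default(self):
        # Promotion, in this simplified model, just adds the Default feature set.
        self.enabled_in.add(DEFAULT)


def is_enabled(gate, cluster_feature_set):
    """Return True if the gate is on for a cluster using the given feature set."""
    return cluster_feature_set in gate.enabled_in


# Hypothetical gate that starts out enabled only for tech preview clusters.
gate = Gate("ExampleMultiNic", {TECH_PREVIEW})
before = is_enabled(gate, DEFAULT)
gate.promote_to_default()
after = is_enabled(gate, DEFAULT)
print(before, after)
```

In this model, components that observe the gate only need to re-read which feature sets it is enabled in, which is consistent with the point above that no extra vendoring should be required for a promotion.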

JoelSpeed avatar Feb 18 '25 10:02 JoelSpeed

> For the most part, the associated PRs you have raised are likely not needed. Enabling the gate in this repo will allow the MAO and CPMS to pick up the gate change using their gate observers. There's nothing I can think of in client-go or library-go that would require vendoring this change for a gate promotion either.
>
> The installer PR is likely needed though

Thanks @JoelSpeed for taking a look. I wasn't sure how that feature gate change would propagate. That certainly simplifies things.

rvanderp3 avatar Feb 18 '25 14:02 rvanderp3

@rvanderp3: This PR was included in a payload test run from openshift/installer#9493 trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

openshift-ci[bot] avatar Mar 31 '25 19:03 openshift-ci[bot]

@rvanderp3: This PR was included in a payload test run from openshift/installer#9493 trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

openshift-ci[bot] avatar Apr 03 '25 13:04 openshift-ci[bot]

@rvanderp3: This PR was included in a payload test run from openshift/installer#9493 trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.19-e2e-vsphere-ovn-multi-network

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/e63e99d0-1091-11f0-815a-95d1230ecb0a-0

openshift-ci[bot] avatar Apr 03 '25 13:04 openshift-ci[bot]

@rvanderp3: This PR was included in a payload test run from openshift/installer#9493 trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.19-e2e-vsphere-ovn-multi-network

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/659390a0-10b5-11f0-8279-b4f2deeb2bab-0

openshift-ci[bot] avatar Apr 03 '25 17:04 openshift-ci[bot]

/test verify-feature-promotion

rvanderp3 avatar Apr 10 '25 15:04 rvanderp3

/lgtm

jcpowermac avatar Apr 10 '25 16:04 jcpowermac

@rvanderp3 Where are we at in terms of testing this feature? Do we have specific testing, or is this an "if it installs, it works" kind of feature? If that's the case, can you explain what jobs we have in place?

JoelSpeed avatar Apr 11 '25 07:04 JoelSpeed

/lgtm

WenXinWei avatar Apr 11 '25 10:04 WenXinWei

/approve

WenXinWei avatar Apr 11 '25 10:04 WenXinWei

> @rvanderp3 Where are we at in terms of testing this feature? Do we have specific testing, or is this an "if it installs, it works" kind of feature? If that's the case, can you explain what jobs we have in place?

Hi @JoelSpeed we've been working on two approaches in case we run out of time:

  1. Adding additional e2e tests - https://github.com/openshift/machine-api-operator/pull/1327
  2. QE is testing (has tested; thanks @WenXinWei!) this as well in the event we need an exception

We did have an excellent pass rate for the tech preview job, but for some reason (must have been me :( ) it's not showing up in Prow. I'm investigating that.

rvanderp3 avatar Apr 11 '25 12:04 rvanderp3

> > @rvanderp3 Where are we at in terms of testing this feature? Do we have specific testing, or is this an "if it installs, it works" kind of feature? If that's the case, can you explain what jobs we have in place?
>
> Hi @JoelSpeed we've been working on two approaches in case we run out of time:
>
> 1. Adding additional e2e tests - [OCPBUGS-49351: Added vSphere check for max networks machine-api-operator#1327](https://github.com/openshift/machine-api-operator/pull/1327)
> 2. QE is testing (has tested; thanks @WenXinWei!) this as well in the event we need an exception
>
> We did have an excellent pass rate for the tech preview job, but for some reason (must have been me :( ) it's not showing up in Prow. I'm investigating that.

me--

https://github.com/openshift/release/pull/63351/files#diff-7ce8489656df8d956c50ed7923f223cabe266e2146255f892fd3f980ba664e20R485-R506

Getting a fix in now. I swapped the crons erroneously.

rvanderp3 avatar Apr 11 '25 12:04 rvanderp3

/assign @JoelSpeed

rvanderp3 avatar Apr 14 '25 14:04 rvanderp3

Without tests reporting into sippy, we run the risk of not catching regressions in the features we support. As far as I can tell, there are currently no tests for this gate reporting into sippy.

Do you have both tech preview and stable periodics set up that are testing this feature already?

JoelSpeed avatar Apr 14 '25 16:04 JoelSpeed

> Without tests reporting into sippy, we run the risk of not catching regressions in the features we support. As far as I can tell, there are currently no tests for this gate reporting into sippy.
>
> Do you have both tech preview and stable periodics set up that are testing this feature already?

Correct, that is why we had QE test it last week. I am working on getting the tests reported into Sippy (https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-vsphere-ovn-multi-network-techpreview/1911775910613225472), but until last week we didn't have the framework merged to run those tests. I'm working on fixing the tech preview tests at the moment.

rvanderp3 avatar Apr 14 '25 17:04 rvanderp3

> > Without tests reporting into sippy, we run the risk of not catching regressions in the features we support. As far as I can tell, there are currently no tests for this gate reporting into sippy. Do you have both tech preview and stable periodics set up that are testing this feature already?
>
> Correct, that is why we had QE test it last week. I am working on getting the tests reported into Sippy (https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-vsphere-ovn-multi-network-techpreview/1911775910613225472), but until last week we didn't have the framework merged to run those tests. I'm working on fixing the tech preview tests at the moment.

Looks like my test is running with CustomNoUpgrade rather than TechPreview. Fixing that now.
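
A quick sanity check for this kind of mix-up could look like the sketch below. It assumes a hypothetical mapping from periodic job name to the feature set the job installs with, and a naming convention where tech preview jobs end in `-techpreview`. Neither the mapping nor the `mismatched_jobs` helper is read from the real openshift/release configuration.

```python
TECH_PREVIEW = "TechPreviewNoUpgrade"


def mismatched_jobs(jobs):
    """Return names of -techpreview jobs that do not install with TechPreviewNoUpgrade."""
    return sorted(
        name
        for name, feature_set in jobs.items()
        if name.endswith("-techpreview") and feature_set != TECH_PREVIEW
    )


# Hypothetical job data for illustration only.
jobs = {
    "e2e-vsphere-ovn-multi-network-techpreview": "CustomNoUpgrade",
    "e2e-vsphere-ovn-multi-network": "Default",
}
bad = mismatched_jobs(jobs)
print(bad)
```

A check like this only catches the naming mismatch; it says nothing about whether the job's results are actually being reported.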

rvanderp3 avatar Apr 14 '25 17:04 rvanderp3

@JoelSpeed tech preview periodics have been fixed https://prow.ci.openshift.org/job-history/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-vsphere-ovn-multi-network-techpreview. I'm trying to retrieve payload test results but it appears pr-payload-tests.ci.openshift.org may be having issues. Will try another payload run.

rvanderp3 avatar Apr 16 '25 12:04 rvanderp3

And just as I post that, the site comes back up. Here is a passing payload test: https://pr-payload-tests.ci.openshift.org/runs/ci/659390a0-10b5-11f0-8279-b4f2deeb2bab-0

rvanderp3 avatar Apr 16 '25 12:04 rvanderp3

Hi @JoelSpeed after CI stabilized, we have had a large percentage of passing runs. Since QE has approved, we have a passing payload run, and the tech preview job is stable, do we need anything else to promote?

rvanderp3 avatar Apr 21 '25 18:04 rvanderp3

/test verify-feature-promotion

JoelSpeed avatar Apr 22 '25 11:04 JoelSpeed

@rvanderp3 I still don't see any data reporting into component readiness/sippy for this feature. Without data in sippy, we don't have the ability for the TRT folks to automatically monitor for regressions in the feature.

As of 4.19, data in sippy has become mandatory for all feature promotions.

JoelSpeed avatar Apr 22 '25 11:04 JoelSpeed

What tests exist for this feature? Is this a case of "if the install works, the feature works", or do we have specific tests in origin showing that the feature is working? E.g., can I create a machineset on day 2 that leverages this feature?

JoelSpeed avatar Apr 22 '25 11:04 JoelSpeed

Ah, my apologies, that's my mistake. I thought we could still have QE sign off on it in 4.19, so that's the approach I was focusing on in parallel with getting the tests running. We have the tests, but for some reason they aren't running or reporting, and the recent CI issues have made it tough to make headway. We'll take care of this in 4.20. Thanks for taking a look.

rvanderp3 avatar Apr 22 '25 11:04 rvanderp3

/test verify-feature-promotion

rvanderp3 avatar May 14 '25 16:05 rvanderp3

/test verify-feature-promotion

rvanderp3 avatar May 19 '25 13:05 rvanderp3

/test verify-feature-promotion

rvanderp3 avatar May 19 '25 14:05 rvanderp3

/test verify-feature-promotion

rvanderp3 avatar May 20 '25 20:05 rvanderp3

/lgtm

jcpowermac avatar May 20 '25 20:05 jcpowermac