assisted-service
assisted-service copied to clipboard
MGMT-17700: Assign none platform node-ips based on connected addresses and etcd restrictions
When none platform is in use, if there is ambiguity in node-ip assignment, then incorrect assignment might lead to installation failure. This happens when etcd detects that the socket address from an etcd node does not match the expected address in the peer certificate. In this case etcd rejects such connection.
Example: assuming two networks - net1 and net2. master node 1 has 1 address that belongs to net1. master node 2 has 2 addresses. one that belongs to net 1, and another that belongs to net 2 master node 3 has 1 address that belongs to net 1. If the selected node-ip of master node 2 belongs to net 2, then when it will create a connection with any other master node, the socket address will be the address that belongs to net 1. Since etcd expects it to be the same as the node-ip, it will reject the connection.
This can be solved by node-ips selection that will not cause such a conflict. Node ips assignment should be done through ignition. To correctly set bootstrap ip, the machine-network for the cluster must be set to match the selected node-ip for that host.
MGMT-17701: Add capability to calculate none platform node-ips based on L3 connected addresses and connectivity
Calculate the node-ips for none platform cluster. Node ip calculation is actually calculation of the node-ip, hint, and cidr. Node-ip is the ip address that exists on the host that was selected as the node-ip. Hint is actually an IP address that does not exist on the host, but must belong to the subnet of the node-ip. Cidr is the subnet in cidr notation that the node-ip and hint belong to. Node-ip calculation is either done for all cluster hosts or none of them.
MGMT-17702: Modify ignition to use calculated node-ips for none platform
In order to set node-ip, ignition is modified. A file called /etc/default/nodeip-configuration contains the hint as was set by node-ip generation. In addition the bootstrap-ip is set in ignition as was set in node-ip generation. This is set as environment variable called "OPENSHIFT_INSTALL_BOOTSTRAP_NODE_IP".
MGMT-17703: Modify machine-network based on calculated node-ips
The etcd in boostrap uses he machine-network to set the IP address. Therefore during install-config generation the machine-network may be replaced by the cidr from the node-ip generation.
List all the issues related to this PR
- [x] New Feature
- [ ] Enhancement
- [ ] Bug fix
- [ ] Tests
- [ ] Documentation
- [ ] CI/CD
What environments does this code impact?
- [ ] Automation (CI, tools, etc)
- [x] Cloud
- [x] Operator Managed Deployments
- [ ] None
How was this code tested?
- [x] assisted-test-infra environment
- [ ] dev-scripts environment
- [ ] Reviewer's test appreciated
- [ ] Waiting for CI to do a full test run
- [ ] Manual (Elaborate on how it was tested)
- [ ] No tests needed
Checklist
- [ ] Title and description added to both, commit and PR.
- [ ] Relevant issues have been associated (see CONTRIBUTING guide)
- [ ] This change does not require a documentation update (docstring,
docs
, README, etc) - [ ] Does this change include unit-tests (note that code changes require unit-tests)
Reviewers Checklist
- Are the title and description (in both PR and commit) meaningful and clear?
- Is there a bug required (and linked) for this change?
- Should this PR be backported?
/cc @tsorya /cc @paul-maidment
@ori-amizur: This pull request references MGMT-17700 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.
In response to this:
When none platform is in use, if there is ambiguity in node-ip assignment, then incorrect assignment might lead to installation failure. This happens when etcd detects that the socket address from an etcd node does not match the expected address in the peer certificate. In this case etcd rejects such connection.
Example: assuming to networks - net1 and net2. master node 1 has 1 address that belongs to net1. master node 2 has 2 addresses. one that belongs to net 1, and another that belongs to net 2 master node 3 has 1 address that belongs to net 1. If the selected node-ip of master node 2 belongs to net 2, then when it will create a connection with any other master node, the socket address will be the address that belongs to net 1. Since etcd expects it to be the same as the node-ip, it will reject the connection.
This can be solved by node-ips selection that will not cause such a conflict. Node ips assignment should be done through ignition. To correctly set bootstrap ip, the machine-network for the cluster must be set to match the selected node-ip for that host.
MGMT-17701: Add capability to calculate none platform node-ips based on L3 connected addresses and connectivity
Calculate the node-ips for none platform cluster. Node ip calculation is actually calculation of the node-ip, hint, and cidr. Node-ip is the ip address that exists on the host that was selected as the node-ip. Hint is actually an IP address that does not exist on the host, but must belong to the subnet of the node-ip. Cidr is the subnet in cidr notation that the node-ip and hint belong to. Node-ip calculation is either done for all cluster hosts or none of them.
MGMT-17702: Modify ignition to use calculated node-ips for none platform
In order to set node-ip, ignition is modified. A file called /etc/default/nodeip-configuration contains the hint as was set by node-ip generation. In addition the bootstrap-ip is set in ignition as was set in node-ip generation. This is set as environment variable called "OPENSHIFT_INSTALL_BOOTSTRAP_NODE_IP".
MGMT-17703: Modify machine-network based on calculated node-ips
The etcd in boostrap uses he machine-network to set the IP address. Therefore during install-config generation the machine-network may be replaced by the cidr from the node-ip generation.
List all the issues related to this PR
- [x] New Feature
- [ ] Enhancement
- [ ] Bug fix
- [ ] Tests
- [ ] Documentation
- [ ] CI/CD
What environments does this code impact?
- [ ] Automation (CI, tools, etc)
- [x] Cloud
- [x] Operator Managed Deployments
- [ ] None
How was this code tested?
- [x] assisted-test-infra environment
- [ ] dev-scripts environment
- [ ] Reviewer's test appreciated
- [ ] Waiting for CI to do a full test run
- [ ] Manual (Elaborate on how it was tested)
- [ ] No tests needed
Checklist
- [ ] Title and description added to both, commit and PR.
- [ ] Relevant issues have been associated (see CONTRIBUTING guide)
- [ ] This change does not require a documentation update (docstring,
docs
, README, etc)- [ ] Does this change include unit-tests (note that code changes require unit-tests)
Reviewers Checklist
- Are the title and description (in both PR and commit) meaningful and clear?
- Is there a bug required (and linked) for this change?
- Should this PR be backported?
/cc @tsorya /cc @paul-maidment
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ori-amizur
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [ori-amizur]
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment
Codecov Report
Attention: Patch coverage is 78.12500%
with 42 lines
in your changes are missing coverage. Please review.
Project coverage is 68.43%. Comparing base (
4f4c464
) to head (c2e4687
).
Additional details and impacted files
@@ Coverage Diff @@
## master #6257 +/- ##
==========================================
+ Coverage 68.25% 68.43% +0.17%
==========================================
Files 244 245 +1
Lines 35869 36060 +191
==========================================
+ Hits 24483 24676 +193
+ Misses 9223 9205 -18
- Partials 2163 2179 +16
Files | Coverage Δ | |
---|---|---|
internal/common/common.go | 23.90% <0.00%> (-0.20%) |
:arrow_down: |
internal/ignition/installmanifests.go | 55.67% <66.66%> (+0.27%) |
:arrow_up: |
...ernal/network/none_platform_node_ips_allocation.go | 80.72% <80.72%> (ø) |
/test edge-subsystem-kubeapi-aws
@ori-amizur: This pull request references MGMT-17700 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.
In response to this:
When none platform is in use, if there is ambiguity in node-ip assignment, then incorrect assignment might lead to installation failure. This happens when etcd detects that the socket address from an etcd node does not match the expected address in the peer certificate. In this case etcd rejects such connection.
Example: assuming two networks - net1 and net2. master node 1 has 1 address that belongs to net1. master node 2 has 2 addresses. one that belongs to net 1, and another that belongs to net 2 master node 3 has 1 address that belongs to net 1. If the selected node-ip of master node 2 belongs to net 2, then when it will create a connection with any other master node, the socket address will be the address that belongs to net 1. Since etcd expects it to be the same as the node-ip, it will reject the connection.
This can be solved by node-ips selection that will not cause such a conflict. Node ips assignment should be done through ignition. To correctly set bootstrap ip, the machine-network for the cluster must be set to match the selected node-ip for that host.
MGMT-17701: Add capability to calculate none platform node-ips based on L3 connected addresses and connectivity
Calculate the node-ips for none platform cluster. Node ip calculation is actually calculation of the node-ip, hint, and cidr. Node-ip is the ip address that exists on the host that was selected as the node-ip. Hint is actually an IP address that does not exist on the host, but must belong to the subnet of the node-ip. Cidr is the subnet in cidr notation that the node-ip and hint belong to. Node-ip calculation is either done for all cluster hosts or none of them.
MGMT-17702: Modify ignition to use calculated node-ips for none platform
In order to set node-ip, ignition is modified. A file called /etc/default/nodeip-configuration contains the hint as was set by node-ip generation. In addition the bootstrap-ip is set in ignition as was set in node-ip generation. This is set as environment variable called "OPENSHIFT_INSTALL_BOOTSTRAP_NODE_IP".
MGMT-17703: Modify machine-network based on calculated node-ips
The etcd in boostrap uses he machine-network to set the IP address. Therefore during install-config generation the machine-network may be replaced by the cidr from the node-ip generation.
List all the issues related to this PR
- [x] New Feature
- [ ] Enhancement
- [ ] Bug fix
- [ ] Tests
- [ ] Documentation
- [ ] CI/CD
What environments does this code impact?
- [ ] Automation (CI, tools, etc)
- [x] Cloud
- [x] Operator Managed Deployments
- [ ] None
How was this code tested?
- [x] assisted-test-infra environment
- [ ] dev-scripts environment
- [ ] Reviewer's test appreciated
- [ ] Waiting for CI to do a full test run
- [ ] Manual (Elaborate on how it was tested)
- [ ] No tests needed
Checklist
- [ ] Title and description added to both, commit and PR.
- [ ] Relevant issues have been associated (see CONTRIBUTING guide)
- [ ] This change does not require a documentation update (docstring,
docs
, README, etc)- [ ] Does this change include unit-tests (note that code changes require unit-tests)
Reviewers Checklist
- Are the title and description (in both PR and commit) meaningful and clear?
- Is there a bug required (and linked) for this change?
- Should this PR be backported?
/cc @tsorya /cc @paul-maidment
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/test?
@ori-amizur: The following commands are available to trigger required jobs:
-
/test e2e-agent-compact-ipv4
-
/test edge-assisted-operator-catalog-publish-verify
-
/test edge-ci-index
-
/test edge-e2e-ai-operator-ztp
-
/test edge-e2e-ai-operator-ztp-sno-day2-workers
-
/test edge-e2e-ai-operator-ztp-sno-day2-workers-late-binding
-
/test edge-e2e-metal-assisted
-
/test edge-e2e-metal-assisted-4-11
-
/test edge-e2e-metal-assisted-4-12
-
/test edge-e2e-metal-assisted-cnv
-
/test edge-e2e-metal-assisted-lvm
-
/test edge-e2e-metal-assisted-odf
-
/test edge-images
-
/test edge-lint
-
/test edge-subsystem-aws
-
/test edge-subsystem-kubeapi-aws
-
/test edge-unit-test
-
/test edge-verify-generated-code
-
/test images
-
/test mce-images
The following commands are available to trigger optional jobs:
-
/test e2e-agent-ha-dualstack
-
/test e2e-agent-sno-ipv6
-
/test edge-e2e-ai-operator-disconnected-capi
-
/test edge-e2e-ai-operator-ztp-3masters
-
/test edge-e2e-ai-operator-ztp-capi
-
/test edge-e2e-ai-operator-ztp-compact-day2-masters
-
/test edge-e2e-ai-operator-ztp-compact-day2-workers
-
/test edge-e2e-ai-operator-ztp-disconnected
-
/test edge-e2e-ai-operator-ztp-hypershift-zero-nodes
-
/test edge-e2e-ai-operator-ztp-multiarch-3masters-ocp
-
/test edge-e2e-ai-operator-ztp-multiarch-sno-ocp
-
/test edge-e2e-ai-operator-ztp-node-labels
-
/test edge-e2e-ai-operator-ztp-sno-day2-masters
-
/test edge-e2e-ai-operator-ztp-sno-day2-workers-ignitionoverride
-
/test edge-e2e-metal-assisted-4-13
-
/test edge-e2e-metal-assisted-4-14
-
/test edge-e2e-metal-assisted-4-15
-
/test edge-e2e-metal-assisted-bond
-
/test edge-e2e-metal-assisted-bond-4-14
-
/test edge-e2e-metal-assisted-day2
-
/test edge-e2e-metal-assisted-day2-arm-workers
-
/test edge-e2e-metal-assisted-day2-single-node
-
/test edge-e2e-metal-assisted-external
-
/test edge-e2e-metal-assisted-external-4-14
-
/test edge-e2e-metal-assisted-ipv4v6
-
/test edge-e2e-metal-assisted-ipv6
-
/test edge-e2e-metal-assisted-kube-api-late-binding-single-node
-
/test edge-e2e-metal-assisted-kube-api-late-unbinding-ipv4-single-node
-
/test edge-e2e-metal-assisted-kube-api-net-suite
-
/test edge-e2e-metal-assisted-mce-4-11
-
/test edge-e2e-metal-assisted-mce-4-12
-
/test edge-e2e-metal-assisted-mce-4-13
-
/test edge-e2e-metal-assisted-mce-4-14
-
/test edge-e2e-metal-assisted-mce-4-15
-
/test edge-e2e-metal-assisted-mce-sno
-
/test edge-e2e-metal-assisted-metallb
-
/test edge-e2e-metal-assisted-none
-
/test edge-e2e-metal-assisted-onprem
-
/test edge-e2e-metal-assisted-single-node
-
/test edge-e2e-metal-assisted-static-ip-suite
-
/test edge-e2e-metal-assisted-static-ip-suite-4-14
-
/test edge-e2e-metal-assisted-tang
-
/test edge-e2e-metal-assisted-tpmv2
-
/test edge-e2e-metal-assisted-upgrade-agent
-
/test edge-e2e-nutanix-assisted
-
/test edge-e2e-nutanix-assisted-2workers
-
/test edge-e2e-nutanix-assisted-4-14
-
/test edge-e2e-oci-assisted
-
/test edge-e2e-oci-assisted-4-14
-
/test edge-e2e-oci-assisted-iscsi
-
/test edge-e2e-vsphere-assisted
-
/test edge-e2e-vsphere-assisted-4-12
-
/test edge-e2e-vsphere-assisted-4-13
-
/test edge-e2e-vsphere-assisted-4-14
-
/test edge-e2e-vsphere-assisted-umn
-
/test okd-scos-images
-
/test push-pr-image
Use /test all
to run the following jobs that were automatically triggered:
-
pull-ci-openshift-assisted-service-master-e2e-agent-compact-ipv4
-
pull-ci-openshift-assisted-service-master-edge-ci-index
-
pull-ci-openshift-assisted-service-master-edge-e2e-ai-operator-ztp
-
pull-ci-openshift-assisted-service-master-edge-e2e-metal-assisted
-
pull-ci-openshift-assisted-service-master-edge-e2e-metal-assisted-external
-
pull-ci-openshift-assisted-service-master-edge-e2e-metal-assisted-none
-
pull-ci-openshift-assisted-service-master-edge-e2e-nutanix-assisted
-
pull-ci-openshift-assisted-service-master-edge-e2e-nutanix-assisted-4-14
-
pull-ci-openshift-assisted-service-master-edge-e2e-oci-assisted
-
pull-ci-openshift-assisted-service-master-edge-e2e-oci-assisted-4-14
-
pull-ci-openshift-assisted-service-master-edge-e2e-vsphere-assisted
-
pull-ci-openshift-assisted-service-master-edge-images
-
pull-ci-openshift-assisted-service-master-edge-lint
-
pull-ci-openshift-assisted-service-master-edge-subsystem-aws
-
pull-ci-openshift-assisted-service-master-edge-subsystem-kubeapi-aws
-
pull-ci-openshift-assisted-service-master-edge-unit-test
-
pull-ci-openshift-assisted-service-master-edge-verify-generated-code
-
pull-ci-openshift-assisted-service-master-images
-
pull-ci-openshift-assisted-service-master-mce-images
In response to this:
/test?
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/test edge-e2e-metal-assisted-none
/test edge-e2e-metal-assisted
@ori-amizur: The following tests failed, say /retest
to rerun all failed tests or /retest-required
to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/edge-e2e-vsphere-assisted | a63c03f8b2f3458fce13ad2ed5e68dccabe5eb0f | link | false | /test edge-e2e-vsphere-assisted |
ci/prow/edge-e2e-oci-assisted | a63c03f8b2f3458fce13ad2ed5e68dccabe5eb0f | link | false | /test edge-e2e-oci-assisted |
ci/prow/edge-e2e-oci-assisted-4-14 | a63c03f8b2f3458fce13ad2ed5e68dccabe5eb0f | link | false | /test edge-e2e-oci-assisted-4-14 |
ci/prow/edge-e2e-nutanix-assisted | a63c03f8b2f3458fce13ad2ed5e68dccabe5eb0f | link | false | /test edge-e2e-nutanix-assisted |
ci/prow/edge-e2e-nutanix-assisted-4-14 | a63c03f8b2f3458fce13ad2ed5e68dccabe5eb0f | link | false | /test edge-e2e-nutanix-assisted-4-14 |
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/retest-required
/retest-required
/lgtm
/retest-required
Remaining retests: 0 against base HEAD 4f4c46493887bed73e88b5661f4f2a503d5ccf35 and 2 for PR HEAD 7425516c2a60d230159a3a4f3addd30c2aca2e56 in total
/retest-required
/test edge-e2e-ai-operator-ztp
@ori-amizur: The following tests failed, say /retest
to rerun all failed tests or /retest-required
to rerun all mandatory failed tests:
Test name | Commit | Details | Required | Rerun command |
---|---|---|---|---|
ci/prow/edge-e2e-nutanix-assisted | c2e46879bb54d033db33ca8ee65147f21b2f8037 | link | false | /test edge-e2e-nutanix-assisted |
ci/prow/edge-e2e-nutanix-assisted-4-14 | c2e46879bb54d033db33ca8ee65147f21b2f8037 | link | false | /test edge-e2e-nutanix-assisted-4-14 |
ci/prow/edge-e2e-oci-assisted | c2e46879bb54d033db33ca8ee65147f21b2f8037 | link | false | /test edge-e2e-oci-assisted |
ci/prow/edge-e2e-vsphere-assisted | c2e46879bb54d033db33ca8ee65147f21b2f8037 | link | false | /test edge-e2e-vsphere-assisted |
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
/lgtm
[ART PR BUILD NOTIFIER]
This PR has been included in build ose-agent-installer-api-server-container-v4.17.0-202405271211.p0.g8a63485.assembly.stream.el9 for distgit ose-agent-installer-api-server. All builds following this will include this PR.