network-observability-operator
network-observability-operator copied to clipboard
NETOBSERV-1747: trigger stored resource migration on operator startup
Description
Borrowing and adapting code from Tekton https://github.com/tektoncd/operator/blob/v0.72.0/pkg/reconciler/shared/tektonconfig/upgrade/helper/migrator.go
On startup, the operator will start the migration process, which is basically a noop if resources are already in the correct stored version, or else which triggers a conversion + rewrite by the api server (which itself relies on our conversion webhooks). The CRD status is then updated to only mention the current stored version.
Testing
Testing is not straightforward, here's the steps I did:
# Install an older CRD version where storedVersion was different (you don't need to install the operator, just the CRD is sufficient)
git checkout release-1.5
make install
oc apply -f config/samples/flows_v1beta1_flowcollector.yaml
# Check stored version is v1beta1
oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}'
["v1beta1"]
# Update the CRD to the new version
git checkout main
bin/kustomize build config/crd | kubectl apply --server-side --force-conflicts -f -
# Check stored version have both v1beta1 and v1beta2
oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}'
["v1beta1","v1beta2"]
# <-- AT THIS POINT, if nothing is done and later the v1beta1 version is removed, this would break the upgrades
# Deploy the operator
make deploy VERSION=... USER=...
oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}'
["v1beta2"]
You may also take a look at the operator logs. It will probably show some temporary errors, due to the webhook server not being ready yet when trying migration - the operator is just going to retry with an exponential backoff:
2024-09-17T11:59:40.460Z INFO ensuring stored resources are up to date
2024-09-17T11:59:40.460Z INFO migrating group version {"crdGroup": "flowcollectors.flows.netobserv.io"}
W0917 11:59:40.535415 1 reflector.go:561] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:106: failed to list *v1beta2.FlowCollector: conversion webhook for flows.netobserv.io/v1beta1, Kind=FlowCollector failed: Post "https://netobserv-webhook-service.netobserv.svc:443/convert?timeout=30s": no endpoints available for service "netobserv-webhook-service"
...
2024-09-17T11:59:51.723Z INFO migration completed {"crdGroup": "flowcollectors.flows.netobserv.io"}
2024-09-17T11:59:51.723Z INFO migrating group version {"crdGroup": "flowmetrics.flows.netobserv.io"}
2024-09-17T11:59:51.807Z INFO migration completed {"crdGroup": "flowmetrics.flows.netobserv.io"}
Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.
- [x] Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
- [ ] Does this PR require product documentation?
- [ ] If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
- [x] Does this PR require a product release notes entry?
- [ ] If so, fill in "Release Note Text" in the JIRA.
- [ ] Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
- [ ] If so, make sure it is described in the JIRA ticket.
- QE requirements (check 1 from the list):
- [x] Standard QE validation, with pre-merge tests unless stated otherwise.
- [ ] Regression tests only (e.g. refactoring with no user-facing change).
- [ ] No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).
@jotak: This pull request references NETOBSERV-1747 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.18.0" version, but no target version was set.
In response to this:
Description
Borrowing and adapting code from Tekton https://github.com/tektoncd/operator/blob/v0.72.0/pkg/reconciler/shared/tektonconfig/upgrade/helper/migrator.go
On startup, the operator will start the migration process, which is basically a noop if resources are already in the correct stored version, or else which triggers a conversion + rewrite by the api server (which itself relies on our conversion webhooks). The CRD status is then updated to only mention the current stored version.
Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.
- [x] Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
- [ ] Does this PR require product documentation?
- [ ] If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
- [x] Does this PR require a product release notes entry?
- [ ] If so, fill in "Release Note Text" in the JIRA.
- [ ] Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
- [ ] If so, make sure it is described in the JIRA ticket.
- QE requirements (check 1 from the list):
- [x] Standard QE validation, with pre-merge tests unless stated otherwise.
- [ ] Regression tests only (e.g. refactoring with no user-facing change).
- [ ] No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: Once this PR has been reviewed and has the lgtm label, please ask for approval from jotak. For more information see the Kubernetes Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
cc @msherif1234 you might need to do something in the same vein in the operator(s) you maintain (ingress firewall, bpfman?)
Codecov Report
Attention: Patch coverage is 84.61538% with 12 lines in your changes missing coverage. Please review.
Project coverage is 66.76%. Comparing base (
efe1407) to head (e352caa). Report is 19 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/migrator/migrator.go | 86.11% | 7 Missing and 3 partials :warning: |
| pkg/manager/manager.go | 66.66% | 1 Missing and 1 partial :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #780 +/- ##
==========================================
+ Coverage 66.57% 66.76% +0.19%
==========================================
Files 75 76 +1
Lines 8726 8804 +78
==========================================
+ Hits 5809 5878 +69
- Misses 2505 2511 +6
- Partials 412 415 +3
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 66.76% <84.61%> (+0.19%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
| Files with missing lines | Coverage Δ | |
|---|---|---|
| pkg/manager/manager.go | 66.66% <66.66%> (ø) |
|
| pkg/migrator/migrator.go | 86.11% <86.11%> (ø) |
@jotak: This pull request references NETOBSERV-1747 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.18.0" version, but no target version was set.
In response to this:
Description
Borrowing and adapting code from Tekton https://github.com/tektoncd/operator/blob/v0.72.0/pkg/reconciler/shared/tektonconfig/upgrade/helper/migrator.go
On startup, the operator will start the migration process, which is basically a noop if resources are already in the correct stored version, or else which triggers a conversion + rewrite by the api server (which itself relies on our conversion webhooks). The CRD status is then updated to only mention the current stored version.
Testing
Testing is not straightforward, here's the steps I did:
# Install an older CRD version where storedVersion was different (you don't need to install the operator, just the CRD is sufficient) git checkout release-1.5 make install oc apply -f config/samples/flows_v1beta1_flowcollector.yaml # Check stored version is v1beta1 oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta1"] # Update the CRD to the new version git checkout main bin/kustomize build config/crd | kubectl apply --server-side --force-conflicts -f - # Check stored version have both v1beta1 and v1beta2 oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta1","v1beta2"] # <-- AT THIS POINT, if nothing is done and later the v1beta1 version is removed, this would break the upgrades # Deploy the operator make deploy VERSION=... USER=... oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta2"]You may also take a look at the operator logs. It will probably show some temporary errors, due to the webhook server not being ready yet when trying migration - the operator is just going to retry with an exponential backoff:
2024-09-17T11:59:40.460Z INFO ensuring stored resources are up to date 2024-09-17T11:59:40.460Z INFO migrating group version {"crdGroup": "flowcollectors.flows.netobserv.io"} W0917 11:59:40.535415 1 reflector.go:561] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:106: failed to list *v1beta2.FlowCollector: conversion webhook for flows.netobserv.io/v1beta1, Kind=FlowCollector failed: Post "https://netobserv-webhook-service.netobserv.svc:443/convert?timeout=30s": no endpoints available for service "netobserv-webhook-service" ... 2024-09-17T11:59:51.723Z INFO migration completed {"crdGroup": "flowcollectors.flows.netobserv.io"} 2024-09-17T11:59:51.723Z INFO migrating group version {"crdGroup": "flowmetrics.flows.netobserv.io"} 2024-09-17T11:59:51.807Z INFO migration completed {"crdGroup": "flowmetrics.flows.netobserv.io"}Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.
- [x] Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
- [ ] Does this PR require product documentation?
- [ ] If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
- [x] Does this PR require a product release notes entry?
- [ ] If so, fill in "Release Note Text" in the JIRA.
- [ ] Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
- [ ] If so, make sure it is described in the JIRA ticket.
- QE requirements (check 1 from the list):
- [x] Standard QE validation, with pre-merge tests unless stated otherwise.
- [ ] Regression tests only (e.g. refactoring with no user-facing change).
- [ ] No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
New changes are detected. LGTM label has been removed.
/retest
@jotak: This pull request references NETOBSERV-1747 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.18.0" version, but no target version was set.
In response to this:
Description
Borrowing and adapting code from Tekton https://github.com/tektoncd/operator/blob/v0.72.0/pkg/reconciler/shared/tektonconfig/upgrade/helper/migrator.go
On startup, the operator will start the migration process, which is basically a noop if resources are already in the correct stored version, or else which triggers a conversion + rewrite by the api server (which itself relies on our conversion webhooks). The CRD status is then updated to only mention the current stored version.
Testing
Testing is not straightforward, here's the steps I did:
# Install an older CRD version where storedVersion was different (you don't need to install the operator, just the CRD is sufficient) git checkout release-1.5 make install oc apply -f config/samples/flows_v1beta1_flowcollector.yaml # Check stored version is v1beta1 oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta1"] # Update the CRD to the new version git checkout main bin/kustomize build config/crd | kubectl apply --server-side --force-conflicts -f - # Check stored version have both v1beta1 and v1beta2 oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta1","v1beta2"] # <-- AT THIS POINT, if nothing is done and later the v1beta1 version is removed, this would break the upgrades # Deploy the operator (this PR) make deploy VERSION=... USER=... oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta2"]You may also take a look at the operator logs. It will probably show some temporary errors, due to the webhook server not being ready yet when trying migration - the operator is just going to retry with an exponential backoff:
2024-09-17T11:59:40.460Z INFO ensuring stored resources are up to date 2024-09-17T11:59:40.460Z INFO migrating group version {"crdGroup": "flowcollectors.flows.netobserv.io"} W0917 11:59:40.535415 1 reflector.go:561] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:106: failed to list *v1beta2.FlowCollector: conversion webhook for flows.netobserv.io/v1beta1, Kind=FlowCollector failed: Post "https://netobserv-webhook-service.netobserv.svc:443/convert?timeout=30s": no endpoints available for service "netobserv-webhook-service" ... 2024-09-17T11:59:51.723Z INFO migration completed {"crdGroup": "flowcollectors.flows.netobserv.io"} 2024-09-17T11:59:51.723Z INFO migrating group version {"crdGroup": "flowmetrics.flows.netobserv.io"} 2024-09-17T11:59:51.807Z INFO migration completed {"crdGroup": "flowmetrics.flows.netobserv.io"}Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.
- [x] Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
- [ ] Does this PR require product documentation?
- [ ] If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
- [x] Does this PR require a product release notes entry?
- [ ] If so, fill in "Release Note Text" in the JIRA.
- [ ] Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
- [ ] If so, make sure it is described in the JIRA ticket.
- QE requirements (check 1 from the list):
- [x] Standard QE validation, with pre-merge tests unless stated otherwise.
- [ ] Regression tests only (e.g. refactoring with no user-facing change).
- [ ] No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/ok-to-test
New images:
- quay.io/netobserv/network-observability-operator:2bd3884
- quay.io/netobserv/network-observability-operator-bundle:v0.0.0-2bd3884
- quay.io/netobserv/network-observability-operator-catalog:v0.0.0-2bd3884
They will expire after two weeks.
To deploy this build:
# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:2bd3884 make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-2bd3884
Or as a Catalog Source:
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-2bd3884
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m
@jotak I tried the above steps but could not see the migration happening. I still see v1beta1 in the storedVersion list. I did capture the controller logs and have attached it below.
@jotak: This pull request references NETOBSERV-1747 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.18.0" version, but no target version was set.
In response to this:
Description
Borrowing and adapting code from Tekton https://github.com/tektoncd/operator/blob/v0.72.0/pkg/reconciler/shared/tektonconfig/upgrade/helper/migrator.go
On startup, the operator will start the migration process, which is basically a noop if resources are already in the correct stored version, or else which triggers a conversion + rewrite by the api server (which itself relies on our conversion webhooks). The CRD status is then updated to only mention the current stored version.
Testing
Testing is not straightforward, here's the steps I did:
# Install an older CRD version where storedVersion was different (you don't need to install the operator, just the CRD is sufficient) git checkout release-1.5 make install oc apply -f config/samples/flows_v1beta1_flowcollector.yaml # Check stored version is v1beta1 oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta1"] # Update the CRD to the new version git checkout main # Assuming your main branch is up-to-date, else please update it bin/kustomize build config/crd | kubectl apply --server-side --force-conflicts -f - # Check stored version have both v1beta1 and v1beta2 oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta1","v1beta2"] # <-- AT THIS POINT, if nothing is done and later the v1beta1 version is removed, this would break the upgrades # Deploy the operator (this PR) make deploy VERSION=... USER=... oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta2"]You may also take a look at the operator logs. It will probably show some temporary errors, due to the webhook server not being ready yet when trying migration - the operator is just going to retry with an exponential backoff:
2024-09-17T11:59:40.460Z INFO ensuring stored resources are up to date 2024-09-17T11:59:40.460Z INFO migrating group version {"crdGroup": "flowcollectors.flows.netobserv.io"} W0917 11:59:40.535415 1 reflector.go:561] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:106: failed to list *v1beta2.FlowCollector: conversion webhook for flows.netobserv.io/v1beta1, Kind=FlowCollector failed: Post "https://netobserv-webhook-service.netobserv.svc:443/convert?timeout=30s": no endpoints available for service "netobserv-webhook-service" ... 2024-09-17T11:59:51.723Z INFO migration completed {"crdGroup": "flowcollectors.flows.netobserv.io"} 2024-09-17T11:59:51.723Z INFO migrating group version {"crdGroup": "flowmetrics.flows.netobserv.io"} 2024-09-17T11:59:51.807Z INFO migration completed {"crdGroup": "flowmetrics.flows.netobserv.io"}Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.
- [x] Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
- [ ] Does this PR require product documentation?
- [ ] If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
- [x] Does this PR require a product release notes entry?
- [ ] If so, fill in "Release Note Text" in the JIRA.
- [ ] Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
- [ ] If so, make sure it is described in the JIRA ticket.
- QE requirements (check 1 from the list):
- [x] Standard QE validation, with pre-merge tests unless stated otherwise.
- [ ] Regression tests only (e.g. refactoring with no user-facing change).
- [ ] No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
New images:
- quay.io/netobserv/network-observability-operator:eddd6d8
- quay.io/netobserv/network-observability-operator-bundle:v0.0.0-eddd6d8
- quay.io/netobserv/network-observability-operator-catalog:v0.0.0-eddd6d8
They will expire after two weeks.
To deploy this build:
# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:eddd6d8 make deploy
# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-eddd6d8
Or as a Catalog Source:
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
name: netobserv-dev
namespace: openshift-marketplace
spec:
sourceType: grpc
image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-eddd6d8
displayName: NetObserv development catalog
publisher: Me
updateStrategy:
registryPoll:
interval: 1m
@jotak: This pull request references NETOBSERV-1747 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.18.0" version, but no target version was set.
In response to this:
Description
Borrowing and adapting code from Tekton https://github.com/tektoncd/operator/blob/v0.72.0/pkg/reconciler/shared/tektonconfig/upgrade/helper/migrator.go
On startup, the operator will start the migration process, which is basically a noop if resources are already in the correct stored version, or else which triggers a conversion + rewrite by the api server (which itself relies on our conversion webhooks). The CRD status is then updated to only mention the current stored version.
Testing
Testing is not straightforward, here's the steps I did:
# Install an older CRD version where storedVersion was different (you don't need to install the operator, just the CRD is sufficient) git checkout release-1.5 make install oc apply -f config/samples/flows_v1beta1_flowcollector.yaml # Check stored version is v1beta1 oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta1"] # Update the CRD to the new version git checkout main # Assuming your main branch is up-to-date, else please update it bin/kustomize build config/crd | kubectl apply --server-side --force-conflicts -f - # Check stored version have both v1beta1 and v1beta2 oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta1","v1beta2"] # <-- AT THIS POINT, if nothing is done and later the v1beta1 version is removed, this would break the upgrades # Deploy the operator (this PR) git checkout (this PR) make deploy VERSION=... USER=... oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta2"]You may also take a look at the operator logs. It will probably show some temporary errors, due to the webhook server not being ready yet when trying migration - the operator is just going to retry with an exponential backoff:
2024-09-17T11:59:40.460Z INFO ensuring stored resources are up to date 2024-09-17T11:59:40.460Z INFO migrating group version {"crdGroup": "flowcollectors.flows.netobserv.io"} W0917 11:59:40.535415 1 reflector.go:561] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:106: failed to list *v1beta2.FlowCollector: conversion webhook for flows.netobserv.io/v1beta1, Kind=FlowCollector failed: Post "https://netobserv-webhook-service.netobserv.svc:443/convert?timeout=30s": no endpoints available for service "netobserv-webhook-service" ... 2024-09-17T11:59:51.723Z INFO migration completed {"crdGroup": "flowcollectors.flows.netobserv.io"} 2024-09-17T11:59:51.723Z INFO migrating group version {"crdGroup": "flowmetrics.flows.netobserv.io"} 2024-09-17T11:59:51.807Z INFO migration completed {"crdGroup": "flowmetrics.flows.netobserv.io"}Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.
- [x] Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
- [ ] Does this PR require product documentation?
- [ ] If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
- [x] Does this PR require a product release notes entry?
- [ ] If so, fill in "Release Note Text" in the JIRA.
- [ ] Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
- [ ] If so, make sure it is described in the JIRA ticket.
- QE requirements (check 1 from the list):
- [x] Standard QE validation, with pre-merge tests unless stated otherwise.
- [ ] Regression tests only (e.g. refactoring with no user-facing change).
- [ ] No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@jotak I tried the above steps but could not see the migration happening. I still see v1beta1 in the storedVersion list. I did capture the controller logs and have attached it below.
@Amoghrd OK there was 2 issues:
- first this PR had to be rebased (it's 2 months old!) so deploying the operator based on main's CRD was generating errors
- second, something was missing in my test steps, because this PR includes RBAC changes it needs to be deployed from the PR itself; so I added
git checkout (this PR)in the test steps.
@jotak: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-operator | eeb6e46bfa1473726b56b8ef62c0987c6ae908f7 | link | false | /test e2e-operator |
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
That helped thanks Joel! /label qe-approved
@jotak: This pull request references NETOBSERV-1747 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.18.0" version, but no target version was set.
In response to this:
Description
Borrowing and adapting code from Tekton https://github.com/tektoncd/operator/blob/v0.72.0/pkg/reconciler/shared/tektonconfig/upgrade/helper/migrator.go
On startup, the operator will start the migration process, which is basically a noop if resources are already in the correct stored version, or else which triggers a conversion + rewrite by the api server (which itself relies on our conversion webhooks). The CRD status is then updated to only mention the current stored version.
Testing
Testing is not straightforward, here's the steps I did:
# Install an older CRD version where storedVersion was different (you don't need to install the operator, just the CRD is sufficient) git checkout release-1.5 make install oc apply -f config/samples/flows_v1beta1_flowcollector.yaml # Check stored version is v1beta1 oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta1"] # Update the CRD to the new version git checkout main # Assuming your main branch is up-to-date, else please update it bin/kustomize build config/crd | kubectl apply --server-side --force-conflicts -f - # Check stored version have both v1beta1 and v1beta2 oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta1","v1beta2"] # <-- AT THIS POINT, if nothing is done and later the v1beta1 version is removed, this would break the upgrades # Deploy the operator (this PR) git checkout (this PR) make deploy VERSION=... USER=... oc get crd flowcollectors.flows.netobserv.io -ojsonpath='{.status.storedVersions}' ["v1beta2"]You may also take a look at the operator logs. It will probably show some temporary errors, due to the webhook server not being ready yet when trying migration - the operator is just going to retry with an exponential backoff:
2024-09-17T11:59:40.460Z INFO ensuring stored resources are up to date 2024-09-17T11:59:40.460Z INFO migrating group version {"crdGroup": "flowcollectors.flows.netobserv.io"} W0917 11:59:40.535415 1 reflector.go:561] sigs.k8s.io/controller-runtime/pkg/cache/internal/informers.go:106: failed to list *v1beta2.FlowCollector: conversion webhook for flows.netobserv.io/v1beta1, Kind=FlowCollector failed: Post "https://netobserv-webhook-service.netobserv.svc:443/convert?timeout=30s": no endpoints available for service "netobserv-webhook-service" ... 2024-09-17T11:59:51.723Z INFO migration completed {"crdGroup": "flowcollectors.flows.netobserv.io"} 2024-09-17T11:59:51.723Z INFO migrating group version {"crdGroup": "flowmetrics.flows.netobserv.io"} 2024-09-17T11:59:51.807Z INFO migration completed {"crdGroup": "flowmetrics.flows.netobserv.io"}Dependencies
n/a
Checklist
If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.
- [x] Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
- [ ] Does this PR require product documentation?
- [ ] If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
- [x] Does this PR require a product release notes entry?
- [ ] If so, fill in "Release Note Text" in the JIRA.
- [ ] Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
- [ ] If so, make sure it is described in the JIRA ticket.
- QE requirements (check 1 from the list):
- [x] Standard QE validation, with pre-merge tests unless stated otherwise.
- [ ] Regression tests only (e.g. refactoring with no user-facing change).
- [ ] No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.