oracle-database-operator

ORDS replicaset and service not reconciled

Open tenstad opened this issue 1 year ago • 4 comments

When I change a field such as ordsImage in a CDB resource, the change is not reflected in the ReplicaSet. And if I delete the ReplicaSet, the operator does not create a new one. I would expect the operator to watch both the CDB and the ReplicaSet and, whenever either one changes, ensure that a ReplicaSet exists in a state matching what the CDB describes. The same goes for the ORDS Service resource.

The reconcile when the ReplicaSet existed:

2024-02-14T15:49:07Z INFO controllers.CDB Reconcile requested {"multitenantoperator": "oracle-database-operator-system/cdb-dev"}
2024-02-14T15:49:07Z INFO controllers.CDB Res Status: {"multitenantoperator": "oracle-database-operator-system/cdb-dev", "Name": "cdb-dev", "Phase": "Ready", "Status": "true"}
2024-02-14T15:49:07Z INFO controllers.CDB Reconcile completed {"multitenantoperator": "oracle-database-operator-system/cdb-dev"}
2024-02-14T15:49:07Z INFO controllers.CDB DEFER {"multitenantoperator": "oracle-database-operator-system/cdb-dev", "Name": "cdb-dev", "Phase": "Ready", "Status": "true"}

A reconcile after deleting the ReplicaSet:

2024-02-14T15:58:23Z INFO controllers.CDB Reconcile requested {"multitenantoperator": "oracle-database-operator-system/cdb-dev"}
2024-02-14T15:58:23Z INFO controllers.CDB Res Status: {"multitenantoperator": "oracle-database-operator-system/cdb-dev", "Name": "cdb-dev", "Phase": "Ready", "Status": "true"}
2024-02-14T15:58:23Z INFO controllers.CDB DEFER {"multitenantoperator": "oracle-database-operator-system/cdb-dev", "Name": "cdb-dev", "Phase": "Ready", "Status": "true"}
2024-02-14T15:58:23Z INFO Observed a panic in reconciler: runtime error: index out of range [0] with length 0 {"controller": "cdb", "controllerGroup": "database.oracle.com", "controllerKind": "CDB", "CDB": {"name":"cdb-dev","namespace":"oracle-database-operator-system"}, "namespace": "oracle-database-operator-system", "name": "cdb-dev", "reconcileID": "7bac0cef-013c-4588-9804-c1f105164054"}
panic: runtime error: index out of range [0] with length 0 [recovered]
panic: runtime error: index out of range [0] with length 0
goroutine 974 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:118 +0x1f4
panic({0x1f26900, 0xc003cd6060})
/usr/local/go/src/runtime/panic.go:884 +0x212
github.com/oracle/oracle-database-operator/controllers/database.(*CDBReconciler).evaluateSpecChange(0xc0003dacd0, {0x23adeb8, 0xc003cdd3e0}, {{{0xc0007c4a00, 0x1f}, {0xc0009ff9f6, 0x7}}}, 0xc001753b80)
/workspace/controllers/database/cdb_controller.go:638 +0xe8f
github.com/oracle/oracle-database-operator/controllers/database.(*CDBReconciler).Reconcile(0xc0003dacd0, {0x23adeb8, 0xc003cdd3e0}, {{{0xc0007c4a00, 0x1f}, {0xc0009ff9f6, 0x7}}})
/workspace/controllers/database/cdb_controller.go:156 +0x57a
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x23ade10?, {0x23adeb8?, 0xc003cdd3e0?}, {{{0xc0007c4a00?, 0x1f8b6e0?}, {0xc0009ff9f6?, 0xc000762dd0?}}})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:121 +0xc8
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc000193040, {0x23ade10, 0xc0008b0600}, {0x1e1d180?, 0xc00060e560?})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:320 +0x33c
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc000193040, {0x23ade10, 0xc0008b0600})
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:273 +0x1d9
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:234 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:230 +0x333

Feb 14 '24 15:02 tenstad

Can you please specify the exact steps to reproduce this issue? Thanks.

Apr 23 '24 14:04 mmalvezz

I don't have the operator installed anymore, but it can easily be reproduced:

  1. Create a CDB with an ordsImage, then update the ordsImage.
  2. Create a CDB, then find the ReplicaSet it just created and delete that ReplicaSet.

Apr 28 '24 12:04 tenstad

Please give us much more detail about step 1. Why do you need to update ordsImage? It is something static (designed for PDB management only), so there should be no need to update it. Did you come across some ordsImage issue?

Apr 29 '24 09:04 mmalvezz

Sorry, it's not a two-step process, but two separate examples. And no, I do not really need to update the image; it's just an example field to illustrate the lack of reconciliation.

When I have a resource (CDB) in Kubernetes describing the state I want, I expect the actual state to eventually equal the desired one. That includes changes to my desired state (CDB), which should in turn update the actual state.

In this case, I expect the controller to ensure that the ReplicaSet always complies with the description in the CDB. It should reflect any changes I make to the CDB resource, like updating the image field. If such behavior is not wanted, I should be unable to change the image field after creating the CDB.

In the other case, when I "accidentally" delete the ReplicaSet, it should immediately be recreated. By creating the CDB resource I have said that I want that thing to exist, meaning the controller should do what it can for me to have it.

The current create-and-forget approach is very fragile, and I would suggest implementing watches for both the CDB and all resources it creates. If any of them changes, the controller needs to act and ensure that all the needed resources still comply with all fields of the CDB.
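
To make the expected behavior concrete, here is a minimal sketch of an idempotent reconcile function covering both examples from this thread (a changed ordsImage and a deleted ReplicaSet). It uses an in-memory map in place of the API server, and the CDBSpec, ReplicaSet, Cluster, and Reconcile names are hypothetical simplifications, not the operator's types.

```go
package main

import "fmt"

// CDBSpec is a hypothetical, trimmed-down desired state.
type CDBSpec struct {
	Name      string
	OrdsImage string
	Replicas  int
}

// ReplicaSet is a hypothetical stand-in for the child object.
type ReplicaSet struct {
	Image    string
	Replicas int
}

// Cluster is an in-memory stand-in for the API server, keyed by CDB name.
type Cluster struct {
	ReplicaSets map[string]*ReplicaSet
}

// Reconcile drives the actual state toward spec and reports what it did.
// It is idempotent: running it again without changes is a no-op.
func Reconcile(c *Cluster, spec CDBSpec) string {
	rs, ok := c.ReplicaSets[spec.Name]
	if !ok {
		// Child is missing (never created, or deleted): create it.
		c.ReplicaSets[spec.Name] = &ReplicaSet{Image: spec.OrdsImage, Replicas: spec.Replicas}
		return "created"
	}
	if rs.Image != spec.OrdsImage || rs.Replicas != spec.Replicas {
		// Child drifted from the spec: bring it back in line.
		rs.Image, rs.Replicas = spec.OrdsImage, spec.Replicas
		return "updated"
	}
	return "unchanged"
}

func main() {
	c := &Cluster{ReplicaSets: map[string]*ReplicaSet{}}
	spec := CDBSpec{Name: "cdb-dev", OrdsImage: "ords:v1", Replicas: 1}

	fmt.Println(Reconcile(c, spec)) // created
	spec.OrdsImage = "ords:v2"
	fmt.Println(Reconcile(c, spec)) // updated
	delete(c.ReplicaSets, spec.Name)
	fmt.Println(Reconcile(c, spec)) // created
	fmt.Println(Reconcile(c, spec)) // unchanged
}
```

In a controller-runtime based operator, the triggering side of this is typically handled by registering the child types with the builder's Owns(...) method, which queues a reconcile of the owning resource whenever an owned object changes or is deleted. That relies on owner references being set on the ReplicaSet and Service; whether the operator sets them is not something this thread shows.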

May 06 '24 19:05 tenstad