oracle-database-operator Creating Single Instance Database Clone fails

Creating Single Instance Database Clone fails with the following error:

[2023:12:05 08:44:20]: Acquiring lock .ORAW.create_lck with heartbeat 30 secs
[2023:12:05 08:44:20]: Lock acquired
[2023:12:05 08:44:20]: Starting heartbeat
[2023:12:05 08:44:20]: Lock held .ORAW.create_lck
ORACLE EDITION: ENTERPRISE
[WARNING] [DBT-11217] Unable to check available shared memory on specified node(s) ([10]).
Prepare for db operation
[FATAL] [DBT-06006] Unable to create directory: (/opt/oracle/oradata/ORAW).
   CAUSE: Proper permissions are not granted to create the directory or there is no space left in the volume.
[ 2023-12-05 08:44:32.566 UTC ] [WARNING] [DBT-11217] Unable to check available shared memory on specified node(s) ([10]).
[ 2023-12-05 08:46:43.404 UTC ] Prepare for db operation
[ 2023-12-05 08:46:43.473 UTC ] [FATAL] [DBT-06006] Unable to create directory: (/opt/oracle/oradata/ORAW).

LSNRCTL for Linux: Version 21.0.0.0.0 - Production on 05-DEC-2023 08:46:43

Copyright (c) 1991, 2021, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
TNS-12541: TNS:no listener
 TNS-12560: TNS:protocol adapter error
  TNS-00511: No listener
   Linux Error: 111: Connection refused

The configuration file was based on https://github.com/oracle/oracle-database-operator/blob/main/config/samples/sidb/singleinstancedatabase_clone.yaml; I only updated name, namespace, sid and cloneFrom.
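
For readers following along, here is a minimal sketch of what such a clone manifest might look like. It is not the reporter's actual file (that was attached, not quoted). The field names and values are inferred from the describe output later in this thread, and the helper name write_clone_manifest and its argument order are hypothetical.

```shell
# Sketch: print a SingleInstanceDatabase clone manifest to stdout.
# Field names are inferred from the describe output in this thread;
# write_clone_manifest and its arguments are hypothetical.
write_clone_manifest() {
  wcm_name=$1     # metadata.name of the clone, e.g. wisla
  wcm_sid=$2      # Oracle SID of the clone, e.g. ORAW
  wcm_source=$3   # name of the SingleInstanceDatabase to clone, e.g. zywiec
  wcm_size=$4     # size of the persistent volume, e.g. 100Gi
  cat <<EOF
apiVersion: database.oracle.com/v1alpha1
kind: SingleInstanceDatabase
metadata:
  name: ${wcm_name}
  namespace: oracle-database
spec:
  sid: ${wcm_sid}
  cloneFrom: ${wcm_source}
  adminPassword:
    secretName: db-admin-secret
    secretKey: oracle_pwd
    keepSecret: true
  image:
    pullFrom: container-registry.oracle.com/database/enterprise:latest
    pullSecrets: oracle-container-registry-secret
  persistence:
    size: ${wcm_size}
    storageClass: gp2
    accessMode: ReadWriteOnce
  replicas: 1
EOF
}
```

Something like `write_clone_manifest wisla ORAW zywiec 100Gi | kubectl -n oracle-database apply -f -` would then submit it. Keeping the volume size as a parameter makes it easy to retry with a larger volume.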

Before attempting to create the clone I had one Primary and two Physical standby instances running without issues:

kubectl -n oracle-database get singleinstancedatabase
NAME      EDITION      STATUS    ROLE               VERSION      CONNECT STR             TCPS CONNECT STR   OEM EXPRESS URL
szczyrk   Enterprise   Healthy   PHYSICAL_STANDBY   21.3.0.0.0   10.1.2.46:32533/ORAS    Unavailable        https://10.1.2.46:32428/em
ustron    Enterprise   Healthy   PHYSICAL_STANDBY   21.3.0.0.0   10.1.2.46:31519/ORAU    Unavailable        https://10.1.2.46:31105/em
zywiec    Enterprise   Healthy   PRIMARY            21.3.0.0.0   10.1.2.46:31761/ORAZ    Unavailable        https://10.1.2.46:30398/em

After running kubectl apply with singleinstancedatabase_clone.yaml:

kubectl -n oracle-database get singleinstancedatabase
NAME      EDITION      STATUS     ROLE               VERSION       CONNECT STR             TCPS CONNECT STR   OEM EXPRESS URL
szczyrk   Enterprise   Healthy    PHYSICAL_STANDBY   21.3.0.0.0    10.1.2.46:32407/ORAS    Unavailable        https://10.1.2.46:31656/em
ustron    Enterprise   Healthy    PHYSICAL_STANDBY   21.3.0.0.0    10.1.2.46:32678/ORAU    Unavailable        https://10.1.2.46:31596/em
wisla     Enterprise   Creating   Unavailable        Unavailable   10.1.3.159:32557/ORAW   Unavailable        Unavailable
zywiec    Enterprise   Healthy    PRIMARY            21.3.0.0.0    10.1.1.7:32193/ORAZ     Unavailable        https://10.1.1.7:31245/em

kubectl -n oracle-database get pods
NAME            READY   STATUS             RESTARTS      AGE
szczyrk-wx4zj   1/1     Running            0             161m
ustron-9ifi2    1/1     Running            0             154m
wisla-mtqfs     0/1     CrashLoopBackOff   21 (4m ago)   139m
zywiec-vlf14    1/1     Running            0             3h17m

kubectl -n oracle-database get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
szczyrk   Bound    pvc-ce59ebf8-11b3-4b90-8493-3a9bcd1a8d06   10Gi       RWO            gp2            27m
ustron    Bound    pvc-55696dbb-714a-4c20-8bbb-8ba85314cf8a   10Gi       RWO            gp2            20m
wisla     Bound    pvc-283d044d-74f4-4289-b3b7-4cd769e35bfa   10Gi       RWO            gp2            5m13s
zywiec    Bound    pvc-3f4b83cb-d861-4963-8bae-5d7badf1eca6   10Gi       RWO            gp2            63m

Environment:

AWS EKS 1.25
StorageClass gp2, provisioner kubernetes.io/aws-ebs
oracle-database-operator 1.0.0

Dec 05 '23 11:12 andbos

Hi @andbos, you are cloning the primary SIDB, right, and not one of the two standby SIDBs? If so, please also share the operator pod logs.

Dec 22 '23 11:12 IshaanDesai45

Hi,

Yes, I cloned the primary instance. I tried again, now with only one instance in total (a Primary) prior to the clone operation.

$ kubectl --kubeconfig ~/.kube/config-sinch-op-smsf-1-andbos -n oracle-database get singleinstancedatabase
NAME     EDITION      STATUS    ROLE      VERSION      CONNECT STR             TCPS CONNECT STR   OEM EXPRESS URL
zywiec   Enterprise   Healthy   PRIMARY   21.3.0.0.0   10.1.3.159:31170/ORAZ   Unavailable        https://10.1.3.159:30994/em

$ kubectl -n oracle-database get pvc
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
zywiec   Bound    pvc-37535c5d-f6df-489f-b91b-ed3656bd6e8e   50Gi       RWO            gp2            8d

$ kubectl -n oracle-database get pods
NAME           READY   STATUS    RESTARTS   AGE
zywiec-vpxps   1/1     Running   0          8d

$ kubectl -n oracle-database apply -f ~/oracle/singleinstancedatabase_clone.yaml
singleinstancedatabase.database.oracle.com/wisla created

$ kubectl -n oracle-database get pods
NAME           READY   STATUS    RESTARTS      AGE
wisla-wnnrh    0/1     Running   1 (72s ago)   3m54s
zywiec-vpxps   1/1     Running   0             8d

$ kubectl -n oracle-database get pvc
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
wisla    Bound    pvc-1cec290e-f461-4b66-81fc-e0069a22480b   50Gi       RWO            gp2            3m58s
zywiec   Bound    pvc-37535c5d-f6df-489f-b91b-ed3656bd6e8e   50Gi       RWO            gp2            8d

$ kubectl -n oracle-database get singleinstancedatabase
NAME     EDITION      STATUS     ROLE          VERSION       CONNECT STR             TCPS CONNECT STR   OEM EXPRESS URL
wisla    Enterprise   Creating   Unavailable   Unavailable   10.1.1.7:30200/ORAW     Unavailable        Unavailable
zywiec   Enterprise   Healthy    PRIMARY       21.3.0.0.0    10.1.3.159:31170/ORAZ   Unavailable        https://10.1.3.159:30994/em

$ kubectl -n oracle-database logs -f wisla-wnnrh
Defaulted container "wisla" out of: wisla, init-permissions (init), init-wallet (init)
[2023:12:22 12:39:46]: Acquiring lock .ORAW.create_lck with heartbeat 30 secs
[2023:12:22 12:39:46]: Lock acquired
[2023:12:22 12:39:46]: Starting heartbeat
[2023:12:22 12:39:46]: Lock held .ORAW.create_lck
ORACLE EDITION: ENTERPRISE
[WARNING] [DBT-11217] Unable to check available shared memory on specified node(s) ([10]).
Prepare for db operation
[FATAL] [DBT-06006] Unable to create directory: (/opt/oracle/oradata/ORAW).
   CAUSE: Proper permissions are not granted to create the directory or there is no space left in the volume.
[ 2023-12-22 12:39:57.762 UTC ] [WARNING] [DBT-11217] Unable to check available shared memory on specified node(s) ([10]).
[ 2023-12-22 12:42:07.441 UTC ] Prepare for db operation
[ 2023-12-22 12:42:07.506 UTC ] [FATAL] [DBT-06006] Unable to create directory: (/opt/oracle/oradata/ORAW).

LSNRCTL for Linux: Version 21.0.0.0.0 - Production on 22-DEC-2023 12:42:07

Copyright (c) 1991, 2021, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
TNS-12541: TNS:no listener
 TNS-12560: TNS:protocol adapter error
  TNS-00511: No listener
   Linux Error: 111: Connection refused
$ date
Fri Dec 22 13:42:27 CET 2023
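
The lines that matter in the log above are the DBCA fatal error, its CAUSE line and the TNS errors. A small filter such as the following sketch can pull them out of a long log. The helper name db_errors is hypothetical, and the pattern only covers the message types seen in this thread.

```shell
# Sketch: keep only DBCA fatal lines, their CAUSE lines and TNS errors
# from pod logs read on stdin. The helper name db_errors is hypothetical.
db_errors() {
  grep -E '\[FATAL\]|CAUSE:|TNS-[0-9]+'
}

# Usage against a live cluster (pod name taken from the output above):
#   kubectl -n oracle-database logs wisla-wnnrh | db_errors
# For a container that has already restarted, read the previous run:
#   kubectl -n oracle-database logs --previous wisla-wnnrh | db_errors
```

Here the TNS "no listener" errors are most likely a consequence of the earlier failure: the database was never created, so there is nothing listening on port 1521.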

oracle-database-operator-controller-manager_log.txt singleinstancedatabase_clone.yaml.txt singleinstancedatabase.yaml.txt

I have uploaded the operator-controller-manager pod logs (taken with stern, so logs from all three pods are present) and my YAML files.

Best regards, Andreas

Dec 22 '23 12:12 andbos

@andbos, please check whether /opt/oracle/oradata, the directory backed by the mounted volume, has enough free space. If not, try expanding the volume or using a new, larger volume.
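
One possible way to run that check is sketched below. It assumes df is available in the database container and that the container stays up long enough to exec into it. The helper names used_pct and is_nearly_full are hypothetical.

```shell
# Sketch: read `df -P` output on stdin and print the use percentage of the
# first listed filesystem (the "Capacity" column), without the % sign.
# The helper names used_pct and is_nearly_full are hypothetical.
used_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Exit 0 when the usage read from stdin is at or above the threshold in $1.
is_nearly_full() {
  pct="$(used_pct)"
  [ -n "$pct" ] && [ "$pct" -ge "$1" ]
}

# Usage against a live cluster (pod name taken from this thread):
#   kubectl -n oracle-database exec wisla-wnnrh -- df -P /opt/oracle/oradata | used_pct
# Check whether the StorageClass permits expanding an existing volume:
#   kubectl get storageclass gp2 -o jsonpath='{.allowVolumeExpansion}'
```

A PVC can only be expanded in place if its StorageClass has allowVolumeExpansion set to true. Otherwise the practical route is to recreate the resource with a larger persistence size.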

Jan 30 '24 09:01 IshaanDesai45

Hi,

Thanks. It turns out I had assigned insufficient storage (I was trying to save money during testing). I doubled the storage to 100Gi, and after that the clone was created successfully. It is a bit strange that the initial primary could be created with as little as 10Gi but a clone could not. What is the minimum size?

$ kubectl -n oracle-database apply -f ~/oracle/singleinstancedatabase_clone.yaml
singleinstancedatabase.database.oracle.com/wisla created

$ kubectl -n oracle-database get pods
NAME           READY   STATUS    RESTARTS   AGE
wisla-ly4z2    1/1     Running   0          14m
zywiec-q14pe   1/1     Running   0          26m

$ kubectl -n oracle-database get singleinstancedatabase
NAME     EDITION      STATUS    ROLE      VERSION      CONNECT STR            TCPS CONNECT STR   OEM EXPRESS URL
wisla    Enterprise   Healthy   PRIMARY   21.3.0.0.0   10.1.2.46:30531/ORAW   Unavailable        https://10.1.2.46:30474/em
zywiec   Enterprise   Healthy   PRIMARY   21.3.0.0.0   10.1.2.46:32102/ORAZ   Unavailable        https://10.1.2.46:31749/em

From the output above it is impossible to tell which instance is the original primary and which is the clone. Maybe that could be highlighted somehow in a future release? As far as I understand, the clone is like a snapshot or backup taken at a specific point in time: there is no ongoing syncing, and the two primary instances are not clustered.

Best regards, Andreas

Feb 02 '24 09:02 andbos

Aha, now I see that you can determine which instance is a clone by checking Clone From under Spec and Status in the describe singleinstancedatabase output.

$ kubectl --kubeconfig ~/.kube/config-sinch-op-smsf-1-andbos -n oracle-database describe singleinstancedatabase wisla
Name:         wisla
Namespace:    oracle-database
Labels:       <none>
Annotations:  <none>
API Version:  database.oracle.com/v1alpha1
Kind:         SingleInstanceDatabase
Metadata:
  Creation Timestamp:  2024-02-02T08:57:59Z
  Finalizers:
    database.oracle.com/singleinstancedatabasefinalizer
  Generation:        1
  Resource Version:  156829610
  UID:               3a3ec8cc-302e-4320-aabc-22f2feebc918
Spec:
  Admin Password:
    Keep Secret:  true
    Secret Key:   oracle_pwd
    Secret Name:  db-admin-secret
  Clone From:     zywiec
  Image:
    Pull From:     container-registry.oracle.com/database/enterprise:latest
    Pull Secrets:  oracle-container-registry-secret
  Init Params:
  Pdb Name:  ORCLPDB1
  Persistence:
    Access Mode:    ReadWriteOnce
    Size:           100Gi
    Storage Class:  gp2
  Replicas:         1
  Sid:              ORAW
Status:
  Archive Log:             false
  Clone From:              zywiec

Feb 02 '24 09:02 andbos

@andbos, closing this issue, as it appears to have been resolved by expanding the block volume attached to the SIDB pod.

Feel free to reopen or reach out if the problem persists.

Jun 18 '24 10:06 IshaanDesai45
