oracle-database-operator

DG Broker doesn't detect that primary database is down

andbos opened this issue 8 months ago • 8 comments (status: Open)

Hi,

It seems the DG Broker is not able to detect that the primary database is down; its status stays Healthy the whole time.

The standby detected that the primary is down:

 rfs (PID:2778): Possible network disconnect with primary database [krsv.c:4855]
 rfs (PID:2778): while processing B-1171608090.T-1.S-21 [krsv.c:4861]
2024-06-14T10:18:55.090551+00:00


***********************************************************************

Fatal NI connect error 12541, connecting to:
 (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=db11)(INSTANCE_NAME=DB11)(CID=(PROGRAM=oracle)(HOST=sinchdb12-dwake)(USER=oracle))(CONNECTION_ID=Gtegq8X5CdTgYxoCAQoNsQ==))(ADDRESS=(PROTOCOL=tcp)(HOST=172.20.114.142)(PORT=1521)))

  VERSION INFORMATION:
        TNS for Linux: Version 21.0.0.0.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 21.0.0.0.0 - Production
  Version 21.3.0.0.0
  Time: 14-JUN-2024 10:18:55
  Tracing not turned on. Process Id = 2516
  Tns error struct:
    ns main err code: 12541

TNS-12541: TNS:no listener
    ns secondary err code: 12560
    nt main err code: 511

TNS-00511: No listener
    nt secondary err code: 111
    nt OS err code: 0


***********************************************************************

But the DG Broker did not, even though the status of the primary is Pending:

$ date
Fri Jun 14 12:41:20 CEST 2024

$ kubectl -n oracle-database get pods
NAME               READY   STATUS     RESTARTS   AGE
db11-nfwnl         0/1     Init:1/2   0          23m
db12-dwake         1/1     Running    0          54m

$ kubectl -n oracle-database get singleinstancedatabase
NAME         EDITION      STATUS    ROLE               VERSION      CONNECT STR                 TCPS CONNECT STR   OEM EXPRESS URL
db11         Enterprise   Pending   PRIMARY            21.3.0.0.0   10.1.1.161:32480/DB11       Unavailable        https://10.1.1.161:30473/em
db12         Enterprise   Healthy   PHYSICAL_STANDBY   21.3.0.0.0   10.1.2.200:30739/DB12       Unavailable        https://10.1.2.200:30875/em

$ kubectl -n oracle-database get dataguardbroker
NAME                 PRIMARY   STANDBYS   PROTECTION MODE   CONNECT STR                  STATUS
dataguardbroker-db   DB11      DB12       MaxAvailability   10.1.1.161:31036/DATAGUARD   Healthy

$ kubectl -n oracle-database describe dataguardbroker
Name:         dataguardbroker-db
Namespace:    oracle-database
Labels:       <none>
Annotations:  <none>
API Version:  database.oracle.com/v1alpha1
Kind:         DataguardBroker
Metadata:
  Creation Timestamp:  2024-06-14T09:55:59Z
  Finalizers:
    database.oracle.com/dataguardbrokerfinalizer
  Generation:        1
  Resource Version:  94229431
  UID:               d5703585-503c-46f6-be34-9a3bdb3a40df
Spec:
  Fast Start Fail Over:
  Primary Database Ref:     db11
  Protection Mode:          MaxAvailability
  Set As Primary Database:  DB11
  Standby Database Refs:
    db12
Status:
  Cluster Connect String:   dataguardbroker-db.oracle-database:1521/DATAGUARD
  External Connect String:  10.1.1.161:31036/DATAGUARD
  Primary Database:         DB11
  Primary Database Ref:     db11
  Protection Mode:          MaxAvailability
  Standby Databases:        DB12
  Status:                   Healthy
Events:
  Type    Reason                       Age   From             Message
  ----    ------                       ----  ----             -------
  Normal  DG Configuration up to date  45m   DataguardBroker
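
The mismatch visible in the output above (primary Pending, broker Healthy) can be checked mechanically. The following is only a rough sketch, not part of the operator: the function names `get_status` and `check_broker_vs_primary` are made up for illustration, and it assumes both custom resources expose their state at `.status.status` (the describe output shows this for the DataguardBroker; for the SingleInstanceDatabase it is an assumption).

```shell
#!/bin/sh
# Sketch: flag the case where the primary is not Healthy but the broker still is.
# Assumption: both resources expose their state at .status.status.

NS="${NS:-oracle-database}"

# Print the .status.status field of one resource (hypothetical helper).
get_status() {
    kubectl -n "$NS" get "$1" "$2" -o jsonpath='{.status.status}'
}

# Print a warning and return 1 if the broker reports Healthy
# while the primary database does not; otherwise return 0.
check_broker_vs_primary() {
    primary_status=$(get_status singleinstancedatabase "$1")
    broker_status=$(get_status dataguardbroker "$2")
    if [ "$broker_status" = "Healthy" ] && [ "$primary_status" != "Healthy" ]; then
        echo "MISMATCH: primary $1 is '$primary_status' but broker $2 is '$broker_status'"
        return 1
    fi
    echo "OK: primary $1 is '$primary_status', broker $2 is '$broker_status'"
}

# Example call for this setup:
#   check_broker_vs_primary db11 dataguardbroker-db
```

With the resources in the state shown above, the example call would be expected to report a mismatch, since db11 is Pending while dataguardbroker-db is Healthy.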

Setup: one primary singleinstancedatabase and one standby singleinstancedatabase, both using image enterprise:21.3.0.0. OraOperator version: 1.1.0.

Best regards, Andreas

andbos, Jun 14 '24 11:06