
DR fails atomicity in 7.3.43

Open doublex opened this issue 1 year ago • 5 comments

After updating FDB from 7.3.37 to 7.3.43, disaster recovery (DR) no longer preserves atomicity.

What we did:

fdbdr start --source /path/to/src.cluster --destination /path/to/dst.cluster
[wait until the DR is a complete copy of the primary database]
fdbdr abort --source /path/to/src.cluster --destination /path/to/dst.cluster

Older transactions are fine, but newer transactions are not atomic on the DR clone. For example, a secondary-index entry is present without its primary record.
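To make the symptom concrete, here is a minimal sketch of the kind of check that exposes it: list every secondary-index entry whose primary record is missing. The key layout (an index entry is a pair of indexed value and primary key) and all sample data are assumptions for illustration, not the actual schema, and the function works on plain Python iterables rather than calling FDB.

```python
from typing import Iterable, List, Set, Tuple


def find_orphaned_index_entries(
    index_entries: Iterable[Tuple[str, str]],
    primary_keys: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return index entries whose primary key has no matching primary record.

    Assumed (hypothetical) layout: each index entry is (indexed_value, primary_key).
    """
    known: Set[str] = set(primary_keys)
    return [(value, pk) for value, pk in index_entries if pk not in known]


# Hypothetical snapshot of the DR clone after `fdbdr abort`.
primary_keys = ["user:1", "user:2"]
index_entries = [
    ("alice@example.com", "user:1"),
    ("bob@example.com", "user:2"),
    ("carol@example.com", "user:3"),  # index entry written, primary record missing
]

orphans = find_orphaned_index_entries(index_entries, primary_keys)
print(orphans)
```

If the index entry and the record are written in the same transaction, any non-empty result on the clone means that transaction was only partially applied. Against a real cluster the two iterables would come from range reads over the index and record keyspaces on the DR clone.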

If I remember correctly this did not happen with 7.3.37. Is it safe to downgrade FDB to 7.3.37? If so, I could check.

Best wishes

doublex avatar May 30 '24 09:05 doublex

fdbcli --exec status (primary database)

Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
  Redundancy mode        - single
  Storage engine         - ssd-2
  Log engine             - ssd-2
  Encryption at-rest     - disabled
  Coordinators           - 1
  Desired Commit Proxies - 3
  Desired GRV Proxies    - 1
  Desired Resolvers      - 1
  Desired Logs           - 3
  Usable Regions         - 1

Cluster:
  FoundationDB processes - 1
  Zones                  - 1
  Machines               - 1
  Memory availability    - 24.0 GB per process on machine with least available
  Fault Tolerance        - 0 machines
  Server time            - 05/30/24 11:11:51

Data:
  Replication health     - Healthy
  Moving data            - 0.000 GB
  Sum of key-value sizes - 134.925 GB
  Disk space used        - 168.328 GB

Operating space:
  Storage server         - 735.9 GB free on most full server
  Log server             - 735.9 GB free on most full server

Workload:
  Read rate              - 1156 Hz
  Write rate             - 116 Hz
  Transactions started   - 774 Hz
  Transactions committed - 41 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 1 as primary

Client time: 05/30/24 11:11:51

doublex avatar May 30 '24 09:05 doublex

> Is it safe to downgrade FDB to 7.3.37? If so, I could check.

Yes.

jzhou77 avatar May 30 '24 17:05 jzhou77

@jzhou77 Thanks. v7.3.37 works perfectly. I will upgrade the sender and the recipient step by step up to v7.3.43 to pin down the regression. That's probably all I can contribute.

doublex avatar Jun 02 '24 18:06 doublex

fdb_dr keeps atomicity if the primary-database ("sender") is not v7.3.43.

v7.3.37 -> v7.3.37: works (transactions are atomic)
v7.3.37 -> v7.3.43: works
v7.3.41 -> v7.3.43: works
v7.3.43 -> v7.3.43: secondary-indexes written, primary records missing (some)
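The reasoning behind "the sender is the problem" can be written down as a small sketch. It encodes the four runs reported above and asks which version, on each side, appears only in failing runs. The helper name and the tuple layout are made up for illustration.

```python
# (sender version, recipient version, DR copy was atomic?) as reported above.
results = [
    ("7.3.37", "7.3.37", True),
    ("7.3.37", "7.3.43", True),
    ("7.3.41", "7.3.43", True),
    ("7.3.43", "7.3.43", False),
]


def versions_only_seen_failing(rows, column):
    """Versions in `column` (0 = sender, 1 = recipient) that never appear in a passing run."""
    failing = {row[column] for row in rows if not row[2]}
    passing = {row[column] for row in rows if row[2]}
    return failing - passing


suspect_senders = versions_only_seen_failing(results, 0)
suspect_recipients = versions_only_seen_failing(results, 1)
print(suspect_senders, suspect_recipients)
```

A v7.3.43 recipient also shows up in passing runs, so only the sender side stays suspect. Four runs are a small sample, so this narrows the search rather than proving the cause.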

Is there anything I can do to help?

doublex avatar Jun 05 '24 08:06 doublex

I wonder whether this is a bug fixed by https://github.com/apple/foundationdb/pull/12037, as I didn't find any change between 7.3.37 and 7.3.43 that could cause the problem.

jzhou77 avatar Mar 18 '25 17:03 jzhou77