Unwanted single / double precision conversion during replication from CBL to CB server

Open nblum37 opened this issue 1 year ago • 12 comments

Describe the bug

During replication (which is where I currently assume the issue lies), a double value may be altered if it is exactly representable in single precision (see the example). This breaks hashes that were calculated from the document.

To Reproduce and expected behavior

Steps to reproduce the behavior:

  1. Create a document in CBL with the following content:
{
  "test_double": 307.79998779296875
}
  2. Replicate this document to the CB server, where it will be shown as:
{
  "test_double": 307.8
}
  3. If the CBL document is removed (only from CBL) and replicated back from the CB server to CBL, the document will be retrieved as:
{
  "test_double": 307.8
}

This behavior only happens for particular double values. A value such as 123.45 or 307.7999877929687 does not cause any issue. A value such as 307.799987792968750, or any other decimal extension of it, seems to cause the issue as well. It is worth mentioning that 307.79998779296875 is the single-precision representation of the value 307.8.

If I manually change the value back from 307.8 to 307.79998779296875 in the CB server web interface, the correct value is transferred to CBL, and the calculation of the document hash succeeds. So in general, the value representation seems to be supported, but something is changed during the replicator push process, whereas the replicator pull process seems to be fine. Perhaps there is a float vs. double issue in the C++ framework.

Environment (please complete the following information):

  • OS: Android / iOS
  • cbl: 2.2.2 / 3.0.0-dev.3
  • cbl_flutter: 2.0.9 / 3.0.0-dev.3
  • cbl_flutter_ce: 2.2.2 / 3.0.0-dev.3

nblum37 avatar Jan 18 '24 00:01 nblum37

Thanks for raising this issue and the provided details! I'll look into it.

blaugold avatar Jan 18 '24 13:01 blaugold

Thank you :)

nblum37 avatar Jan 18 '24 13:01 nblum37

It seems like a bug in the float formatting function that is used by Couchbase Lite to encode JSON. I've reported the problem in a forum post: https://www.couchbase.com/forums/t/unwanted-single-double-precision-conversion-during-replication-from-cbl-to-cb-server/37797

blaugold avatar Jan 20 '24 09:01 blaugold

Couchbase Lite / Fleece architect here. I suspect that step 1, "create a document at CBL with the following content...", is storing the number in Float format (32-bit IEEE), not Double (64-bit IEEE). That causes all but about six digits of the number to be lost, since 32-bit floats can only store about six digits of precision. When Fleece converts the document to JSON, the 32-bit float value will convert to decimal as 307.800, and Fleece suppresses the unnecessary zeroes.

The fix is for Dart to encode the number as a 64-bit Double.

snej avatar Jan 22 '24 17:01 snej

Hey @snej, thanks for taking a look! My understanding of floating-point numbers does not go all that deep, so I appreciate your input.

As far as I understand, 307.79998779296875 is representable as a 32-bit float. I tried to verify this with the following program:

#include <stdio.h>

int main() {
  double d = 307.79998779296875;
  float f = d;
  printf("d: %.20f\n", d);
  printf("f: %.20f\n", f);
  printf("d == f: %d\n", (float)d == f);
  return 0;
}

It prints:

d: 307.79998779296875000000
f: 307.79998779296875000000
d == f: 1

Am I missing something here?

The only floating-point type supported by Dart is double, which is 64 bits wide, and the Couchbase Lite SDK for Dart currently only uses the Double APIs of the Fleece API (FLEncoder_WriteDouble and FLSlot_SetDouble).

The following Dart program shows that the number can be saved to and retrieved from the database as expected. But using the Fleece-provided JSON encoder (which is what toJson does) yields the unexpected change of the value. toPlainMap converts the document to a data structure of simple Dart built-in objects.

import 'package:cbl/cbl.dart';
import 'package:cbl_dart/cbl_dart.dart';

Future<void> main(List<String> args) async {
  await CouchbaseLiteDart.init(edition: Edition.enterprise);

  final db = Database.openSync('a');

  final doc = MutableDocument.withId('a', {
    'double': 307.8,
    'single': 307.79998779296875,
  });
  db.saveDocument(doc);

  print('toJson: ${db.document('a')!.toJson()}');
  print('toPlainMap: ${db.document('a')!.toPlainMap()}');

  await db.close();
}

This is the output of the above program:

toJson: {"double":307.8,"single":307.8}
toPlainMap: {double: 307.8, single: 307.79998779296875}

While trying to understand this issue, I looked at the implementations of FLEncoder_WriteDouble and FLSlot_SetDouble. They use Encoder::isFloatRepresentable on the given value and write a float instead of a double if the value is representable as a float. I have the feeling that could be relevant here.

blaugold avatar Jan 22 '24 20:01 blaugold

"As far as I understand, 307.79998779296875 is representable as a 32-bit float"

It is, but it's identical to float(307.8). The reason you get that long expansion in decimal is because it’s a binary number and 0.8 doesn’t have an exact binary representation. When converting a floating point number to decimal Fleece rounds it to the nearest decimal that has the same representation, namely 307.8. This doesn’t lose any information. (The code for this comes from the Swift standard lib, btw.)

I will look at the Encoder code you pointed to tomorrow when I’m at my computer.

snej avatar Jan 24 '24 04:01 snej

I think I see what's going on here:

  1. You call FLEncoder_WriteDouble(307.79998779296875).
  2. The implementation checks whether that number has an exact representation as a float; it does.
  3. The encoder writes it as a 32-bit float to save room.
  4. Later on, the encoded value gets converted to a string as part of JSON encoding.
  5. The FP-to-string converter keeps 6 digits of precision because it's given a 32-bit float.
  6. Output is "307.8".

None of the individual steps are wrong, but in step 5 the conversion should be using 15 digits of precision because the number was originally a double. It doesn't know this; the fact that it used to be a double was lost in step 3.

snej avatar Jan 30 '24 23:01 snej

Thanks for the update and great to see that there is already a fix!

blaugold avatar Jan 31 '24 13:01 blaugold

This was a really interesting bug! I'm glad I circled back to this issue and thought about it some more, else I wouldn't have realized it was a bug and not correct behavior.

Incidentally, the fix marks the first change to the Fleece binary format since 2018.

snej avatar Jan 31 '24 17:01 snej

Thank you both very much for diving into this issue. @blaugold, are there any plans for when this fix might be available in the next CBL Dev release?

nblum37 avatar Feb 06 '24 09:02 nblum37

I think Couchbase released a new CBL for C version last month. Maybe it contains the latest Fleece changes? :)

nblum37 avatar Apr 10 '24 20:04 nblum37

@nblum37 I'm in the process of pulling in the CBL C SDK 3.1.6, but unfortunately the fix for this issue did not make it into that release.

@snej Any idea when the fix in Fleece will make it downstream into the CBL C SDK?

blaugold avatar Apr 18 '24 12:04 blaugold