
[BUG] DynamoDB source export converts Numbers ending in 0 to scientific notation

Open · graytaylor0 opened this issue

Describe the bug Given an item with a Number type value ending in 0, such as 1702062202420, the DynamoDB source will convert it to scientific notation when processing export items.

{"pk": "my_partition_key", "sk": "my_sort_key", "my_number_ending_in_0": 1702062202420 }

To Reproduce Steps to reproduce the behavior:

  1. Create a pipeline with a dynamodb source (with export enabled) and an opensearch sink
  2. Once the export is complete, the document is sent to the OpenSearch sink as the following, with the my_number_ending_in_0 key converted to scientific notation:
{"pk": "my_partition_key", "sk":"my_sort_key", "my_number_ending_in_0": 1.70206220242E+12 }

Expected behavior Numbers ending in 0 should not be manipulated, and the above example should result in

{"pk": "my_partition_key", "sk": "my_sort_key", "my_number_ending_in_0": 1702062202420 }

Additional context The conversion only happens for export values, when converting from the ion line here (https://github.com/opensearch-project/data-prepper/blob/91ff22d6da2b14d8a27ade89ee516341181c8bd6/data-prepper-plugins/dynamodb-source/src/main/java/org/opensearch/dataprepper/plugins/source/dynamodb/converter/ExportRecordConverter.java#L82). However, the Data Prepper JacksonEvent also converts to scientific notation when converting to a JSON string (https://github.com/opensearch-project/data-prepper/blob/91ff22d6da2b14d8a27ade89ee516341181c8bd6/data-prepper-api/src/main/java/org/opensearch/dataprepper/model/event/JacksonEvent.java#L621). I initially created a custom deserializer that iterated over all decimals of this format and converted them to not use scientific notation, but this may not be the best approach.
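The underlying behavior can be reproduced with plain `java.math.BigDecimal`, independent of Ion or Jackson: a value whose trailing zero is represented as a negative scale renders in scientific notation via `toString()`, while `toPlainString()` preserves the plain form. A minimal sketch:

```java
import java.math.BigDecimal;

public class PlainNotationDemo {
    public static void main(String[] args) {
        // A decimal like 1702062202420 can be stored with unscaled value
        // 170206220242 and scale -1; toString() then uses scientific notation.
        BigDecimal sci = new BigDecimal("1.70206220242E+12");

        System.out.println(sci.toString());      // prints "1.70206220242E+12"
        System.out.println(sci.toPlainString()); // prints "1702062202420"
    }
}
```

This is why any fix needs to render the value with `toPlainString()` (or an equivalent) rather than relying on the default string conversion.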

graytaylor0 avatar Dec 08 '23 21:12 graytaylor0

I can work on this.

san81 avatar May 21 '24 20:05 san81

As part of this change, we provided an option for the user to convert the decimal number to a BigDecimal with a user-chosen scale (or precision), which should keep numbers ending in 0 from being converted into scientific notation. The PR notes in the link below detail how to use this and what flexibility the user has to tune it to their needs. Hence, marking this ticket as fixed.

https://github.com/opensearch-project/data-prepper/pull/4557
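For illustration, a pipeline using this option might look like the sketch below. This is an assumption based on the thread (the `convert_entry_type` processor with a `big_decimal` type and `scale` option, per the PR above); the key name is hypothetical, and the exact option names should be checked against the PR notes:

```yaml
processor:
  - convert_entry_type:
      key: "my_number_ending_in_0"
      type: "big_decimal"
      scale: 0   # render with scale 0, avoiding scientific notation
```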

san81 avatar Jul 29 '24 19:07 san81

The option to provide a scale is useful, but ideally the DynamoDB source would handle this situation automatically so that customers do not have to add manual conversions.

dlvenable avatar Sep 06 '24 15:09 dlvenable

[Catch All Triage - 1, 2, 3, 4]

dblock avatar Sep 09 '24 16:09 dblock

I ran into a similar issue. My DynamoDB table record had a field called created_time with the value 1733979067700. My requirement was to store the created_time field in an OpenSearch index with the date type and epoch_millis format. When this value was exported and ingested into OpenSearch, I got the scientific-notation format error.

To resolve this, I added a processor to the pipeline config that converts the scientific-notation number directly to the long type. I initially tried big_decimal as suggested by @san81, but it was not needed, since the scientific-notation number was within the range of the long data type.

processor:
  - convert_type:
      key: "/_source/created_time"
      type: "long"
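The reason the long conversion works here can be sketched with plain `java.math.BigDecimal`: a scientific-notation value with no fractional part converts exactly to a long as long as it fits in the long range (the variable names below are illustrative only):

```java
import java.math.BigDecimal;

public class LongConversionDemo {
    public static void main(String[] args) {
        // The epoch-millis value as it arrives in scientific notation.
        BigDecimal epochMillis = new BigDecimal("1.7339790677E+12");

        // longValueExact() succeeds because the value is integral and
        // within Long.MIN_VALUE..Long.MAX_VALUE; otherwise it throws
        // ArithmeticException.
        long asLong = epochMillis.longValueExact();
        System.out.println(asLong); // prints "1733979067700"
    }
}
```

If the value could exceed the long range, the big_decimal approach with an explicit scale would be the safer choice.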

Rishikesh1159 avatar Feb 08 '25 19:02 Rishikesh1159