
derived_tstamp incorrect result with long delay between dvce_created_tstamp and dvce_sent_tstamp

Open mbehm opened this issue 3 years ago • 2 comments

Project: Stream Enrich

Version: stream-enrich-1.4.2-common-1.4.2

Expected behavior: To have derived_tstamp be calculated correctly when the difference between dvce_created_tstamp and dvce_sent_tstamp is large (disregarding possible issues due to changes in time zone).

Actual behavior: While processing some older data, I noticed a weird page view with end_tstamp before start_tstamp. After tracking the issue down to a single page ping, it seems that derived_tstamp is calculated incorrectly when the difference between dvce_created_tstamp and dvce_sent_tstamp is large. Here are the timestamp fields from the page ping event:

etl_tstamp          2021-04-09 04:34:26.254 UTC
collector_tstamp    2021-04-09 04:34:25.251 UTC
dvce_created_tstamp 2021-02-11 09:35:55.807 UTC
dvce_sent_tstamp    2021-04-09 04:34:24.785 UTC
derived_tstamp      2021-02-08 09:35:56.273 UTC

The derived_tstamp should be 2021-02-11 09:35:56.273 UTC. I'd assume that, because of the way the Period class gets subtracted from the timestamp, the output is three days off in this case.
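To illustrate the suspected mechanism, here is a minimal Python sketch that reproduces the numbers above. It is not the enrich code: the helper names (`add_months`, `calendar_period`, `derived_via_period`, `derived_via_duration`) are hypothetical, and it assumes the Period is calendar-based, meaning it is split into whole months plus a remainder and the months are subtracted as calendar months rather than as a fixed number of milliseconds.

```python
import calendar
from datetime import datetime


def add_months(dt, months):
    """Shift dt by whole calendar months, clamping the day to the month's length."""
    year, month0 = divmod(dt.year * 12 + (dt.month - 1) + months, 12)
    month = month0 + 1
    day = min(dt.day, calendar.monthrange(year, month)[1])
    return dt.replace(year=year, month=month, day=day)


def calendar_period(start, end):
    """Split end - start into (whole months, remainder). Assumes start <= end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if add_months(start, months) > end:
        months -= 1
    return months, end - add_months(start, months)


def derived_via_period(collector, created, sent):
    """Assumed buggy path: subtract the delay field by field (months first)."""
    months, rest = calendar_period(created, sent)
    return add_months(collector, -months) - rest


def derived_via_duration(collector, created, sent):
    """Expected path: subtract the delay as an exact elapsed duration."""
    return collector - (sent - created)


# Timestamps from the page ping event above (all UTC).
collector = datetime(2021, 4, 9, 4, 34, 25, 251000)
created = datetime(2021, 2, 11, 9, 35, 55, 807000)
sent = datetime(2021, 4, 9, 4, 34, 24, 785000)

print(derived_via_period(collector, created, sent))    # 2021-02-08 09:35:56.273000
print(derived_via_duration(collector, created, sent))  # 2021-02-11 09:35:56.273000
```

Under these assumptions the delay decomposes into 1 month plus 28 days 18:58:28.978. Counted forward from Feb 11, that month spans 28 days, but subtracting one month from Apr 9 steps back across March, which has 31 days. That accounts for the three-day difference.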

Steps to reproduce: I haven't been able to test this myself yet, since I don't currently have a Scala environment set up, but I'd assume that passing those values into the getDerivedTimestamp method results in the same erroneous timestamp.

mbehm avatar Sep 16 '21 14:09 mbehm

@mbehm - just to rule it out, is true_tstamp set for this event?

When set, true_tstamp is used as the derived_tstamp. It doesn't seem likely that this is what's happening, but I thought I'd check. :)

colmsnowplow avatar Sep 16 '21 14:09 colmsnowplow

Ah yeah, I forgot to include that in the timestamp fields. No, true_tstamp isn't set.

mbehm avatar Sep 16 '21 14:09 mbehm