derived_tstamp incorrect result with long delay between dvce_created_tstamp and dvce_sent_tstamp
Project: Stream Enrich
Version: stream-enrich-1.4.2-common-1.4.2
Expected behavior:
derived_tstamp should be calculated correctly when the difference between dvce_created_tstamp and dvce_sent_tstamp is large (disregarding possible issues due to changes in time zone).
Actual behavior:
While processing older data I noticed a page view whose end_tstamp was before its start_tstamp. I tracked the issue down to a single page ping: derived_tstamp seems to be calculated incorrectly when the gap between dvce_created_tstamp and dvce_sent_tstamp is long. Here are the timestamp fields from the page ping event:
etl_tstamp 2021-04-09 04:34:26.254 UTC
collector_tstamp 2021-04-09 04:34:25.251 UTC
dvce_created_tstamp 2021-02-11 09:35:55.807 UTC
dvce_sent_tstamp 2021-04-09 04:34:24.785 UTC
derived_tstamp 2021-02-08 09:35:56.273 UTC
The derived_tstamp should be 2021-02-11 09:35:56.273 UTC, but I'd assume that, because of the way the Period class gets subtracted from the timestamp, the output is three days off in this case.
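To illustrate the suspected cause, here is a minimal Python sketch of the arithmetic using the values above. It is not the Scala code in enrich. It assumes the gap between dvce_created_tstamp and dvce_sent_tstamp is broken down into calendar fields (1 month, 4 weeks, 18 h 58 min 28.978 s) and that those fields are then subtracted from collector_tstamp one at a time, largest first. Both the breakdown and the minus_months helper are assumptions made for this example.

```python
import calendar
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S.%f"
collector = datetime.strptime("2021-04-09 04:34:25.251", FMT)
created = datetime.strptime("2021-02-11 09:35:55.807", FMT)
sent = datetime.strptime("2021-04-09 04:34:24.785", FMT)

# Exact approach: subtract the elapsed time (a fixed number of milliseconds).
expected = collector - (sent - created)


def minus_months(ts, months):
    """Hypothetical helper: step back whole calendar months, clamping the day."""
    index = ts.year * 12 + (ts.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(ts.day, calendar.monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day)


# Assumed calendar breakdown of created -> sent: 1 month plus the rest below.
PERIOD_MONTHS = 1
PERIOD_REST = timedelta(weeks=4, hours=18, minutes=58, seconds=28, milliseconds=978)

# Field-wise approach: "1 month" spans Feb 11 -> Mar 11 (28 days) going forward,
# but Apr 9 -> Mar 9 (31 days) going backward from the collector timestamp.
observed = minus_months(collector, PERIOD_MONTHS) - PERIOD_REST

print(expected)             # 2021-02-11 09:35:56.273000
print(observed)             # 2021-02-08 09:35:56.273000
print(expected - observed)  # 3 days, 0:00:00
```

Under these assumptions the field-wise subtraction lands exactly on the erroneous value from the event, and the three-day gap matches the difference in length between February (28 days) and March (31 days).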
Steps to reproduce:
I haven't been able to test this myself yet, since I don't currently have a Scala environment set up, but I'd assume that passing those values into the getDerivedTimestamp method would produce the same erroneous timestamp.
@mbehm - just to rule it out, is true_tstamp set for this event? true_tstamp is used for the derived tstamp when set. It doesn't seem likely that this is what's happening, but thought I'd check. :)
Ah yeah, I forgot to include that in the timestamp fields. No, true_tstamp isn't set.