Trill
Ingress data policy: forward-looking outliers
There are already data policies at ingress for data that arrives "late". We can drop, adjust, or throw when data arrives late, and we can hold data in reserve for a certain period of time to allow some reordering.
However, if a data point arrives "too early" we do not have a way to deal with it currently. For instance, if the current data time is X, and the next data point arrives with a timestamp of X + 2 days, this may be a result of:
- A spurious data value whose timestamp is garbled
- Some data that has arrived well ahead of other valid data
Today, however, we accept this value as valid and current, and the sync time is advanced all the way to X + 2 days. Any further data is now compared against the new sync time, so valid data may start to get dropped or adjusted improperly.
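To make the failure mode concrete, here is a minimal toy model in Python. It is not Trill code: `ingest_drop_late` is a hypothetical helper that assumes a "drop late events" policy with no reorder window, and timestamps are plain integers in seconds.

```python
def ingest_drop_late(timestamps):
    """Toy ingress: accept in-order events, drop anything behind the sync time."""
    sync_time = None
    accepted, dropped = [], []
    for ts in timestamps:
        if sync_time is not None and ts < sync_time:
            dropped.append(ts)   # considered "late" relative to the sync time
        else:
            sync_time = ts       # sync time jumps to whatever arrived
            accepted.append(ts)
    return accepted, dropped


DAY = 24 * 60 * 60
# One garbled timestamp two days ahead, followed by ordinary data.
events = [0, 60, 2 * DAY, 120, 180]
accepted, dropped = ingest_drop_late(events)
print(accepted)  # [0, 60, 172800]
print(dropped)   # [120, 180]
```

The single outlier is accepted, and the two valid events that follow it are discarded as late.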
We would like to add an ingress policy that allows for a threshold to be specified for maximum sync time advancement. If a data value arrives so far into the future that the maximum advancement is exceeded, the value is either:
- Dropped
- Adjusted to current sync time + maximum delta
- Rejected by throwing an exception
- Held in a queue until time advances
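
The four options above could be sketched roughly as follows. This is an illustrative Python model, not a proposed Trill API: `EarlyPolicy`, `EarlyArrivalError`, `Ingress`, `run`, and the `max_advance` parameter are all hypothetical names. It also assumes one particular reading of "held in a queue until time advances", namely that a held event is released once an in-range event with an equal or later timestamp arrives. Handling of late data is deliberately left out.

```python
import heapq
from enum import Enum


class EarlyPolicy(Enum):
    DROP = "drop"
    ADJUST = "adjust"
    THROW = "throw"
    HOLD = "hold"


class EarlyArrivalError(Exception):
    """Raised under THROW when an event exceeds the maximum advancement."""


class Ingress:
    def __init__(self, max_advance, policy):
        self.max_advance = max_advance
        self.policy = policy
        self.sync_time = None
        self.held = []    # min-heap of held timestamps (used by HOLD only)
        self.output = []  # timestamps passed downstream, in emission order

    def _emit(self, ts):
        # Sync time never moves backwards; late-arrival policies are out of scope.
        self.sync_time = ts if self.sync_time is None else max(self.sync_time, ts)
        self.output.append(ts)

    def push(self, ts):
        too_early = (self.sync_time is not None
                     and ts - self.sync_time > self.max_advance)
        if not too_early:
            # Release held events that time has now caught up with, oldest first.
            while self.held and self.held[0] <= ts:
                self._emit(heapq.heappop(self.held))
            self._emit(ts)
        elif self.policy is EarlyPolicy.DROP:
            pass  # discard the outlier; sync time is unchanged
        elif self.policy is EarlyPolicy.ADJUST:
            self._emit(self.sync_time + self.max_advance)
        elif self.policy is EarlyPolicy.THROW:
            raise EarlyArrivalError(
                f"timestamp {ts} is more than {self.max_advance} "
                f"ahead of sync time {self.sync_time}")
        else:  # EarlyPolicy.HOLD
            heapq.heappush(self.held, ts)


def run(events, max_advance, policy):
    ingress = Ingress(max_advance, policy)
    for ts in events:
        ingress.push(ts)
    return ingress.output


events = [0, 10, 150, 60, 120, 160]
print(run(events, 100, EarlyPolicy.DROP))  # [0, 10, 60, 120, 160]
print(run(events, 100, EarlyPolicy.HOLD))  # [0, 10, 60, 120, 150, 160]
```

With a maximum advancement of 100, the event at 150 is too far ahead of sync time 10. Under DROP it disappears, while under HOLD it is emitted in order once the stream reaches 160. One caveat of this HOLD interpretation is that a held event stays queued indefinitely if no later in-range event ever arrives, so a real implementation would likely need a flush or timeout rule.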
If this issue is still open, I can take this up