carbon
carbon copied to clipboard
Allow the aggregator to consistently accept past data
The discovery of #956 was due to replaying a huge swathe of past data into our aggregator instance and finding the data full of holes. I'd assumed it was due to early-expiry of old IntervalBuffer objects, but after writing this fix I found this wasn't the case. However, the behaviour is still likely incorrect when data is played over an aggregation interval. Suppose the aggregator config is:
a (60) = sum a
And the input is:
a 0 1
...
a 1 19
<aggregation timeout triggers here at time (MAX_AGGREGATION_INTERVALS + 1) * 60 or higher>
a 1 20
...
a 1 59
Then the aggregation will aggregate seconds 0-29 and output this (value 20) to the cache, then expire the interval as it's too old, then in the next timeout, aggregate seconds 31-60, output this to the cache and expire the interval again. The final value is the aggregate of seconds 20-59, which is 40 instead of the expected 60. There is no way to control when aggregation occurs, so replayed data will be inconsistent, and in this case could be any value between 1 and 60, though if the replay is faster than real-time, the most likely will be 60.
After these changes, we allow the system to keep the newest interval prior to the current time-frame, provided it's been submitted within the current time-frame. This means if old data is submitted in-order, as would be expected from a replay, the intervals will always be aggregated in a complete state at least once before expiry. This is somewhat robust to curiosities in the timestamps of the real-time submitting process (for example, if that process happens to round time down, or be pushing a prediction of future data or similar), as the number of kept intervals is based on submission time rather than interval time.
Push just reset to preferred email.
@bucko909 : linter says that current_interval is not in use, could you please check? Ignore Deepsource, it should be removed
Yep, I think it was just a copy/paste error. Apologies for that.
:warning: Please install the to ensure uploads and comments are reliably processed by Codecov.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 50.68%. Comparing base (
fdc56f6) to head (7ea6ce9). Report is 25 commits behind head on master.
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@ Coverage Diff @@
## master #958 +/- ##
==========================================
+ Coverage 50.63% 50.68% +0.04%
==========================================
Files 36 36
Lines 3446 3453 +7
Branches 535 527 -8
==========================================
+ Hits 1745 1750 +5
- Misses 1574 1576 +2
Partials 127 127
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Another great addition, thanks, @bucko909 !