timescaledb icon indicating copy to clipboard operation
timescaledb copied to clipboard

[WIP] CAGG on top of another CAGG

Open fabriziomello opened this issue 1 year ago • 2 comments

Closes #1400

fabriziomello avatar Sep 01 '22 23:09 fabriziomello

Codecov Report

Merging #4668 (9fbaee3) into main (f153566) will increase coverage by 27.19%. The diff coverage is 90.10%.

Impacted file tree graph

@@             Coverage Diff             @@
##             main    #4668       +/-   ##
===========================================
+ Coverage   63.05%   90.25%   +27.19%     
===========================================
  Files         226      226               
  Lines       44692    50094     +5402     
===========================================
+ Hits        28182    45213    +17031     
+ Misses      16510     4881    -11629     
Impacted Files Coverage Δ
src/ts_catalog/catalog.c 85.91% <ø> (+68.45%) :arrow_up:
src/ts_catalog/catalog.h 100.00% <ø> (ø)
src/ts_catalog/continuous_agg.c 94.45% <71.42%> (+75.12%) :arrow_up:
src/telemetry/telemetry.c 85.94% <89.78%> (+80.55%) :arrow_up:
tsl/src/continuous_aggs/create.c 88.06% <93.54%> (+76.76%) :arrow_up:
src/bgw/job_stat.c 92.18% <100.00%> (+83.55%) :arrow_up:
src/loader/loader.c 94.89% <100.00%> (+0.18%) :arrow_up:
tsl/test/src/test_chunk_stats.c 48.14% <0.00%> (-51.86%) :arrow_down:
tsl/src/nodes/compress_dml/compress_dml.c 84.78% <0.00%> (-15.22%) :arrow_down:
src/uuid.c 78.57% <0.00%> (-6.05%) :arrow_down:
... and 208 more

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 1847f64...9fbaee3. Read the comment docs.

codecov[bot] avatar Sep 01 '22 23:09 codecov[bot]

Yeyyy

Inviz avatar Sep 07 '22 17:09 Inviz

It's amazing that the actual code has a pretty small surface area for this feature.

Inviz avatar Oct 03 '22 02:10 Inviz

As per discussed in pvt with @gayyappan this PR lack of some important validations:

  1. bucket width of the nested cagg should be multiple of the parent cagg bucket width
  2. prevent users create a real-time cagg over a materialized only cagg
  3. when change a cagg from materialized only to real-time propagate this change to all nested caggs
  4. prevent users create a nested cagg with bucket width less than the parent cagg bucket width

/cc @horzsolt @iroussos

fabriziomello avatar Oct 29 '22 15:10 fabriziomello

While I appreciate the concerns from @fabriziomello and @gayyappan, and I agree that some of these are weird edge cases, I think that these checks are entirely unnecessary and potentially counterproductive and certainly will slow down our development. I'd suggest that we can document some of these as potential issues, but I don't think we should spend time trying to prevent odd cases that any user can run into no matter what.

My basic philosophy on this is that these views should be treated essentially as separate entities and the behavior should be predictable based on that, we should not emphasize / interlink or try to plan for all the weird ways that someone can cause them to interact weirdly. Users can already do this with a normal view, and we don't stop them there, that's on them, and it should be here too.

As per discussed in pvt with @gayyappan this PR lack of some important validations:

  1. bucket width of the nested cagg should be multiple of the parent cagg bucket width

A) I'm not sure that this is possible or reasonable in a number of cases, for instance, can we actually enforce this with anything involving time zones? What's the test? Is a week a multiple of a day? What about a month being a multiple of 7 minutes? B) fundamentally, a time_bucket produces a new timestamp. If you want to re-bucket that imperfectly, I don't think we should stop you. There may be many perfectly valid reasons why a user would want to do that, including that for many, it's "close enough" ie the 7 minutes into a month thing...I'd probably be okay with bucketing it even if technically there was a little slosh at the edges, it'd even out over time. We don't stop users from doing this in queries now, why would we stop them here? C) This doesn't impact invalidations at all. The underlying table gets invalidations, then the first materialization gets invalidations, and they will all propagate properly up the chain expanding to the size of the bucket (assuming we've done our work correctly), so there's no special case where you're going to be "more wrong" here than if you just do this in a query. D) I do think that we should have a note in the docs saying something like: We recommend that smaller buckets divide evenly into larger buckets give a demonstration of why this could be a problem and then say, but we do not enforce this, it is up to you as the user to understand the implications of doing or not doing this.

  1. prevent users create a real-time cagg over a materialized only cagg

Why would we prevent this? There are a number of valid use cases where I would totally want the behavior that this provides. For instance, I have a big hypertable that I know isn't that performant if I do real time agg on it, so I create a frequently refreshed, say, 1 minute cagg on it, and then I do larger bucketed caggs over that. I want the real time view for those larger bucketed caggs because I want to see the new results ASAP once the 1 minute view gets refreshed, but I know that if I hit the hypertable my performance suffers too much, so I create a real time view on the larger bucketed ones and keep the 1 minute one materialized only.

  1. when change a cagg from materialized only to real-time propagate this change to all nested caggs

This would be a horrible, horrible user experience for anyone in the situation above, and I don't know why we would do this.

It seems far, far easier for a user to reason about this on a per view basis rather than have side effects. The views are functionally independent, all we're doing is creating another cagg on top of the view. That is far, far easier to understand than if I start creating implicit linkages.

  1. prevent users create a nested cagg with bucket width less than the parent cagg bucket width

Why? because it's silly? I agree it's silly, but like, we can't keep everyone from doing all the silly things, that's too much work. If you do this, you'll find that it has the same number of rows as the original cagg. Okay. What's the big deal? Are they going to lose data or will something horrible happen?

My basic philosophy on most of this is that we should avoid doing too much to prevent users from shooting themselves in the foot, at least when the injury will not be that grave. None of the scenarios here will cause data loss or irrecoverable failures, so why spend the time both constraining user behavior unnecessarily (which means they can't make tradeoffs that we might not have considered) and also slowing down our development considerably. Trying to prevent all this stuff means so much more testing and development on our end and I really don't think we're going to cover all the cases. And the benefit to the user is so minimal.

davidkohn88 avatar Oct 29 '22 16:10 davidkohn88