materialize icon indicating copy to clipboard operation
materialize copied to clipboard

Chunked columnar stack

Open antiguru opened this issue 1 year ago • 3 comments

Chunked and fixed-length columnar containers.

This PR tries to replace TimelyStack with a type that regions-allocates the local portion of the timely stack to allow it being paged to disk. Currently, TimelyStack only allows regions that explicitly get their allocations from lgalloc to be spilled to disk, but nothing else. The types introduced in the PR additionally allow the local portion of the stack to go to disk.

For this, it introduces several new types:

  • A FixedStack is similar to a TimelyStack, but can only absorb a fixed number of copies. We use it within the merge batcher.
  • A ChunkedStack is similar to a TimelyStack but stores the local portion in Array, moving the local vector to spillable memory.

Additionally, we upgrade lgalloc and cease using its Region implementation, so we inline the implementation into Materialize.

TODO: This PR has unknown performance implications, which we need to measure separately.

Motivation

Tips for reviewer

Checklist

  • [ ] This PR has adequate test coverage / QA involvement has been duly considered.
  • [ ] This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • [ ] If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • [ ] If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • [ ] This PR includes the following user-facing behavior changes:

antiguru avatar Dec 15 '23 21:12 antiguru

I split the ColumnatedMergeBatcher changes from this PR, so it's much smaller.

antiguru avatar Feb 13 '24 18:02 antiguru

TODO: This PR has unknown performance implications, which we need to measure separately.

Are you still planning to do that?

teskje avatar Feb 14 '24 14:02 teskje

Feature benchmark is showing some memory regressions against main, is that expected? https://buildkite.com/materialize/nightlies/builds/6471

def- avatar Feb 15 '24 19:02 def-

I can observe similar regressions after a rebase in https://buildkite.com/materialize/nightlies/builds/6544. I'll look into it -- some of this is expected, but I'd like to confirm that before merging.

antiguru avatar Feb 20 '24 18:02 antiguru

I'll go ahead and merge this PR. The new chunked stack is feature-gated, so all tests still use the current implementation. I'm planning on enabling the new implementation in tests and staging next week.

antiguru avatar Feb 23 '24 22:02 antiguru