velox Add recursive spill for RowNumber

At present, RowNumber only supports single-level spill partitions. Once a spill takes place, it won't be spilled again. This means that even if a restored spill partition uses an excessive amount of memory, it still cannot be reclaimed. Therefore, it would be beneficial to implement recursive spill support, as outlined in https://facebookincubator.github.io/velox/develop/spilling.html.

1 Advance 'spillPartitionBits_' based on the partition bit of the currently restoring spill partition to prepare for potential subsequent spills (recursive).

2 During the recursive spill input process, read the spilled input data from the previous spill run via 'spillInputReader_', then spill it back into several sub-partitions. Then restore one of the newly spilled partitions and reset 'spillInputReader' accordingly.

3 It's crucial to separate the recursive input spill process from the spill process itself, as it can be time-consuming and should yield if it exceeds the CPU time limit.

In summary, the first spill operation writes all data to the disk. Subsequently, RowNumber reads the data from the disk one partition at a time and clears the data from the restored partition once it has been processed (#8413). In the case of a recursive spill, only the data from the currently restored partition in memory is spilled to the disk (next level), and only one partition from the next level is loaded back into memory.

The data layout may look like as follows.

	Memory	Disk
No-Spill	[P0, P1]	[]
First Spill	[]	[P0, P1]
Restore	[P0]	[P1]
Second Spill	[]	[P0-0, P0-1, P1]
Restore	[P0-0]	[P0-1, P1]
Processed	[]	[P0-1, P1]
Restored	[P0-1]	[P1]
Processed	[]	[P1]
Restored	[P1]	[]

Feb 02 '24 08:02 duanmeng

Deploy Preview for meta-velox canceled.

Name	Link
Latest commit	4e09fffeefa9b327bbe1aacf22298fecc991debd
Latest deploy log	https://app.netlify.com/sites/meta-velox/deploys/6616be4955a08000085fb9d9

Feb 02 '24 08:02 netlify[bot]

@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Mar 04 '24 16:03 facebook-github-bot

@xiaoxmeng Hi Xuan, I've made a small refactor based on your newly simplified RowNumber spill implementation. Could you please help to take another look? Thanks.

Mar 14 '24 02:03 duanmeng

@xiaoxmeng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Mar 29 '24 06:03 facebook-github-bot

@bikramSingh91 Hi Bikram, any further comments? Thanks.

Apr 10 '24 08:04 duanmeng

@mbasmanova Hi Masha, all the prerequisite PRs have been merged, could you please help take another look? Thanks.

Apr 10 '24 08:04 duanmeng

@duanmeng Thank you for adding "The data layout may look like as follows." section to PR description. It is very helpful.

Spilling is rather complicated and we are seeing many bugs in it. I wonder how much testing have been done for this change and what's the confidence of it being bug free. I don't believe we have fuzzer coverage for RowNumber operator. Would that be necessary to ensure there are no (or very few) bugs?

Also, I wonder what's the motivation for this change. Is there a real-world scenario that benefits from this change and justifies added complexity and maintenance costs. Would you describe that scenario? In this scenario, what's the impact on latency and CPU usage of spilling? How much CPU and wall time a query would use without spilling and how does that compare to spilling?

Apr 10 '24 12:04 mbasmanova

velox velox copied to clipboard

Add recursive spill for RowNumber

✅ Deploy Preview for meta-velox canceled.

velox
velox copied to clipboard

Deploy Preview for meta-velox canceled.