Remove partition columns from Z Order optimization in `io.py`

Open R7L208 opened this issue 3 years ago • 2 comments

In `io.py` we Z ORDER on `partitionCols + optimizationCols` when `useDeltaOpt` is True. Since partition pruning already works without Z Ordering on partition columns, I believe it makes sense to remove them from the Z ORDER clause and Z ORDER only on `optimizationCols`, when they are provided.

Is there another advantage to including partition columns within Z ORDER for time series other than data skipping?
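
To make the proposal concrete, here is a minimal sketch of how the statement could be built when only the optimization columns are Z Ordered. `build_optimize_sql` and its parameters are hypothetical names used for illustration; this is not the actual code in `io.py`.

```python
from typing import Optional, Sequence


def build_optimize_sql(table_name: str,
                       optimization_cols: Optional[Sequence[str]] = None) -> str:
    """Build a Delta OPTIMIZE statement that Z-orders only on optimization columns.

    Hypothetical helper for illustration, not part of tempo. Partition columns
    are deliberately left out of the ZORDER BY clause, since partition pruning
    does not depend on them being Z-ordered.
    """
    cols = list(optimization_cols or [])
    if not cols:
        # No optimization columns supplied: fall back to plain file compaction.
        return f"OPTIMIZE {table_name}"
    return f"OPTIMIZE {table_name} ZORDER BY ({', '.join(cols)})"


# Example table and column names are placeholders.
print(build_optimize_sql("trades", ["symbol", "event_ts"]))
```

Assuming an active Spark session with Delta Lake available, the resulting string would be passed to `spark.sql(...)` after the write completes.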

R7L208 avatar Aug 15 '22 15:08 R7L208

We should be able to remove partition columns.

rportilla-databricks avatar Oct 11 '22 08:10 rportilla-databricks

This will be resolved when streaming AS OF joins are merged.

rportilla-databricks avatar Apr 25 '23 00:04 rportilla-databricks