TileDB icon indicating copy to clipboard operation
TileDB copied to clipboard

Consolidate query condition value and size into UntypedDatumView

Open lums658 opened this issue 3 years ago • 3 comments
trafficstars

(From 15762). Replace the use of pairs of void* and uint64_t (pointer and size) to demarcate data with a single UntypedDatumView object. Replaced the void* condition_value_ in Clause with UntypedDatumView condition_value_view_. This will point to condition_value_data_.data() when not a nullptr and also reflect condition_value_data_.size(). The view is used instead of condition_value_data_ because we need to be able to represent nullptr in the clauses (for nullable attribute types). All comparison functions in query_condition.cpp that previously used pointer-size for arguments now take a const reference to an UntypedDatumView. std::span may work equally well - UntypedDatumView was used for simplicity.


TYPE: IMPROVEMENT DESC: Replace the use of pairs of void* and uint64_t (pointer and size) to demarcate data with UntypedDatumView.

lums658 avatar Mar 31 '22 21:03 lums658

A couple things. From a performance standpoint, this looks equivalent from a first glance to the current implementation but just to make sure, we should run a couple customer scenarios. @lums658, I know Seth has a few cases he's been running. Also, I'm starting to see more changes to the query condition and the code growing, I think it might be time to split into 4 classes, a base if necessary, one for apply, apply_sparse and apply_dense. Might also be a good time to move query_condition to its own folder as the query folder is getting quite large. For this PR though, only the first item is required.

KiterLuc avatar Apr 01 '22 13:04 KiterLuc

A couple things. From a performance standpoint, this looks equivalent from a first glance to the current implementation but just to make sure, we should run a couple customer scenarios. @lums658, I know Seth has a few cases he's been running. Also, I'm starting to see more changes to the query condition and the code growing, I think it might be time to split into 4 classes, a base if necessary, one for apply, apply_sparse and apply_dense. Might also be a good time to move query_condition to its own folder as the query folder is getting quite large. For this PR though, only the first item is required.

KiterLuc avatar Apr 01 '22 13:04 KiterLuc

Closing as addressed in subsequent PRs. ASTNode now holds value as UntypedDatumView: https://github.com/TileDB-Inc/TileDB/blob/73e990544543e9633b2400c93b6b36c53e7c2592/tiledb/sm/query/ast/query_ast.h#L381

ihnorton avatar Aug 17 '22 18:08 ihnorton

Totally agree that we should do some benchmarking.

On Apr 1, 2022, at 6:20 AM, KiterLuc @.@.>> wrote:

A couple things. From a performance standpoint, this looks equivalent from a first glance to the current implementation but just to make sure, we should run a couple customer scenarios. @lums658https://github.com/lums658, I know Seth has a few cases he's been running. Also, I'm starting to see more changes to the query condition and the code growing, I think it might be time to split into 4 classes, a base if necessary, one for apply, apply_sparse and apply_dense. Might also be a good time to move query_condition to its own folder as the query folder is getting quite large. For this PR though, only the first item is required.

— Reply to this email directly, view it on GitHubhttps://github.com/TileDB-Inc/TileDB/pull/3040#issuecomment-1085892894, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AGKPFDP4O2RUWSKX22HPYX3VC3Z2LANCNFSM5SGZ7BNQ. You are receiving this because you were mentioned.Message ID: @.***>

lums658 avatar Oct 11 '22 07:10 lums658