kartothek
Allow delete logic to be specified using predicate syntax
We currently only support the deprecated query syntax for deletion scopes. It would be more intuitive to specify the deletion scope using the predicate syntax.
Old syntax
update_dataset_from_ddf(
    new_ddf,
    ...,
    delete_scope=[
        {
            "index_col": 1,
        },
        {
            "index_col": 2,
        },
        {
            "index_col": 3,
        },
    ],
)
New syntax
update_dataset_from_ddf(
    new_ddf,
    ...,
    delete=[[
        ("index_col", ">=", 1),
        ("index_col", "<=", 3),
    ]],
)
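To make the relationship between the two forms concrete, here is a minimal sketch of how an existing `delete_scope` could be translated into predicates. It assumes the predicate list is in disjunctive normal form (the outer list is OR-ed, each inner list is AND-ed), and `delete_scope_to_predicates` is a hypothetical helper for illustration, not an existing kartothek function.

```python
from typing import Any, Dict, List, Tuple

# A single predicate: (column, operator, value)
Predicate = Tuple[str, str, Any]


def delete_scope_to_predicates(
    delete_scope: List[Dict[str, Any]],
) -> List[List[Predicate]]:
    """Hypothetical helper (not part of kartothek).

    Each dict in the old syntax becomes one AND-ed group of equality
    predicates; the groups are OR-ed together by the outer list.
    """
    return [
        [(column, "==", value) for column, value in scope.items()]
        for scope in delete_scope
    ]


old_scope = [{"index_col": 1}, {"index_col": 2}, {"index_col": 3}]
predicates = delete_scope_to_predicates(old_scope)
```

This literal translation yields three single-predicate groups. The range form shown above selects the same values only if `index_col` holds integers.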
This new implementation would also let us define clearer semantics for cases where an index doesn't exist (which currently doesn't raise) or where a secondary index is used (which currently may raise under certain circumstances).
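As one possible way to pin down the missing-index case, the sketch below validates the predicate columns up front and raises if any of them is not a known index. Both the function name `check_delete_columns` and the choice to raise (rather than silently ignore) are assumptions for discussion, not current or decided kartothek behaviour.

```python
from typing import Any, Iterable, List, Tuple

# A single predicate: (column, operator, value)
Predicate = Tuple[str, str, Any]


def check_delete_columns(
    predicates: List[List[Predicate]],
    known_index_columns: Iterable[str],
) -> None:
    """Hypothetical validation step (not part of kartothek).

    Raises ValueError if a delete predicate references a column that
    is not among the known index columns.
    """
    known = set(known_index_columns)
    missing = sorted(
        {
            column
            for conjunction in predicates
            for column, _op, _value in conjunction
            if column not in known
        }
    )
    if missing:
        raise ValueError(f"Cannot delete by unknown index column(s): {missing}")


# Passes silently: "index_col" is a known index.
check_delete_columns(
    [[("index_col", ">=", 1), ("index_col", "<=", 3)]],
    known_index_columns=["index_col"],
)
```

Failing early like this would make the delete behaviour explicit instead of depending on which kind of index happens to exist.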