Conjunction operators validation
In the datachain filter method we can use conjunction operators as follows:
- AND operation:
  .filter(C("foo") == "bar", C("baz") == "qux")
  .filter((C("foo") == "bar") & (C("baz") == "qux"))
  .filter(datachain.func.and_(C("foo") == "bar", C("baz") == "qux"))
- OR operation:
  .filter((C("foo") == "bar") | (C("baz") == "qux"))
  .filter(datachain.func.or_(C("foo") == "bar", C("baz") == "qux"))
- NOT operation:
  .filter(~(C("foo") == "bar"))
  NOTE: datachain.func.not_ is not implemented yet and needs to be added.
Since we are writing Python, it sometimes feels natural to use Python's logical operators:
.filter(C("foo") == "bar" and C("baz") == "qux")
.filter(C("foo") == "bar" or C("baz") == "qux")
.filter(not C("foo"))
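but it isn't realistically possible to support this syntax: Python does not let a class overload and, or, or not, because these operators work on the truthiness of their operands.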
At the same time, we do not reject this syntax, so the user's query runs but does not work as expected. In complex chains it can be hard to notice this and to figure out why the results are wrong.
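To make the failure mode concrete, here is a minimal sketch in plain Python. The Expr class is a hypothetical stand-in for a column expression, not datachain's real class, and which operand and/or keep in datachain itself depends on how its expression objects define truthiness. The point is only that & keeps both conditions, while and, or, and not never build a combined expression.

```python
class Expr:
    """Hypothetical stand-in for a column expression (not datachain's class)."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        # A comparison builds a new expression instead of returning a bool.
        return Expr(f"{self.text} = {other!r}")

    # Defining __eq__ would otherwise make the class unhashable.
    __hash__ = object.__hash__

    def __and__(self, other):
        return Expr(f"({self.text} AND {other.text})")

    def __or__(self, other):
        return Expr(f"({self.text} OR {other.text})")

    def __invert__(self):
        return Expr(f"NOT ({self.text})")


foo = Expr("foo") == "bar"
baz = Expr("baz") == "qux"

combined = foo & baz      # overloaded: both conditions are kept
python_and = foo and baz  # not overloadable: evaluates to one of the operands
python_or = foo or baz    # not overloadable: evaluates to one of the operands
python_not = not foo      # not overloadable: collapses to a plain bool

print(combined.text)
print(python_and.text)
print(python_or.text)
print(python_not)
```

In this sketch Expr defines no __bool__, so every instance is truthy: foo and baz silently drops the first condition, foo or baz silently drops the second, and not foo is just False.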
Suggestion
Should we check the input parameter types in the filter chain method (and in all other methods, such as mutate), and fail or emit a warning if a value is of type bool?
Are there other, better options?
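A rough sketch of what such a check could look like, assuming a hypothetical helper called validate_filter_args that filter (and similar methods) would call on their positional arguments. Neither the name nor the strict flag exists in datachain; they are only for illustration.

```python
import warnings


def validate_filter_args(*args, strict=True):
    """Hypothetical helper: reject (or warn about) plain bool arguments."""
    for position, arg in enumerate(args):
        if isinstance(arg, bool):
            message = (
                f"filter() argument {position} is a plain bool ({arg!r}); "
                "use &, | and ~ instead of and, or and not"
            )
            if strict:
                raise TypeError(message)
            # Non-strict mode: report the problem but let the call continue.
            warnings.warn(message, stacklevel=2)
    return args
```

Note the limitation: this only catches arguments that have already collapsed to a plain bool, as happens with not. Python's and and or return one of their operands, so if that operand is still an expression object, the type check alone would not detect it.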
As a first step let's put this summary into the filter docs. Thanks @dreadatour for creating the ticket.
As a first step let's put this summary into the filter
docs.
I'll do this later 👍
As a first step let's put this summary into the filter
docs.
https://github.com/iterative/datachain/pull/1151