ar != None unexpected behaviour
Using ar != None results in an unexpected mask
Consider:
a = ak.Array([None, 2, None, 2, None])
b = np.array([None, 2, None, 2, None])
a, b
>>> (<Array [None, 2, None, 2, None] type='5 * ?int64'>,
array([None, 2, None, 2, None], dtype=object))
a != None
>>> <Array [None, True, None, True, None] type='5 * ?bool'>
b != None
>>> array([False, True, False, True, False])
Expected behaviour:
a != None
>>> <Array [False, True, False, True, False] type='5 * ?bool'>
such that
a[a != None]
>>> <Array [2, 2] type='2 * ?int64'>
There's a whole discussion of this issue, starting at https://github.com/scikit-hep/awkward-1.0/issues/490#issuecomment-712250246
When I wrote it there, I thought I was narrowly saving it from oblivion because I only remembered hearing it on Slack. I could have just pointed to this issue. Anyway, the cross-reference will be useful in solving it.
(I'm going through old issues, deciding what to do with them.)
I think that the array == None and array != None behavior should not be changed. In the examples you presented, @andrzejnovak, the results look non-intuitive, but they're following a rule that applies to all mathematical functions, including == and !=, and I think it would be dangerous to have exceptions to that behavior. If we made
ak.Array([1, 2, None, 3, None, 4]) == None
return
ak.Array([False, False, True, False, True, False])
then someone else's use-case might break because they were assuming that == and != would act like all other mathematical functions. Namely,
ak.Array([1, 2, None, 3, None, 4]) + 10
returns
ak.Array([11, 12, None, 13, None, 14])
That is, the scalar 10 broadcasts and the None values pass through any mathematical operation. When applied to == (same for !=), the expected result of
ak.Array([1, 2, None, 3, None, 4]) == None
would be
ak.Array([False, False, None, False, None, False])
because each integer == None is False and each missing value passes through.
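To make the rule concrete, here is a minimal pure-Python sketch of "the scalar broadcasts and missing values pass through". It is a model of the rule described above, not Awkward Array's actual implementation; `broadcast_scalar` is a hypothetical helper name introduced only for illustration.

```python
import operator


def broadcast_scalar(values, scalar, op):
    """Apply op(value, scalar) to each element; None entries pass through.

    Hypothetical helper modelling the rule in this comment, not library code.
    """
    return [None if v is None else op(v, scalar) for v in values]


data = [1, 2, None, 3, None, 4]

# Addition: the scalar 10 is applied to every non-missing element.
plus_ten = broadcast_scalar(data, 10, operator.add)

# Equality follows exactly the same rule, so the missing entries stay missing
# instead of turning into True.
equals_none = broadcast_scalar(data, None, operator.eq)

print(plus_ten)     # [11, 12, None, 13, None, 14]
print(equals_none)  # [False, False, None, False, None, False]
```

Under this model, == gets no special case: it is just another elementwise operation, which is why the result keeps its None entries.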
@agoose77, do you concur? If you think there is something that we should do that would benefit all use-cases, reopen the issue. Thanks!
@jpivarski I had the same thought. If we allow x == None, then we run into the problem that x == None has a different type to x == 10 (logically, unless we used an UnmaskedArray to keep the option type). Then, what about x == [None]? Maybe we'd allow that, but then what about x == [None] vs x == [None, 1]? It seems like a can of worms!