typeshed icon indicating copy to clipboard operation
typeshed copied to clipboard

`__eq__`'s type hint for primitives accept `object` while in many cases it returns `NotImplemented`

Open mmohaveri opened this issue 2 years ago • 8 comments

As can be seen in builtins.pyi, all primitives are defining their __eq__ and __ne__ functions in the following manner:

    def __eq__(self, __x: object) -> bool: ...
    def __ne__(self, __x: object) -> bool: ...

This will cause incorrect type errors in cases where the second operand has overloaded __eq__ or __ne__ in a way that it does not return bool. Consider the following example:

T = TypeVar("T", float, bool)

class Matrix(Generic[T]):
  def __init__(self, data: List[List[T]]):
    self.data = data

  def __eq__(self, other: "Matrix[T] | float | bool") -> Matrix[bool]:
    if not isinstance(other, Matrix):
      return Matrix([[elem == other for elem in row] for row in self.data])
    return Matrix((elem == other_elem for elem, other_elem in zip(row1, row2)] for row1, row2 in zip(self.data, other.data)]

  def __ne__(self, other: "Matrix[T] | float | bool") -> Matrix[bool]: ...

Here we're trying to create a class that mimics numpy's NDArray in a very limited way.

Now if I try to check the equality of a Matrix with a float, the return type will be inferred as Matrix, since LHS of the operation has a matching __eq__ method which returns matrix (and the returned type in runtime is Matrix as well)

But if I try to check the equality of a float with a Matrix, the returned type will be inferred as bool because LHS of the operation has a compatible __eq__ (def __eq__(self, __x: object) -> bool, object and Matrix ar compatible) while in the runtime because float's eq will return a NotImplemented, interpreter will try to run the __ne__ function of the RHS which returns a Matrix.

This will create situations where the inferred type of a simple equality check operation does not match the runtime's type, all because primitives stubs has been defined in a way that is not representative of their runtime behavior.

mmohaveri avatar Jun 30 '22 11:06 mmohaveri

This is indeed problematic and the natural solution would be an overloads returning NotImplemented. In the case of float (where FloatLike would have to replaced with something appropriate):

@overload
def __eq__(self, other: FloatLike) -> bool: ...
@overload
def __eq__(self, other: object) -> NotImplemented: ...

That said, we haven't returned NotImplemented in typeshed yet and I'm not sure that all type checkers even support this. It would be worth a try in an exploratory PR. Or maybe we should discuss it on typing-sig first.

srittau avatar Jun 30 '22 11:06 srittau

Or maybe we should discuss it on typing-sig first.

I would be opposed to making changes like this in typeshed without having a discussion on typing-sig first.

This would be a large change to make. My instinct is to say that we shouldn't be making allowances for the ways in which numpy/pandas (ab)use __eq__ and __ne__ methods, since the way these methods are written in numpy/pandas usually violates LSP. It's probably (well, almost certainly) too late to change that at this point, but I still don't feel like we should really be bending over backwards to accommodate it.

AlexWaygood avatar Jun 30 '22 11:06 AlexWaygood

This is not just a numpy problem:

class Floaty:
    def __init__(self, f: float) -> None:
        self._f = f

    def __eq__(self, other: object):
        if not isinstance(other, float):
            return NotImplemented
        return other == self._f
    

print(Floaty(3.0) == 3.0)  # True
print(3.0 == Floaty(3.0))  # True

This is due to how __eq__ etc. work: Docs. If the left side comparison returns NotImplemented, the right side comparison is used instead.

srittau avatar Jun 30 '22 12:06 srittau

This is due to how __eq__ etc. work: Docs. If the left side comparison returns NotImplemented, the right side comparison is used instead.

I know, but in general, this doesn't really matter that much, since in general, the only relevant thing to consider is whether the result of __eq__ is truthy or falsey, and in general, bool() can be called on literally any object in Python. As the docs that you linked to state:

By convention, False and True are returned for a successful comparison.

I think that it is useful to continue to enforce this convention in the stubs; in general, doing otherwise is surprising behaviour that I would consider to be a code smell. It is only with numpy/pandas where the precise return type of __eq__ or __ne__ matters, since it is only numpy/pandas that insist on violating this convention and (often) return objects from these methods that raise exceptions if you try to call bool() on them.

AlexWaygood avatar Jul 03 '22 17:07 AlexWaygood

@srittau

we haven't returned NotImplemented in typeshed yet and I'm not sure that all type checkers even support this. It would be worth a try in an exploratory PR

We dont' even need the def __eq__(self, other: object) -> NotImplemented: ... one, I know that mypy and pyright allow returning NotImplemented when the return type is something else. That's because that's an expected behavior for a function that has been passed a wrong input type.

@AlexWaygood

My instinct is to say that we shouldn't be making allowances for the ways in which numpy/pandas (ab)use eq and ne methods, since the way these methods are written in numpy/pandas usually violates LSP

I understand your point of view regarding LSP, but the problem you're describing does make writing scientific codes with python much much easier and human understandable, and as a programming language with a large scientist users, I think in these cases it's really ok, and even IMO encouraged to break the LSP.

since in general, the only relevant thing to consider is whether the result of eq is truthy or falsey

Again, that's because you're ignoring the scientific use of the language. Imagine you're creating a library that simplifies SIMD parallelization by defining an IF function that takes a list of bool and two data list and interlace the two list based on the boolean list. In this case it makes sense for the __eq__ operator to return something other than a boolean value.

Finally, My main problem is that the aforementioned type definitions are WRONG. A type definition SHOULD describe the correct ways a function can be used, and a scenario in which the function returns NotImplemented is by definition the wrong usage, and should not be included in the type hint.

mmohaveri avatar Jul 04 '22 10:07 mmohaveri

Again, that's because you're ignoring the scientific use of the language.

I understand that scientific users of the language represent a major portion of the Python community, and I don't want to disregard that :)

I remain of the opinion that this would be a very significant change to make, which would affect the Python-typing community at large, and that we should therefore not make any changes in this area without first discussing it on the typing-sig mailing list, so that we can get a broader range of feedback.

AlexWaygood avatar Jul 05 '22 18:07 AlexWaygood

I remain of the opinion that this would be a very significant change to make, which would affect the Python-typing community at large, and that we should therefore not make any changes in this area without first discussing it on the typing-sig mailing list, so that we can get a broader range of feedback.

Although I don't think it affects python-typing community at large (as it does not break any code that's working correctly and only allows some other correct uses of the operation to be included in the types), but I understand your point of view. I just don't know how to discuss this issue on the typing-sig mailing list. If that's something that I can do I'll be happy to write my opinion about it there and ask for feedback on it, just let me know where and how I can join and do that.

mmohaveri avatar Jul 06 '22 11:07 mmohaveri

https://mail.python.org/archives/list/[email protected]/

AlexWaygood avatar Jul 06 '22 11:07 AlexWaygood

I have created https://discuss.python.org/t/make-type-hints-for-eq-of-primitives-less-strict/34240 Feel free to continue discussing over there :)

headtr1ck avatar Sep 20 '23 19:09 headtr1ck

That said, we haven't returned NotImplemented in typeshed yet and I'm not sure that all type checkers even support this.

FWIW, I've been using -> NoReturn | <expected_subclass_return_type>: ...

Avasam avatar Oct 01 '23 21:10 Avasam

If you think about Python's type hints in terms of set theory, NoReturn is equivalent to the empty set -- the bottom type, a type with no members -- so a union with NoReturn is essentially always redundant. NoReturn | int will be internally simplified by the type checker to just int, for example.

AlexWaygood avatar Oct 01 '23 21:10 AlexWaygood

If you think about Python's type hints in terms of set theory, NoReturn is equivalent to the empty set -- the bottom type, a type with no members -- so a union with NoReturn is essentially always redundant. NoReturn | int will be internally simplified by the type checker to just int, for example.

I can see that after testing it quickly. Pylance language server follows that principle and it makes sense. Not sure why I was under the impression it worked differently. nvm my previous comment then.

I only found one place where I added a NoReturn union. I'll go fix that up. And I'll follow that typing-sig discussion as it seems -> NotImplemented | type: ... is what I expected

Avasam avatar Oct 01 '23 23:10 Avasam

I think all relevant points have been made. __eq__'s standard semantics are to return bool. If a library overloads the standard semantics, using # type: ignore seems reasonable. It's not necessary to add NotImplemented to the return types as this is special-cased by type checkers.

(Please reopen, if another maintainer disagrees.)

srittau avatar Nov 21 '23 11:11 srittau