AnyOf - Union for return types

Open srittau opened this issue 7 years ago • 28 comments

Sometimes a function or method can return one of several types, depending on the passed in arguments or external factors. Best example is probably open(), but there are other examples in the standard library, like shutil.copy()[1]. In many of those cases, the caller knows what return type to expect.

Currently there are several options, but none of them is really satisfactory:

  • Use @overload. This is the best solution if it can be used. But that is often not the case, like in the examples above.
  • Use a mypy plugin. This solution does not scale to outside the standard library (and arguably not even inside it) and is mypy-specific.
  • Use Union as the return type. This is usually not recommended, since it means that the caller needs to use isinstance() to use the return type.
  • Use Any as the return type. This is currently best practice in those cases, but of course provides no type safety at all.
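To make the Union drawback from the list above concrete, here is a minimal sketch. The function `read_config` and its `raw` flag are hypothetical and only stand in for the kind of API being discussed.

```python
from typing import Union


def read_config(raw: bool) -> Union[str, bytes]:
    """Hypothetical function: returns bytes when raw is True, else str."""
    data = b"key=value"
    return data if raw else data.decode("ascii")


# The caller knows that raw=False yields str, but a type checker only sees
# Union[str, bytes], so a str-only operation needs an isinstance() check.
value = read_config(raw=False)
if isinstance(value, str):
    key, _, setting = value.partition("=")
else:
    raise TypeError("expected str")
```

With Any as the return type the isinstance() check could be dropped, but so would every other check on `value`.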

Therefore, I propose to add another type, for example AnyOf[...] that acts like Union, but can be used everywhere any of its type arguments could be used.

from datetime import date
x: AnyOf[str, date] = ...
s: str = x  # ok
dt: date = x  # ok
i: int = x  # type error
u: Union[str, bytes] = x  # ok
x = u  # type error (although the type checker could do something smart here and infer that u can only be str here)

I also think that the documentation should make it clear that using AnyOf is a code smell.

[1] Currently the type behaviour in shutil is broken in my opinion, but that does not change the fact that currently it is as it is.

srittau avatar Jun 21 '18 11:06 srittau

Another use case - at least until we get a more flexible Callable syntax - is for optional arguments in callback functions, like WSGI's start_response():

StartResponse = AnyOf[
    Callable[[str, List[Tuple[str, str]]], None],
    Callable[[str, List[Tuple[str, str]], ExcInfo], None]
]

def create_start_response() -> StartResponse:
    ...

create_start_response()("200 OK", [])

Using Union this causes a type error. (Too few arguments.)

srittau avatar Jun 21 '18 11:06 srittau

This would be as unsafe as Any though right? E.g. def foo() -> Union[int, str] -- we have no idea whether foo() + 1 is safe or not. Sure, it can tell you that foo().append(1) is definitely wrong, but that's pretty minor.

Similarly we won't know if create_start_response()("200 OK", [], sys.exc_info()) will be accepted or not. If you meant to describe a situation where it returns a callback that can be called with or without the extra argument, there's already a solution: https://mypy.readthedocs.io/en/latest/additional_features.html?highlight=mypy_extensions#extended-callable-types.
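For the "with or without the extra argument" case, one alternative to the mypy_extensions syntax linked above is a callback protocol (PEP 544). This is a different technique from the linked one, shown only as a sketch: the `ExcInfo` alias and the `log` parameter are assumptions made for illustration.

```python
from typing import List, Optional, Protocol, Tuple

# Hypothetical alias for illustration; WSGI passes a sys.exc_info()-style tuple.
ExcInfo = Optional[tuple]


class StartResponse(Protocol):
    """Callback that can be called with or without exc_info."""

    def __call__(
        self,
        status: str,
        headers: List[Tuple[str, str]],
        exc_info: ExcInfo = None,
    ) -> None: ...


def create_start_response(log: List[str]) -> StartResponse:
    # The inner function matches the protocol because exc_info has a default.
    def start_response(
        status: str,
        headers: List[Tuple[str, str]],
        exc_info: ExcInfo = None,
    ) -> None:
        log.append(status if exc_info is None else status + " (with exc_info)")

    return start_response


log: List[str] = []
start = create_start_response(log)
start("200 OK", [])  # two arguments
start("500 Internal Server Error", [], (None, None, None))  # three arguments
```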

gvanrossum avatar Jun 21 '18 21:06 gvanrossum

Good to know that there is a proper solution for the callback case!

Personally, I think the improvement in type safety over just returning Any would be worth it. It surely can't catch all problematic cases, but some is better than none. And the issue seems to crop up from time to time, for example just after I opened this in python/typeshed#2271. It is also noteworthy, I think, that the problem of returning Unions is explicitly mentioned in the docs.

srittau avatar Jun 21 '18 22:06 srittau

In my experience I never had a situation where I needed unsafe unions. Anyway, I could imagine some people might want them. However, the problem is that the benefits of unsafe unions are out of proportion to the amount of work needed to implement them. Adding a new kind of type to mypy, for example, could take months of intense work. This one will be as hard as introducing intersection types, and while the latter are more powerful (they cover most of the use cases of unsafe unions) we still hesitate to start working on them.

ilevkivskyi avatar Jun 22 '18 09:06 ilevkivskyi

I'm with Ivan, and this is something we've considered earlier -- see the discussion at https://github.com/python/mypy/issues/1693, for example. The relatively minor benefits don't really seem worth the extra implementation work and complexity.

This would only be a potentially good idea for legacy APIs that can't be typed precisely right now, and the most popular of those can be special cased by tools (e.g. through mypy plugins). Mypy already special cases open and a few other stdlib functions. Alternatively, we might be able to use some other type system improvements, such as literal types, once they become available. For new code and new APIs the recommended way is to avoid signatures that would require the use of AnyOf anyway.

Ad-hoc extensions have the benefit of being easy to implement. They are also modular, don't complicate the rest of the type system, and they potentially allow inferring precise return types.

There is also often a simple workaround -- wrap the library function that returns Any in a wrapper function that has a precise return type, by restricting the arguments so that the return type can be known statically. Example:

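from typing import BinaryIO, TextIO

def open_text(path: str, mode: str) -> TextIO:
    # Reject binary modes so the result is always a text file object.
    assert 'b' not in mode
    return open(path, mode)

def open_binary(path: str, mode: str) -> BinaryIO:
    # The caller passes a mode without 'b'; it is appended here.
    assert 'b' not in mode
    return open(path, mode + 'b')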

JukkaL avatar Jun 22 '18 10:06 JukkaL

Some use cases (such as open) can now be supported pretty well by using overloads and literal types (PEP 586).
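As a rough illustration of the overload-plus-literal approach, here is a sketch with a hypothetical helper `read_file`. It is not how typeshed annotates open(); it only shows the pattern for two modes.

```python
import os
import tempfile
from typing import Literal, Union, overload


@overload
def read_file(path: str, mode: Literal["r"]) -> str: ...
@overload
def read_file(path: str, mode: Literal["rb"]) -> bytes: ...
def read_file(path: str, mode: str) -> Union[str, bytes]:
    """Hypothetical helper: the literal mode selects the return type."""
    with open(path, mode) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    with open(path, "w") as f:
        f.write("hello")
    text = read_file(path, "r")    # a checker should infer str
    blob = read_file(path, "rb")   # a checker should infer bytes
```

A call whose mode is only known as a plain str matches neither overload, which is the kind of case the rest of this thread is about.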

JukkaL avatar Jul 17 '19 11:07 JukkaL

This continues to crop up with typeshed pretty regularly. For a lot of return types, typeshed has to either make a pretty opinionated decision or completely forgo type safety with Any.

One use case I find pretty compelling is for autocomplete in IDEs, e.g. https://github.com/python/typeshed/issues/4511. I seem to recall PyCharm used unsafe unions and I'd imagine this is a big reason why.

From a typeshed perspective, it would be nice to support these use cases. From a mypy perspective, I agree it's maybe not worth the effort, so maybe type checkers could interpret a potential AnyOf as a plain Any.

hauntsaninja avatar Sep 20 '20 06:09 hauntsaninja

Maybe you could bring this up on typing-sig? A proto-pep might get support there.

Or maybe you can spell this using -> Annotated[Any, T1, T2] where T1, T2 are the possible return types? Then type checkers will treat it as Any but other tools could in theory interpret this as AnyOf[T1, T2]. Or is that too hacky?
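A minimal sketch of how another tool could read such an annotation back at runtime, assuming Python 3.9+ and a hypothetical function `load`. Whether any real tool interprets the metadata this way is an assumption.

```python
from typing import Annotated, Any, get_args, get_type_hints


def load() -> Annotated[Any, str, bytes]:
    """Hypothetical function; type checkers see the return type as Any."""
    return "text"


# include_extras=True keeps the Annotated metadata instead of stripping it.
hints = get_type_hints(load, include_extras=True)

# get_args() on an Annotated type yields the base type followed by the metadata.
base, *candidates = get_args(hints["return"])
```

Here `base` is Any and `candidates` holds the possible return types, which is what a tool would need in order to treat the annotation as AnyOf[str, bytes].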

gvanrossum avatar Sep 20 '20 16:09 gvanrossum

I brought this up on typing-sig.

srittau avatar Sep 21 '20 13:09 srittau

Semantically, would AnyOf be equivalent to Intersection as proposed in #213? My intuition is yes: an operation on an AnyOf type should be valid if it is valid on at least one of the component types.

JelleZijlstra avatar Aug 11 '21 19:08 JelleZijlstra

Guido's thoughts on the subject: https://mail.python.org/archives/list/typing-sig@python.org/message/TTPVTIKZ6BFVWZBUYR2FN2SPGB63Z7PH/ ~~edited out misleading tldr~~

There's probably also some slightly different behaviour when intersecting slightly incompatible types. E.g., for an intersection maybe you'd want to treat intersection order like an MRO, but for AnyOf you'd probably want "is compatible with any of the intersection"

hauntsaninja avatar Aug 11 '21 20:08 hauntsaninja

I see, thanks for reminding me of that email! I suppose this matters when you're implementing a function with an AnyOf return type. In typeshed we could just write:

def some_func() -> Intersection[bytes, str]: ...

And it would work as expected.

But when implementing it, you'd write:

def some_func() -> Intersection[bytes, str]:
    if something:
        return bytes()
    else:
        return str()

And a type checker would flag the return type as incompatible. So in Guido's terminology, AnyOf would have to behave like Union in a "receiving" position and like Intersection in a "giving" position.

JelleZijlstra avatar Aug 11 '21 20:08 JelleZijlstra

Maybe I'm misunderstanding how intersections are supposed to work, but to me an intersection type is a type that fulfills the protocol of multiple other types. At least that's how e.g. TypeScript and Ivan's initial comment in #213 describe it. Intersection[bytes, str] wouldn't make much sense to me, because it would mean that the returned type is both a str and a bytes. An intersection lets you "compose" multiple types into one, which is why I like the Foo & Bar syntax for it (also like TypeScript and in comparison to | for union).

srittau avatar Aug 11 '21 21:08 srittau

And that means that AnyOf has not much relation to intersections. Like Union, it's more meant to be an "either/or" situation. For example, the following would work with AnyOf, but not with Union (which is why AnyOf is unsafe, but still much safer than Any):

def foo(b: bytes): ...

x: AnyOf[str, bytes]
y: str | bytes
foo(x)  # ok
foo(y)  # error

srittau avatar Aug 11 '21 21:08 srittau

And sorry for the spam, but one last thought:

For the caller of a function, there is no difference whether an argument is annotated with AnyOf or Union. In fact, I can't think of a reason why an argument should ever be annotated with it. It's mostly a tool for return types.

srittau avatar Aug 11 '21 21:08 srittau

For the caller of a function, there is no difference whether an argument is annotated with AnyOf or Union. In fact, I can't think of a reason why an argument should ever be annotated with it. It's mostly a tool for return types.

It would be for the benefit of the callee. Conversely, for the callee there's no reason to return an AnyOf, since for them a Union works as well.

Taking your example, the connection between AnyOf and Intersection is that if we had

def foo(b: bytes): ...

x: str & bytes
foo(x)

would work as well. But presumably, to give x a value, you'd want either of these to work:

x: AnyOf[str, bytes]
x = ""  # ok
x = b""  # also ok

And there it behaves like Union. Combining this, we can have:

def foo(b: bytes): ...
x: AnyOf[str, bytes]

# This works:
x = b""
foo(x)

# This works too (i.e. doesn't cause a static type error):
x = ""
foo(x)

At this point I would just start repeating what I said in that email, so I'll stop here.

gvanrossum avatar Aug 11 '21 23:08 gvanrossum

So to recap. For argument types:

# For the callee (receiver), AnyOf and Intersection are equivalent:
def foo1(x: AnyOf[str, StringIO]):
    do_str_stuff(x.getvalue() if hasattr(x, "getvalue") else x)
def foo2(x: str & StringIO):
    do_str_stuff(x.getvalue() if hasattr(x, "getvalue") else x)
# But Union isn't:
def foo3(x: str | StringIO):
    do_str_stuff(x.getvalue() if hasattr(x, "getvalue") else x)  # error

# But for the caller (giver) AnyOf and Union are equivalent:
foo1("")  # AnyOf ok
foo2("")  # Intersection error
foo3("")  # Union ok

For return types:

# For the callee (giver), AnyOf and Union are equivalent:
def foo1() -> AnyOf[str, bytes]:
    return ""  # ok
def foo2() -> str & bytes:
    return ""  # error
def foo3() -> str | bytes:
    return ""  # ok

# For the caller (receiver), AnyOf and Intersection are equivalent:
x1: str = foo1()  # AnyOf ok
x2: str = foo2()  # Intersection ok
x3: str = foo3()  # Union error

Which means that in stubs (where there's no callee), Union and Intersection are sufficient, but AnyOf would still be needed for implementations. (Jelle's point, I think.)

srittau avatar Aug 12 '21 07:08 srittau

I don't understand why def foo() -> str & bytes should be any different from def foo() -> NoReturn. For -> str & bytes, a type checker could deduce that because an object can't be string and bytes at the same time, the function cannot return any value.

Akuli avatar Aug 12 '21 10:08 Akuli

I'd agree, though in pyright we explicitly have Never for that type (as I think NoReturn has different semantics, but maybe not). Comparing TS intersections with non-overlapping types:

image

(Overall, I agree that given the way it's been described, AnyOf is not an intersection type; more of a workaround for people not liking to return Union because existing code is too trusting of functions with behavior that's hard to capture with overloads, and never actually verifies that they got the thing they wanted.)

jakebailey avatar Aug 12 '21 20:08 jakebailey

Two more cases wherein AnyOf would be very useful:

Dealing with overload ambiguity

@overload
def func(a: Sequence[int]) -> str: ...
@overload
def func(a: Sequence[str]) -> int: ...

This recently came up in https://github.com/python/mypy/issues/11347: whenever ambiguous overloads are encountered, e.g. when Sequence[Any] is passed in the example function above, mypy will generally return Any as it cannot safely pick either one of the overloads. With AnyOf this could be replaced with, e.g., AnyOf[str, int], which would provide quite a bit more type safety compared to plain Any.
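A runnable sketch of the ambiguous call described above. The implementation body of `func` is hypothetical and exists only so the overloads can be executed; the statement about what mypy infers is taken from the description above, not verified here.

```python
from typing import Any, List, Sequence, Union, overload


@overload
def func(a: Sequence[int]) -> str: ...
@overload
def func(a: Sequence[str]) -> int: ...
def func(a: Sequence[Union[int, str]]) -> Union[str, int]:
    # Hypothetical body: join ints into a string, or sum the string lengths.
    if a and isinstance(a[0], int):
        return ",".join(str(item) for item in a)
    return sum(len(str(item)) for item in a)


untyped: List[Any] = [1, 2, 3]
# Both overloads match List[Any]; as described above, mypy generally
# falls back to Any here rather than picking str or int.
result = func(untyped)
```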

Reduction of numpy arrays

The second case is more related to a peculiarity of numpy, as operations involving numpy will rarely return 0D arrays, aggressively converting the latter into their corresponding scalar type. This becomes problematic when reductions are involved, especially ones over multiple axes, as this requires detailed knowledge of the original array-like objects' dimensionality.

While the variadics of PEP 646 (and any follow-ups) should alleviate this issue somewhat, there will, realistically, remain a sizable subset of array-like objects and axes (SupportsIndex | Sequence[SupportsIndex]) combinations wherein the best we can currently do is return Any. Replacing this with, for example, AnyOf[float64, NDArray[float64]] would be a massive improvement, especially since the signatures of numpy's scalar and array types have a pretty large overlap.

BvB93 avatar Oct 25 '21 15:10 BvB93

How would AnyOf[SomeClass, Any/Unknown] be treated?

I ask because of complex cases like this: https://github.com/python/typeshed/pull/9461 Where we could do the following without having to rely on installing all 4 libraries.

from PyQt6.QtGui import QImage as PyQt6_QImage  # type: ignore[import]
from PyQt5.QtGui import QImage as PyQt5_QImage  # type: ignore[import]
from PySide6.QtGui import QImage as PySide6_QImage  # type: ignore[import]
from PySide2.QtGui import QImage as PySide2_QImage  # type: ignore[import]

def foo() -> AnyOf[PyQt6_QImage, PyQt5_QImage, PySide6_QImage, PySide2_QImage]: ...

Or in a non-stub file with type inference:

try:
    from PyQt6.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass
try:
    from PyQt5.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass
try:
    from PySide6.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass
try:
    from PySide2.QtGui import QImage  # type: ignore[import]
except ImportError:
    pass

# If inference is not feasible, an explicit AnyOf return type like above would do.
def foo():
    return QImage()

Avasam avatar Jan 07 '23 02:01 Avasam

Thoughts for a different approach: if AnyOf is really only useful for being permissive on return types (and avoiding a bunch of manual type narrowing), then could type checkers simply have an option to treat unions in return types as permissive unions?

This way you can keep annotating the types accurately. No need for a new user-facing type to juggle with. And let the users choose whether they want total strictness or be more permissive.

Pytype tends to err on the permissive side. For mypy, this can probably already be done with a plugin.

Avasam avatar Jan 19 '23 13:01 Avasam

Sometimes you want to return the usual union: if a function returns str | None and you forget to check for None, that should be an error.
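A small sketch of that case, using a hypothetical lookup function `find_email` and made-up data. With a strict union the checker forces the None check; a permissive union would let the unchecked access through.

```python
from typing import Dict, Optional

# Made-up data for illustration only.
_EMAILS: Dict[str, str] = {"alice": "alice@example.com"}


def find_email(user: str) -> Optional[str]:
    """Hypothetical lookup: returns None for unknown users."""
    return _EMAILS.get(user)


email = find_email("bob")
# Calling email.split("@") directly is rejected by a strict checker,
# and would raise AttributeError at runtime because email is None here.
domain = email.split("@")[1] if email is not None else None
```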

Akuli avatar Jan 21 '23 12:01 Akuli