An issue with forward declarations and recursive type

Open mitar opened this issue 6 years ago • 25 comments

Example:

import typing

from pytypes import type_util

Container = typing.Union[
    typing.List['Data'],
]

Data = typing.Union[
    Container,
    str, bytes, bool, float, int, dict,
]

type_util._issubclass(typing.List[float], Container)
Traceback (most recent call last):
  File "<redacted>/lib/python3.6/site-packages/pytypes/type_util.py", line 1387, in _issubclass_2
    return issubclass(subclass, superclass)
TypeError: Forward references cannot be used with issubclass().

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 14, in <module>
    type_util._issubclass(typing.List[float], Container)
TypeError: Invalid type declaration: float, _ForwardRef('Data')

mitar avatar Dec 19 '17 01:12 mitar

I think currently forward declarations are only supported directly in annotations. Not sure if a mix of types and forward declarations is supported, but I suppose it's not (yet). Currently you can only use things like

def some_function(a: "a_type_defined_later"): ...

Putting the whole type declaration into a string might work, but I suppose forward declarations are currently parsed in get_type_hints or so. I'm not sure when I'll find time to implement this. PRs welcome...
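As a rough illustration of the annotation case mentioned above, the sketch below uses only the standard library's typing.get_type_hints; it does not show pytypes' own code path. The names some_function and LaterType are made up for the example, and the namespace is passed explicitly only to make visible where the string is looked up.

```python
import typing


def some_function(a: "LaterType") -> None:
    """The annotation is a string because LaterType does not exist yet."""


class LaterType:
    pass


# The string is evaluated in the namespace handed to get_type_hints.
hints = typing.get_type_hints(some_function, globalns={"LaterType": LaterType})
print(hints["a"] is LaterType)
```

Without a namespace that contains LaterType, the same call would fail with a NameError, which is the context problem that comes up later in this thread.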

Stewori avatar Dec 19 '17 01:12 Stewori

I quickly looked into the code. The relevant position is at https://github.com/Stewori/pytypes/blob/15ec80c27e6f933b85ebdf4a6885a5c62552a50f/pytypes/type_util.py#L836. That code detects a string, if there is one, and resolves the forward declaration. We could add such a case to _issubclass, but then it would resolve the forward declaration again and again for every new subtype check. Alternatively, _issubclass could be allowed to modify the incoming type, or we could have some kind of cache.

A problem with resolving forward declarations within _issubclass is that _issubclass usually does not know the context, i.e. the module where the declaration is defined. It can maybe guess it from the caller; I suspect it is not possible to get the enclosing module from a type, especially not if it is one pure forward declaration string. If you have an idea how _issubclass should/could retrieve that info, let me know. Or maybe we should add yet another optional arg that names the module to use for resolving forward declarations.

Maybe the best idea would be to have a public service function (e.g. resolve_forward_decl) that resolves forward declarations in a type and also takes the module as explicit argument, returning the resolved type. Then _issubclass simply requires that a type containing forward declarations was previously processed by that function. An appropriate error msg would point the user to this.

Also note that resolving a forward ref, or even just checking whether a type contains one somewhere internally, is expensive and in most cases not necessary. With an explicit service function the user can actively and transparently resolve types as needed. What do you think?
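A minimal sketch of what such a service function could look like, assuming the module is passed in explicitly. The name resolve_in_module and the stand-in modules are hypothetical, the helper is not pytypes' implementation, and it only handles a bare string or a single ForwardRef (typing.ForwardRef is the public name of _ForwardRef since Python 3.7); nested types are left out.

```python
import types
import typing


def resolve_in_module(tp, module):
    """Hypothetical helper: resolve a pure forward declaration in `module`.

    Only a bare string or a single ForwardRef is handled; nested types are not.
    """
    if isinstance(tp, typing.ForwardRef):
        tp = tp.__forward_arg__
    if isinstance(tp, str):
        # The module supplies the context that a bare string lacks.
        return eval(tp, dict(vars(module)))
    return tp


# Two stand-in modules that bind the same name to different types.
mod_a = types.ModuleType("mod_a")
mod_a.Data = int
mod_b = types.ModuleType("mod_b")
mod_b.Data = str

print(resolve_in_module("Data", mod_a))  # resolves to int
print(resolve_in_module("Data", mod_b))  # resolves to str
```

The same string resolves to different types depending on the module, which is why a subtype check that receives only the type cannot reliably guess the context.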

Stewori avatar Dec 19 '17 02:12 Stewori

I think the public service function would be the best. It just has to make sure it resolves recursive types through reusing the same reference instead of copying the type over again and again. But yes, I think this would be the best and could also be done only once/cached on the caller's side.

mitar avatar Dec 19 '17 03:12 mitar

Any chance of implementing this soon, or of giving me some pointers on how to do it? I thought I could make a workaround for us, but it does not seem to work, and this has started blocking a code validator of ours that uses this package to check types. :-(

mitar avatar Dec 23 '17 23:12 mitar

Sorry, I was rather busy during Christmas time. It would be great to have some help on this one. I have already given some thought to implementing the resolve function. I'll try to put together a draft we can iterate on...

Stewori avatar Dec 24 '17 04:12 Stewori

Yea, I completely understand about holidays. I tried to finish this before them, but it seems I got stuck now. Anyway, don't worry, I will try to see what I can do.

mitar avatar Dec 24 '17 05:12 mitar

So, I spontaneously committed some work on this as of https://github.com/Stewori/pytypes/commit/0f454aec2cc2ba92efa7dd3aed2215a2cc986394. This should help with your issue for now. However, it may not yet handle callables correctly, and it might still run into recursion issues. The example from above works now if you do it like this:

import typing
from typing import Sequence, Union

import pytypes

Container = Union[Sequence['Data'], int]  # this would not work with List because its type parameter is invariant
Data = Union[Container, str, bytes, bool, float, int, dict]
pytypes.resolve_fw_decl(Container)  # a helpful exception text will show up if this line is left out
pytypes.is_subtype(typing.List[float], Container)  # True
pytypes.is_subtype(typing.List[complex], Container)  # False

Note that you still need Sequence here because List's type parameter is invariant, as discussed in the other thread.
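To illustrate why List is treated as invariant while Sequence is not, here is a small standard-library sketch; the function names are made up for the example and nothing in it involves pytypes. At runtime Python performs no check, so the unsafe call goes through and shows the problem a checker is meant to prevent.

```python
from typing import List, Sequence, Union


def append_text(items: List[Union[float, str]]) -> None:
    # Legal for the declared type, but it corrupts a list meant to hold floats.
    items.append("oops")


def total(items: Sequence[Union[float, str]]) -> int:
    # Sequence offers no mutation, so accepting a List[float] here is safe.
    return len(items)


floats: List[float] = [1.0, 2.0]
append_text(floats)   # a static checker would reject this call
print(floats)         # the list of floats now holds a str
print(total(floats))
```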

Stewori avatar Dec 24 '17 07:12 Stewori

You are amazing! I will check it immediately!

mitar avatar Dec 24 '17 07:12 mitar

Hm, what is the second return value from resolve_fw_decl? Why is that necessary?

mitar avatar Dec 27 '17 00:12 mitar

Oh, I see, I do not even have to check return values. Cool.

mitar avatar Dec 27 '17 01:12 mitar

Works like a charm! Thanks.

mitar avatar Dec 27 '17 01:12 mitar

Care to release a version with this improvement?

mitar avatar Dec 27 '17 10:12 mitar

The first return value is the type itself. In case the input was one pure forward declaration as a string this is relevant. The second return value is a boolean indicating if a change was made, i.e. it is true if at least one actual forward declaration was found and resolved. This is to help people assess if they should eventually repeat or update operations or hashes or whatever they might have already done with their type. It's also useful for testing.

There are actually some remaining todos on this issue (so reopening it for now...):

  • support callables in resolve_fw_decl (done as of https://github.com/Stewori/pytypes/commit/3a65f6ddff7e6009d840ef70706a7f84349565be)
  • use resolve_fw_decl in https://github.com/Stewori/pytypes/blob/15ec80c27e6f933b85ebdf4a6885a5c62552a50f/pytypes/type_util.py#L836 (done as of https://github.com/Stewori/pytypes/commit/3846df7c364eb21ca157b25de021bad5e0ebd5ce)
  • write doc for resolve_fw_decl
  • write test for resolve_fw_decl
  • write test for is_subtype concerning typing._ForwardRef
  • add support for typing._ForwardRef in type_util.type_str (done as of https://github.com/Stewori/pytypes/commit/3fb6266b5fbdd6913f92357c36ec478f02812644)
  • do some reasoning and checking regarding potential endless recursion in some typechecking scenarios (done)
  • write test for endless recursion case
  • support typing._ForwardRef in stubfile_2_converter

Stewori avatar Dec 27 '17 12:12 Stewori

Care to release a version with this improvement?

I want to have a new release soon, but would like to get some more stuff in first, e.g. at least some of the todos listed above, fixes/resolutions for https://github.com/Stewori/pytypes/issues/23, https://github.com/Stewori/pytypes/issues/18, https://github.com/Stewori/pytypes/issues/20, possibly https://github.com/Stewori/pytypes/issues/21, and the failures mentioned in https://github.com/python-attrs/attrs/issues/301#issuecomment-347552114. Help is welcome!

Stewori avatar Dec 27 '17 12:12 Stewori

Hm, it seems there is no protection against infinite recursion. I get stack overflow with this test:

import typing

from pytypes import type_util

Data = typing.Union['Container', float]
Container = typing.Union[Data, int]

type_util.resolve_fw_decl(Data)
type_util.resolve_fw_decl(Container)

type_util._issubclass(list, Container, bound_typevars={})

Of course it is obvious that these type definitions are buggy, but maybe resolve_fw_decl could validate that?

mitar avatar Dec 29 '17 08:12 mitar

maybe resolve_fw_decl could validate that?

Is it possible to have a recursion issue with a type not involving Union? I'm spontaneously not entirely sure.

If so, I think it would be better to make _is_subclass_Union recursion proof in general. I'm just wondering what would be the best way to do it (e.g. caching already checked types; maybe allow it to run at first a special _issubclass mode that does not follow forward references to have the succeed-fast potential).

Otherwise we would have to make _issubclass itself recursion proof (doable as well, but more costly regarding caching).

It would be good to have an example, if one exists, of a recursion issue not involving Union, both for better understanding and later for testing.

Stewori avatar Jan 01 '18 17:01 Stewori

Is it possible to have a recursion issue with a type not involving Union? I'm spontaneously not entirely sure.

I think Union + forward declarations.

Or Union types without forward declarations, if somebody manually "patches" Union objects in some other way.

If so, I think it would be better to make _is_subclass_Union recursion proof in general.

Are you sure? Keeping track might introduce additional performance cost. But from design perspective this is probably cleaner, yes.

I'm just wondering what would be the best way to do it

I would just add an extra argument to it, similar to the bound type variables, that stores the types currently in the process of being resolved.
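The idea of passing along the set of types currently being checked can be sketched on a toy model. This is not pytypes' code: the function accepts, the in_progress parameter, and the representation of unions as a dictionary of alias names are all assumptions made for illustration. The two aliases mirror the cyclic Data/Container example above.

```python
from typing import Dict, FrozenSet, Tuple, Union

# A union member is a concrete class or the name of another alias.
Member = Union[type, str]


def accepts(
    cls: type,
    alias: str,
    aliases: Dict[str, Tuple[Member, ...]],
    in_progress: FrozenSet[str] = frozenset(),
) -> bool:
    """Toy check: does the union named `alias` accept `cls`?"""
    if alias in in_progress:
        # Cycle: this alias is already being checked further up the stack.
        return False
    in_progress = in_progress | {alias}
    for member in aliases[alias]:
        if isinstance(member, str):
            if accepts(cls, member, aliases, in_progress):
                return True
        elif issubclass(cls, member):
            return True
    return False


aliases = {"Data": ("Container", float), "Container": ("Data", int)}
print(accepts(list, "Container", aliases))   # terminates despite the cycle
print(accepts(float, "Container", aliases))  # found through Data
```

Returning False on re-entry is one possible policy; raising an exception for a cyclic definition would be an equally simple variant of the same guard.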

mitar avatar Jan 02 '18 21:01 mitar

I think Union + forward declarations.

I mean are there examples without Union, i.e. is this issue specific to Union checking or do we have to think more general?

Are you sure? Keeping track might introduce additional performance cost. But from design perspective this is probably cleaner, yes.

My philosophy in pytypes is to allow the user to opt out of such debatable stuff. Aside from that, I value correctness higher than performance (in production, one would disable typechecking anyway, I suppose). That's also why I am thinking of a fast-succeed mode.

Stewori avatar Jan 02 '18 21:01 Stewori

I mean are there examples without Union, i.e. is this issue specific to Union checking or do we have to think more general?

I cannot imagine a case without Union. This is the only current type which allows multiple options, no?

mitar avatar Jan 02 '18 21:01 mitar

Any solution to this would require at least one more new arg for _issubclass and thus for the whole family of type checking functions. I wonder if it would be better to wrap all these args into some kind of class, like type_memo in typeguard.

I cannot imagine a case without Union

Me neither. I hope we don't overlook something.

Stewori avatar Jan 02 '18 22:01 Stewori

As of https://github.com/Stewori/pytypes/commit/74ef4b9babfe3b1d70b0d547e8e8eea575dcdd1b _issubclass performs recursion-checks if a _ForwardRef is encountered. As far as I tested, this solves the recursion issue you observe. Would be good if you could confirm this. Final todo for this issue is to add tests. Help welcome...

Stewori avatar Jan 16 '18 21:01 Stewori

I have not encountered this in practice, only while testing new code. So the example above is the only one I have.

Hm, but doesn't _issubclass no longer see _ForwardRefs once they have been resolved? Or am I misunderstanding your comment? Or does resolving not get rid of _ForwardRefs?

The new code seems to work with my example above.

mitar avatar Jan 16 '18 22:01 mitar

No, it actually does not get rid of _ForwardRef. After looking more closely into _ForwardRef I learned that it provides fields to store the referenced type, i.e. __forward_arg__, __forward_value__ and __forward_evaluated__. On resolving, __forward_value__ is filled with the actual type, while __forward_arg__ provides its string representation. Using these fields seems to be the official way, and it avoids the hassle of replacing a type within a potentially complex structure of PEP 484 types. The downside is that it allows for cycles, but that's not my invention. _issubclass should now be cycle proof in this regard, and I think without much additional overhead.
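A small standard-library sketch of how the forward reference stays inside the type. It uses typing.ForwardRef, the public name of _ForwardRef since Python 3.7, and typing.get_args (Python 3.8+). It only reads __forward_arg__ and deliberately does not rely on __forward_value__ or __forward_evaluated__.

```python
import typing

Data = typing.Union["Container", float]
Container = typing.Union[Data, int]

# typing wraps the string in a ForwardRef object and keeps it in the Union.
ref = typing.get_args(Data)[0]
print(type(ref).__name__, ref.__forward_arg__)

# Container absorbs Data's members, so it still carries a reference to itself.
print(typing.get_args(Container))
```

Because the reference to Container remains a member of Container's own arguments, following it during a subtype check leads back to the starting type, which is the cycle discussed above.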

Stewori avatar Jan 17 '18 00:01 Stewori

Awesome!

mitar avatar Jan 17 '18 00:01 mitar

As of 74ef4b9 _issubclass performs recursion-checks if a _ForwardRef is encountered. As far as I tested, this solves the recursion issue you observe.

So it currently returns False in such a case, no? Wouldn't raising an exception be better? After all, this is a faulty type to begin with.

mitar avatar Apr 05 '18 16:04 mitar