Introduce an Unknown type

Open Eyal-Shalev opened this issue 1 year ago • 3 comments
Unknown Definition

A type that can only be cast to Any

Problem

Currently, when type checkers fail to understand the type of a function or variable, they fall back on Any.

from typing import Callable, TypeVar

T = TypeVar("T")

def foo(fn: Callable[[], T]) -> T:
  return fn()

bar = foo(lambda: 5)

# The type of bar will fall back on `Any`, so the type checker will not warn on misuse
_ = bar[0]

Suggestion

If an Unknown type is introduced and the checker can be configured to use it as the fallback, then every operation performed on it that assumes a specific type will fail.

from typing import Callable, TypeVar

T = TypeVar("T")

def foo(fn: Callable[[], T]) -> T:
  return fn()

bar = foo(lambda: 5)

# The checker will mark this as an error because subscripting is not defined on `Unknown`.
_ = bar[0]

Inspirations

TypeScript: https://www.typescriptlang.org/docs/handbook/type-compatibility.html#any-unknown-object-void-undefined-null-and-never-assignability

any and unknown are the same in terms of what is assignable to them, different in that unknown is not assignable to anything except any.

unknown and never are like inverses of each other. Everything is assignable to unknown, never is assignable to everything. Nothing is assignable to never, unknown is not assignable to anything (except any).

Eyal-Shalev avatar Aug 08 '24 06:08 Eyal-Shalev

There already is an "unknown" type in Python: object. JavaScript, and by extension TypeScript, doesn't have a universal base type by virtue of having scalar types, making a synthetic unknown type necessary.

That said, I think some type checkers already have an internal "unknown" type although that has a slightly different purpose.
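To illustrate the point about object, here is a minimal sketch that reuses the foo example from the issue. The explicit `bar: object` annotation and the `first_item` helper are assumptions added for illustration; they are not part of the original proposal. With the value declared as object, a type checker is expected to reject `bar[0]` until the value has been narrowed.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def foo(fn: Callable[[], T]) -> T:
    return fn()


# Declaring the result as `object` (an illustrative choice) means a checker
# should no longer accept arbitrary operations such as `bar[0]` on it.
bar: object = foo(lambda: 5)


def first_item(value: object) -> object:
    """Hypothetical helper: subscript only after narrowing with isinstance."""
    if isinstance(value, (list, tuple, str)) and len(value) > 0:
        return value[0]
    # Anything else is treated as opaque.
    return None
```

Calling `first_item(bar)` returns None here because an int is not one of the narrowed sequence types, while `first_item([7, 8])` returns the first element.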

srittau avatar Aug 08 '24 07:08 srittau

As @srittau said, the object type serves a similar purpose in Python's type system.

Pyright implements an Unknown type to distinguish between an explicit Any (one that comes from an Any type expression) and an implicit Any (one that is generated by a type checker when a type is unspecified, a TypeVar cannot be solved, an import cannot be resolved, etc.). For details, refer to the pyright documentation.

Pyright's concept of Unknown differs from TypeScript's unknown in that: 1) Unknown cannot be used in a type expression because it is always implicit and 2) assignability rules for Unknown are the same as Any.

Here's a sample in pyright playground

erictraut avatar Aug 08 '24 07:08 erictraut

  1. Thank you. I never checked Pyright, so wasn't aware it is so much better than MyPy 🙇 - I'm going to try and move my team from MyPy to Pyright.
  2. I think that having the unknown type as a PEP (and thus in the reference implementation i.e. MyPy) is very valuable.

Eyal-Shalev avatar Aug 08 '24 23:08 Eyal-Shalev

I don't find the use of "object" logical, and it can be confusing for developers. Type checkers already use the word "Unknown" when they don't know the type of a variable, not "object". Creating an "Unknown" type is much more appropriate in every case.

vtgn avatar Oct 17 '24 11:10 vtgn

I don't think there's much value in creating a second "unknown" type that's virtually equivalent to object. As pointed out by erictraut, the unknown type used internally by pyright has different semantics than what's proposed here.

srittau avatar Oct 17 '24 12:10 srittau

object is absolutely not equivalent to Unknown, because the object type declares several properties and methods that don't exist on values like None, int, and str. So if you have:

o: object = 3
print(o.__dict__) # => no problem for static typing, it exists for object type

but at runtime it raises: AttributeError: 'int' object has no attribute '__dict__'. Did you mean: '__dir__'?

It's totally wrong to say that object is an equivalent of Unknown, because it is clearly NOT, as shown above! An Unknown type is absolutely necessary to force the developer to check the contents of the value before calling a property or method on it.

vtgn avatar Oct 17 '24 16:10 vtgn

>>> isinstance(3, object)
True
>>> isinstance(None, object)
True

That not all types have a __dict__ attribute is true, and unfortunately not representable using the type system. But this is unrelated to None, int and other builtins being "special". The same is true for other types implemented in extension modules:

>>> import re
>>> re.compile("").__dict__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 're.Pattern' object has no attribute '__dict__'. Did you mean: '__dir__'?
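As a rough runtime check of the same point, the sketch below probes a few objects for an instance __dict__. The classes `Plain` and `Slotted` and the helper `has_instance_dict` are hypothetical names introduced only for this illustration.

```python
import re


class Plain:
    """Ordinary user-defined class: instances get a __dict__."""


class Slotted:
    """Class using __slots__: instances have no __dict__."""
    __slots__ = ("x",)


def has_instance_dict(obj: object) -> bool:
    # hasattr is a runtime probe; static checkers currently assume True.
    return hasattr(obj, "__dict__")


results = {
    "int": has_instance_dict(3),
    "None": has_instance_dict(None),
    "re.Pattern": has_instance_dict(re.compile("")),
    "Plain": has_instance_dict(Plain()),
    "Slotted": has_instance_dict(Slotted()),
}
```

Only the `Plain` instance reports a __dict__; the builtins, the compiled pattern, and the slotted instance do not.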

srittau avatar Oct 17 '24 17:10 srittau

OMG!!! For me, this is clearly a huge inconsistency in the Python language, violating the most basic rules of static typing. >___<° !!!! I understand it must have been hard to add static typing to a language that didn't have it, but that choice was a huge mistake. What a shame!

vtgn avatar Oct 17 '24 17:10 vtgn

One thing we could try is adding __dict__: None annotations to such classes, similar to what we do with unhashable classes (where we use __hash__: ClassVar[None]).
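For readers unfamiliar with the unhashable-class pattern mentioned here, this is a small sketch of what `__hash__: ClassVar[None]` corresponds to at runtime. The `Point` class and the `is_hashable` helper are hypothetical; the `__dict__: None` idea itself is only a suggestion in this thread and is not shown.

```python
from typing import ClassVar


class Point:
    # Stub-style declaration that instances are unhashable. At runtime,
    # Python itself sets __hash__ to None because __eq__ is defined
    # without a matching __hash__.
    __hash__: ClassVar[None]  # type: ignore[assignment]

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


def is_hashable(obj: object) -> bool:
    """Hypothetical helper: report whether hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True
```

With the declaration in place, a type checker can flag uses such as putting a `Point` in a set, which matches the runtime behaviour where `hash(Point(1, 2))` raises TypeError.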

srittau avatar Oct 17 '24 17:10 srittau

@srittau I understand it is hard to fix this kind of type inconsistency because of historical features of Python. :(

The object class is a parent class of every object, but its declared members are not inherited by all of its subclasses. :/ In practical terms, this does not completely violate the rules of inheritance, because it amounts to declaring a member in a parent class that is redefined to throw an exception in a child class. It is not clean, but it is correct.

Even if we don't use the object type in our code, the fact that all classes inherit from it automatically implies that its members are available on all classes, so avoiding it doesn't fix the problem.

To fix this problem without developing a Python 4 that implements a proper typing overhaul, the static typing tools should ignore the members declared on the object class. I don't know whether other builtin classes have the same problem of declared members not being inherited by subclasses, but the tools should take all of them into account and keep only the members that actually exist.

I see no other simple solution... :/

vtgn avatar Oct 17 '24 18:10 vtgn

That not all types have a __dict__ attribute is true, and unfortunately not representable using the type system

Not currently but why would introducing an Unknown type not solve this?

septatrix avatar Nov 21 '24 13:11 septatrix

I agree. All objects inherit from object, OK, but why not create this special Unknown type to tell the linters that nothing is declared on objects of this type? Until we do isinstance(unknown_object, object), unknown_object will be seen as having no declared members.

The other solution would be to implement a new typing module that wraps all the types of the standard library to expose only the members that actually exist.

vtgn avatar Nov 21 '24 13:11 vtgn

why would introducing an Unknown type not solve this?

It wouldn't solve this because it's not coherent in any type system sense.

A Python type represents a set of possible runtime Python objects. object represents the set of all possible Python objects that could ever exist. isinstance(x, object) is always true. Every possible Python object is already part of the type object. So what set of runtime Python objects could the type unknown represent, other than being exactly equivalent to object? What, precisely, would the type unknown include that is not included in the type object?

And it also wouldn't solve the problem, because if you did isinstance(x, object) and then had something typed as object, typeshed would still claim it has the __dict__ attribute, and that still would be wrong for some objects. So adding a new type would do nothing to solve the actual problem with object.__dict__.

The problem with __dict__ is a typeshed problem. Typeshed chooses to claim that the object type has the attribute __dict__, because most Python objects have it, even though it is not true that all Python objects have it. It would be very easy to change that in typeshed, but probably it would mean a ton of new false positives on code that accesses __dict__ on instances that are known to definitely have it (which is all instances of all user-defined Python classes that don't use __slots__.) So it's not clear that would be better than the status quo.

Accurately reflecting the existence or non-existence of the __dict__ attribute would be a much more complex feature, that would require type checkers to understand a number of esoteric runtime rules. It's certainly not impossible to add it, but adding a new name for the object type wouldn't help.

carljm avatar Nov 21 '24 20:11 carljm

why would introducing an Unknown type not solve this?

It wouldn't solve this because it's not coherent in any type system sense.

Counterexample: TypeScript has an unknown type, so it is coherent in at least one type system.

The Unknown type has no value at runtime. It is only used by type checkers to alert developers who try to perform type-specific actions before checking what type the value has.

from untyped.library import value

def inc(num: int) -> int:
  return num+1

print(inc(value))  # The type system should warn about this use.

In the above example, MyPy considers value to be of type Any, which means it won't warn about misuse. This is what my proposal tries to solve.

Note that there are other ways to make the type system default to Any, such as using lambdas.
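One way to approximate the desired behaviour today is to annotate the untyped value as object at the import boundary. The sketch below is only an illustration under assumptions: `load_untyped_value` is a hypothetical stand-in for `from untyped.library import value`, and `safe_inc` is a made-up helper. With the annotation, checkers such as mypy or pyright are expected to reject `inc(value)` until the value is narrowed.

```python
from typing import Any


def load_untyped_value() -> Any:
    """Hypothetical stand-in for a value imported from an untyped library."""
    return "5"


def inc(num: int) -> int:
    return num + 1


# The explicit annotation replaces the implicit Any with object, so passing
# `value` straight to inc() should be reported as a type error.
value: object = load_untyped_value()


def safe_inc(candidate: object) -> int:
    """Hypothetical helper: narrow to int before calling inc()."""
    if isinstance(candidate, int):
        return inc(candidate)
    raise TypeError(f"expected int, got {type(candidate).__name__}")
```

Here `safe_inc(4)` returns 5, while `safe_inc(value)` raises TypeError at runtime because the stand-in value is a string.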

Eyal-Shalev avatar Nov 22 '24 07:11 Eyal-Shalev

This is what my proposal tries to solve.

If you say that the object type solves this issue, then I say that type systems like MyPy should default to object instead of Any.

Eyal-Shalev avatar Nov 22 '24 07:11 Eyal-Shalev

If you say that the object type solves this issue,

Yes, the object type is the correct type to use for "some arbitrary Python object about which nothing more specific is known."

then I say that type systems like MyPy should default to object instead of Any.

That's a reasonable point of view; there are tradeoffs and different users in different scenarios will have different preferences. It's the sort of thing that can potentially be a strictness option. One note is that if the missing import itself is considered to be a type error, then also defaulting it to object can cause a lot of cascading type errors, obscuring the root issue that should be fixed. Falling back to object makes more sense if the missing import has been "accepted" somehow (e.g. in type checker configuration) and is not itself considered an error.

But since the behavior of missing imports is not specified, discussion of this behavior belongs in the issue tracker of individual type checkers.

carljm avatar Nov 22 '24 16:11 carljm

@carljm

I already said above that the big problem with using the object type is that it declares members that don't exist on all of its subclasses. Python's typing is not consistent, and it would be good if new types could fix these problems. Using object for typing is clearly one of them, because it violates the rules of polymorphism (all subclasses/subtypes must inherit all the members of their parent classes/types). To fix this, creating an Unknown type like TypeScript's is a perfect solution. I don't understand what the problem with that is; it's just a new type that declares nothing and accepts any instance. Plus, it would help static type checkers by giving them an existing type to use when they have no information about the type of an object. Using the Any type by default in this case is a huge mistake, the worst mistake! The official Python extension in VS Code already displays "Unknown" when it can't type an object; it's just ridiculous that this is not an existing type: [image]

vtgn avatar Dec 12 '24 00:12 vtgn

I don't think there's much value in creating a second "unknown" type that's virtually equivalent to object.

What is requested here is explicitly not an Unknown type that is equivalent to object. As you pointed out, object implies the existence of some attributes, and removing them would be hard because it might surface many new errors in existing projects¹. What is proposed here is a new type that does away with these attributes that are implicitly assumed to exist.


¹You call these false positives but at the same time state that there exist objects which do not have e.g. __dict__, namely those using __slots__. Therefore these are not false positives but instead cases where the existing preconditions in the code would need to be improved.

Alternatively, we could add new protocols for all these implicit object attributes and mypy could introduce a new flag (e.g. strict_object) which, when enabled, disables any implicit attributes on objects for improved correctness. These protocols should then be used where currently object is used, and object could become the proper Unknown type it should have been all along.

septatrix avatar Dec 16 '24 00:12 septatrix

It's worth noting that __dict__ is not actually an attribute of object at all at runtime, i.e. hasattr(object(), '__dict__') is False. I think the fact that __dict__ is defined in the static type definition for object is really a hack in order to make the type checker assume any object may have a __dict__. In other words it was presumably a deliberate choice for "unknown" objects to be treated as having a __dict__ even though they may not (similar to how arbitrary objects are assumed to be hashable although they may not be).

rmccampbell avatar Oct 24 '25 03:10 rmccampbell

OMG you're right!! The problem is not Python but the linter, which suggests nonexistent members such as __dict__ and several others on a direct object instance. @_@ !!!

a = object()
print(dir(a))

['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__',
 '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', 
'__le__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', 
'__setattr__', '__sizeof__', '__str__', '__subclasshook__']

vtgn avatar Oct 24 '25 08:10 vtgn