
Support for automatic runtime type-checking

Open anthrotype opened this issue 8 years ago • 17 comments

This has been proposed and discussed in https://github.com/python-attrs/attrs/issues/215, as a possible use case for the newly added type argument to attr.ib() #239

quoting @hynek https://github.com/python-attrs/attrs/issues/215#issuecomment-347529479

a good first step would be to add generic class-wide validators (@attr.s(validator=whatever)) and then make type checking a special case of it, possibly with some syntactic sugar.

anthrotype avatar Nov 28 '17 14:11 anthrotype

Ok, as a continuation of previous discussion in #215... :)

@glyph I was trying to find https://github.com/aldanor/typo today but instead I ran across https://github.com/RussBaz/enforce which looks like it may be a fairly full-featured version of the same thing.

@chadrik There is also https://github.com/Stewori/pytypes May the best project win.

Just to clarify, the sole reason I've started one of my own was that all existing solutions (those including enforce and pytypes) were (a) slow and (b) wrong -- although both are very good attempts and good inspiration. That being said, my version is not fully type-correct either when it comes to sum types (see examples below), but 'less wrong' if I may; on the bright side, it's fast. I haven't spent any time on finishing it due to lack of motivation and time lately, but it could be done, maybe with some help. If anyone knows any other comparable or relevant projects - shout away, I'd personally be very interested.

TL;DR: it's hard to write a runtime type checker that's both fast and correct, especially if it aims to handle both typevars and sum types; although not impossible (I'm not sure one already exists at this moment however). Details below.


# pip install git+https://github.com/RussBaz/enforce.git
import enforce
# pip install git+https://github.com/aldanor/typo.git  (3.5 only; needs a few fixes)
import typo
# pip install git+https://github.com/Stewori/pytypes.git
import pytypes  
# pip install git+https://github.com/agronholm/typeguard.git
import typeguard

Simple example:

def simple(x: int, y: str): pass

simple_pytypes = pytypes.typechecked(simple)
simple_enforce = enforce.runtime_validation(simple)
simple_typeguard = typeguard.typechecked(simple)
simple_typo = typo.type_check(simple)

args = 1, 'foo'
%timeit -r7 -n1000 simple_pytypes(*args)
%timeit -r7 -n1000 simple_enforce(*args)
%timeit -r7 -n1000 simple_typeguard(*args)
%timeit -r7 -n1000 simple_typo(*args)
314 µs ± 23 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
155 µs ± 2.14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
51.6 µs ± 13.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
497 ns ± 3.13 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

All four work correctly; compared with typo, typeguard is roughly 100x slower, enforce 300x, and pytypes 600x.


Slightly more involved:

from typing import List, Union, Dict, Tuple

def nested(x: Dict[Tuple[int, bytes], List[Union[str, float]]]) -> int: return 1

nested_pytypes = pytypes.typechecked(nested)
nested_enforce = enforce.runtime_validation(nested)
nested_typeguard = typeguard.typechecked(nested)
nested_typo = typo.type_check(nested)

x = {(1, b'3'): ['a', 1., 'b', 3.], (3, b'1'): [], (4, b'5'): ['c', 3.14]}
# %timeit -r7 -n1000 nested_pytypes(x)  # FAILS
%timeit -r7 -n1000 nested_enforce(x)
%timeit -r7 -n1000 nested_typeguard(x)
%timeit -r7 -n1000 nested_typo(x)
2.08 ms ± 23.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
176 µs ± 1.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
4.57 µs ± 169 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Three out of four work -- pytypes fails; compared with typo, typeguard is 40x slower and enforce 450x.


Simple generic example (with a catch):

from typing import TypeVar

A, B = TypeVar('A'), TypeVar('B')

def generic1(x: List[Union[A, int]], y: A): pass

generic1_pytypes = pytypes.typechecked(generic1)
generic1_enforce = enforce.runtime_validation(generic1)
generic1_typeguard = typeguard.typechecked(generic1)
generic1_typo = typo.type_check(generic1)

args = [1], 'b'  # valid signature (A=str)
# %timeit -r7 -n1000 generic1_pytypes(*args)  # FAILS
# %timeit -r7 -n1000 generic1_enforce(*args)  # FAILS
# %timeit -r7 -n1000 generic1_typeguard(*args)  # FAILS
%timeit -r7 -n1000 generic1_typo(*args)
4.19 µs ± 206 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Three out of four fail.


And finally...

def generic2(x: Union[A, B], y: A, z: B): pass

generic2_pytypes = pytypes.typechecked(generic2)
generic2_enforce = enforce.runtime_validation(generic2)
generic2_typeguard = typeguard.typechecked(generic2)
generic2_typo = typo.type_check(generic2)

args = 'a', 3, 'b'  # valid signature (A=int, B=str)
# %timeit -r7 -n1000 generic2_pytypes(*args)  # FAILS
# %timeit -r7 -n1000 generic2_enforce(*args)  # FAILS
# %timeit -r7 -n1000 generic2_typeguard(*args)  # FAILS
# %timeit -r7 -n1000 generic2_typo(*args)  # FAILS

All four fail, amen.

aldanor avatar Nov 28 '17 15:11 aldanor

I couldn't help but to notice the absence of my library (typeguard) which predates both pytypes and typo (and which pytypes has borrowed much of its code from).

agronholm avatar Nov 28 '17 15:11 agronholm

I just read through the scoping rules in PEP 484 and it certainly did not cover cases like this. How is a type checker supposed to bind the type variables when the first occurrence is in a Union?

agronholm avatar Nov 28 '17 15:11 agronholm

How is a type checker supposed to bind the type variables when the first occurrence is in a Union?

Not make final conclusions based on the first occurrence?..

aldanor avatar Nov 28 '17 16:11 aldanor

How, then, is it exactly supposed to make conclusions? This part was not explained in PEP 484.

agronholm avatar Nov 28 '17 16:11 agronholm

@agronholm I couldn't help but to notice the absence of my library (typeguard) which predates both pytypes and typo (and which pytypes has borrowed much of its code from).

Apologies -- I now remember your library, it's actually the fastest of all three :)

I've added typeguard tests to the examples above.

aldanor avatar Nov 28 '17 16:11 aldanor

How, then, is it exactly supposed to make conclusions? This part was not explained in PEP 484.

My intuition with a signature like (x: Union[A, B], y: A, z: B) and the input (str, int, str) would be this:

  • (a) First argument is a str, which means that either A is a str, or B is a str
  • (b) Second is an int, which means A is an int, which together with (a) implies that B is a str
  • (c) Third is a str, which happens to be consistent with (b) so everything's ok

You could slightly optimize it by first resolving non-sum-types (although it will not magically help in all cases; it can just make most of them faster):

  • (a) Second argument is an int => A = int
  • (b) Third argument is a str => B = str
  • (c) First argument is a str which is consistent with Union[int, str] => QED.

This is kind of what typo tries to do, but there's still quite a bit of work left, and there are some limitations.

If you resolve sum-types based on first occurrence, this basically implies that Union[A, B] is not resolved the same way as Union[B, A], which doesn't make much sense.
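The resolution order described above can be sketched as a small backtracking search. This is only an illustration, not how typo or any other library mentioned here is implemented: the names `solve`, `solve_all` and `check` are hypothetical, only `TypeVar`, `Union`, parametrized `List` and plain classes are handled, and a bound type variable is assumed to require an exact type match.

```python
from typing import Any, Dict, Iterator, List, TypeVar, Union, get_args, get_origin

A, B = TypeVar('A'), TypeVar('B')
Bindings = Dict[Any, type]  # maps each TypeVar to the concrete type bound to it


def solve(hint: Any, value: Any, bound: Bindings) -> Iterator[Bindings]:
    """Yield every extension of `bound` under which `value` matches `hint`."""
    if isinstance(hint, TypeVar):
        if hint not in bound:
            yield {**bound, hint: type(value)}  # first sighting: bind tentatively
        elif type(value) is bound[hint]:        # assumption: exact type match
            yield bound
    elif get_origin(hint) is Union:
        for alternative in get_args(hint):      # try every branch, not only the first
            yield from solve(alternative, value, bound)
    elif get_origin(hint) is list and get_args(hint):
        if isinstance(value, list):
            item_hint = get_args(hint)[0]
            yield from solve_all([(item_hint, item) for item in value], bound)
    elif isinstance(hint, type) and isinstance(value, hint):
        yield bound                             # plain class such as int or str


def solve_all(pairs, bound: Bindings) -> Iterator[Bindings]:
    """Match (hint, value) pairs in order, backtracking over tentative bindings."""
    if not pairs:
        yield bound
        return
    (hint, value), rest = pairs[0], pairs[1:]
    for candidate in solve(hint, value, bound):
        yield from solve_all(rest, candidate)


def check(hints, args) -> bool:
    """True if at least one consistent assignment of type variables exists."""
    return next(solve_all(list(zip(hints, args)), {}), None) is not None


print(check([Union[A, B], A, B], ('a', 3, 'b')))     # generic2 from above
print(check([Union[B, A], A, B], ('a', 3, 'b')))     # same call, union swapped
print(check([List[Union[A, int]], A], ([1], 'b')))   # generic1 from above
```

Because a failed tentative binding is simply abandoned and the next union branch is tried, Union[A, B] and Union[B, A] give the same answer. The cost is that the search can become exponential in the number of union branches, which is one reason a fast checker is hard to write.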

aldanor avatar Nov 28 '17 16:11 aldanor

So far I miss tests here that cover the case where type information is hosted in stub files, which are clearly and officially part of the PEP 484 specification. Also, no tests involving OOP constructs are shown: methods, classmethods, staticmethods and properties, not to mention inner classes. @aldanor I recommend filing issues for the failures you encountered in the respective projects. Only this way can the issues be solved.

These tests seem to focus rather heavily on performance, which is a secondary design goal at best for typechecking. Typechecking should be disabled outside of the testing and debugging phase.
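One way to make checking cost nothing when it is disabled is to let the decorator hand back the original function. The sketch below is hypothetical and not taken from any of the libraries discussed: `maybe_typechecked` and its `enabled` parameter are made-up names, the default reuses Python's `__debug__` flag (off under `python -O`) as an assumed switch, and only plain classes are checked.

```python
import functools
import inspect
from typing import get_type_hints


def maybe_typechecked(func, enabled=__debug__):
    """Hypothetical decorator: check argument types only when `enabled` is true."""
    if not enabled:
        return func  # no wrapper at all, so no runtime overhead

    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes are checked; generics such as List[int] are skipped.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


def simple(x: int, y: str):
    pass


checked = maybe_typechecked(simple, enabled=True)
unchecked = maybe_typechecked(simple, enabled=False)
```

With `enabled=False` the decorated name is the very same function object, so production code pays nothing for the annotations.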

Stewori avatar Dec 05 '17 11:12 Stewori

I'm pretty sure attrs can support this now, with no extra features. I'm willing to lend a hand to any author of a typechecking library to integrate into attrs.
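As a rough illustration of what that could look like, the sketch below walks `attr.fields()` in `__attrs_post_init__` and applies a plain `isinstance` check to every attribute that declares a `type`. The helper name `check_types` is hypothetical, and a real integration would delegate to one of the typechecking libraries instead, since `isinstance` cannot handle generics such as `List[int]`.

```python
import attr


def check_types(instance):
    """Hypothetical helper: isinstance-check every attribute that declares a type."""
    for field in attr.fields(type(instance)):
        value = getattr(instance, field.name)
        # field.type is None when no type was given; only plain classes work here.
        if field.type is not None and not isinstance(value, field.type):
            raise TypeError(
                f"{field.name} must be {field.type.__name__}, "
                f"got {type(value).__name__}"
            )


@attr.s
class Point:
    x = attr.ib(type=int)
    y = attr.ib(type=int)

    def __attrs_post_init__(self):
        # Runs once after the generated __init__ has set all attributes.
        check_types(self)
```

This only validates at construction time; later assignments to `x` or `y` are not checked in this sketch.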

Tinche avatar Apr 22 '18 17:04 Tinche

The typechecker should be kept exchangeable as no framework (for runtime typechecking) gets everything right yet. The fact that the typing module changes heavily from Python version to Python version makes it very challenging to keep up. E.g. Python 3.7 breaks everything again and I wasn't yet able to fix this for pytypes. Unfortunately this distracts from fixing the other issues.

Stewori avatar Apr 22 '18 19:04 Stewori

If I can add another wrench to this. Remember that issue with resolving the types that have strings in them? #265 This would be necessary for any kind of automatic type checking.

euresti avatar Apr 23 '18 14:04 euresti

pytypes can resolve these strings/forward references. Support for such strings occurring deeper within a type was added only recently, and no release has been made since then. See https://github.com/Stewori/pytypes/issues/22. pytypes also provides a service function pytypes.resolve_fw_decl that resolves forward references from a string or nested somewhere inside a type. Recursion proof.
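For comparison, the standard library offers `typing.get_type_hints`, which also resolves string annotations, including ones nested inside another type. The sketch below uses only that stdlib function and does not show the pytypes API; the `Node` class is a made-up example, and `localns` is passed explicitly only so the lookup does not depend on the surrounding scope.

```python
from typing import List, Optional, get_type_hints


class Node:
    # "Node" is a forward reference: the class does not exist yet at this point,
    # so the name has to be written as a string inside the annotation.
    def __init__(self, value: int, children: Optional[List["Node"]] = None) -> None:
        self.value = value
        self.children = children or []


# Evaluate the string annotation now that Node is defined.
hints = get_type_hints(Node.__init__, localns={"Node": Node})
print(hints["children"])
```

A runtime checker has to perform this resolution step before it can compare values against the annotation, because the raw annotation still contains the unresolved string.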

Stewori avatar Apr 23 '18 22:04 Stewori

There seems to be a new option for runtime type checks y'all: https://attrs-strict.readthedocs.io/en/latest/

hynek avatar Sep 30 '19 11:09 hynek

Or maybe let it run in setter: https://github.com/pwwang/attr_property ?

pwwang avatar Dec 16 '19 05:12 pwwang

There seems to be a new option for runtime type checks y'all: https://attrs-strict.readthedocs.io/en/latest/

Would it be possible to merge this, or are there licensing (or other) concerns?

ghost avatar Oct 16 '20 18:10 ghost

As it stands, I don't see a reason for us to merge it, especially because it would mean that we'd have to maintain it too. We're currently not looking for more maintenance burden. 🙃 We try to put our energy into making an ecosystem thrive; writing everything ourselves is unrealistic, alas.

hynek avatar Oct 17 '20 05:10 hynek

Thank you for your response. I absolutely understand where you're coming from here.

ghost avatar Oct 19 '20 13:10 ghost