Runtime type checking with beartype
Partially addresses #488 by using `beartype` to do runtime type checking.

Currently annotated part of the public API (specifically, explainers in the `alibi.explainers` subpackage and the methods `__init__`, `fit`, `explain`). The intention is to see if tests pass, but there are quite a few todos:
Essential:
- [x] Provide simple runtime type-checking for the public interface of `alibi.explainers`. Note that `beartype` is statistical - it assumes containers are homogeneous, thus this is not 100% correct or extensive type checking.
- [ ] Relax types to allow configuration-driven initialization of explainers. Specific example - relax `tuple` types to `Sequence[int]` so we can support things like #493.
  - ~Caveat: `Sequence` type is generic only from Python 3.9, so `Sequence[int]` fails for Python < 3.9. Unless we could use `Sequence` from `typing_extensions` (need to check this would work).~ I was wrong about this, the restriction is only on `collections.abc.Sequence`.
  - Relaxing the Python API may result in some (hopefully minor) user confusion as `shape: Tuple[int]` can be a lot more clear than `shape: Sequence[int]`. `pydantic` actually has the advantage here because using their decorator `@validate_arguments` we could keep `shape: Tuple[int]` but lists would be parsed to tuples automatically (see the sketch below).
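For illustration, a minimal sketch of that `@validate_arguments` coercion (pydantic v1 API; `set_shape` is a made-up function, not alibi code):

```python
from typing import Tuple

from pydantic import validate_arguments


@validate_arguments
def set_shape(shape: Tuple[int, ...]) -> Tuple[int, ...]:
    # pydantic parses compatible arguments to the annotated type before the call
    return shape


print(set_shape([1, 28, 28]))  # a JSON-style list is coerced to (1, 28, 28)
```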
Nice to have:
- [ ] Provide extensive/deep runtime type-checking for the public interface of `alibi.explainers`. `beartype` may gain this functionality in the future. Alternatively, we might want custom class methods doing this.
- [ ] Narrow down types where relevant and use `beartype` validator functionality, e.g. define custom types like `NDArray2D` to signify only 2-d `numpy` arrays are acceptable for `AnchorTabular` (see the validator sketch below).
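To make the validator idea concrete, a sketch assuming we'd build such a type on `beartype.vale` (the `NDArray2D` name and the `fit` function are illustrative, not existing alibi API):

```python
import numpy as np
from beartype import beartype
from beartype.vale import Is
from typing_extensions import Annotated  # typing.Annotated from Python 3.9 onwards

# A hypothetical custom type accepting only 2-d numpy arrays.
NDArray2D = Annotated[np.ndarray, Is[lambda arr: arr.ndim == 2]]


@beartype
def fit(X: NDArray2D) -> None:
    ...


fit(np.zeros((10, 3)))  # passes: a 2-d array
# fit(np.zeros(5))      # raises a beartype violation: a 1-d array is rejected
```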
To think about and/or out of scope:
- [ ] Validation beyond types - of course we also need custom logic to decide on invalid states arising from combinations of user arguments, which cannot be achieved by simple runtime type-checking even with validation extensions. We should standardize how this is done and what kinds of exceptions are raised (custom exceptions hierarchy?).
- [ ] Return types are also checked/validated by `beartype`. We could narrow down these types to have better control of the return API as well as catch implementation bugs. However, it's potentially not as useful as using `pydantic` for return types, because there is another use case for `Explanation` objects - namely, a persisted `Explanation` loaded back into Python for further inspection; this would need a "model" of the `Explanation` and some parsing logic (e.g. lists of lists get converted back into `numpy` arrays). `pydantic` is the natural choice here (see the earlier attempt https://github.com/SeldonIO/alibi/pull/342 and the sketch after this list).
- [ ] It's possible `pydantic` is a decent choice over `beartype` anyway, especially if we end up wanting it for return type schemas. Technically, `@validate_arguments` can do runtime checks and attempt parsing arguments to their expected types, but it wouldn't extend to more advanced validation like custom types with custom validators. That being said, we could always delegate custom validation to the library internals rather than rely on specific custom types like `NDArray2D`, though specific types (as long as it's clear what they are) are nice to document the expectations of the public API.
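To make the `Explanation` "model" idea above concrete, a rough sketch of what such parsing could look like (pydantic v1 API; the field name and structure are invented for illustration, not the real `Explanation` schema):

```python
import numpy as np
from pydantic import BaseModel, validator


class ExplanationData(BaseModel):
    class Config:
        arbitrary_types_allowed = True  # permit the np.ndarray field type

    # a persisted explanation stores arrays as nested lists in json
    anchor: np.ndarray

    @validator("anchor", pre=True)
    def lists_to_array(cls, value):
        # parse nested lists loaded from json back into a numpy array
        return np.asarray(value)


restored = ExplanationData(anchor=[[0, 1], [1, 0]])
assert isinstance(restored.anchor, np.ndarray)
```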
There appears to be some undesirable interaction with sphinx `autodoc` and `beartype` raising an error: `beartype.roar.BeartypeDecorHintNonpepException: @beartyped CEM.__init__() parameter "predict" wrapper parameter tensorflow.compat.v1.keras.Model not class.`
I think the culprit here is mocking modules such as `tensorflow` as `autodoc` is run, which then results in these runtime exceptions from `beartype` as the objects have been intercepted and the types no longer correspond to the expected ones. This is annoying to say the least. A potential solution may be to switch fully to `sphinx-apidoc` (see #471), which is a static doc builder that doesn't rely on importing the modules to be documented, although it would be a reasonably large switch and it remains to be tested whether it is mature enough.
Not related to `beartype`, but on the subject of `autodoc`, mocked modules, decorators etc., I also had trouble with documenting `torch.jit.script`-decorated functions in `alibi-detect`. Lines 319-332 in conf.py were required to fix it. I can't see this helping with the above issue, but this is more of a comment to say that I wonder if `sphinx-apidoc` would help with issues like this one too...
Thanks so much for investigating @beartype as your potential runtime type-checker. Supporting `alibi` would be quite exciting for us – so I'll do everything within my sad grab-bag of skills to satisfy your use case.
> Provide extensive/deep runtime type-checking for the public interface of `alibi.explainers`. `beartype` may gain this functionality in the future. Alternatively, we might want custom class methods doing this.
Yup. `beartype` currently:
- Shallowly type-checks almost everything that's PEP-compliant.
- Deeply type-checks many things that are PEP-compliant – but significant work remains to be done. We're tentatively targeting `beartype 1.0.0` (to be presumably released late next year) as the first version to deeply type-check all PEP 484- and 585-compliant type hints (e.g., `typing.Dict[str, int]`, `dict[str, int]`).
- Provides the beartype validator API, enabling you to preempt my slothful dev habits by implementing your own hand-rolled `O(1)` or `O(n)` deep type-checking type hints first. Admittedly, you shouldn't have to do that; that's the bare minimum that `beartype` should implicitly do for you.
We all pay the price for my negligence.
> I think the culprit here is mocking modules such as `tensorflow` as `autodoc` is run which then results in these runtime exceptions by `beartype` as the objects have been intercepted and the types no longer correspond to the expected ones.
Yikes. That's some fearsomely subtle interaction right there.
As suggested over here, one (possibly) sane solution might be for `beartype` to conditionally disable itself when `autodoc` is running. After all, is there a valid use case for runtime type-checking to even be performed during documentation building? According to this StackOverflow post, this should be trivial via this two-line snippet:
# If "sphinx-build" is currently running, Sphinx's "autodoc" extension
# is almost certainly importing "beartype" as an unintended side effect
# of documenting a downstream codebase via runtime introspection.
# Since this typically results in disaster, reduce @beartype to a noop.
if Path(argv[0]).name == "sphinx-build" or Path(argv[0]).name == "build.py":
beartype = lambda func: func
It would be nice for us to unit test that. Let's reflect on how that might happen.
Additionally, it should be feasible for `alibi` to manually kludge this on your end by monkey-patching `beartype` to the identity decorator from within your top-level `conf.py` Sphinx configuration: e.g.,
```python
# *VERY* early in your "conf.py" before you import from "alibi":
import beartype
beartype.beartype = lambda func: func  # <-- make it so, bro
```
@jklaise: You probably already know this, but all of the `beartype.roar.BeartypeCallHintPepParamException: @beartyped CounterfactualProto.__init__() parameter gamma="100" violates type hint <class 'float'>, as "100" not instance of float.` exceptions should be trivially resolvable by just replacing `float` with `numbers.Real` from Python's numeric tower.
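A minimal sketch of that replacement (the `set_gamma` function is made up for illustration; `numbers.Real` is an abstract base class satisfied by both `int` and `float`):

```python
import numbers

from beartype import beartype


@beartype
def set_gamma(gamma: numbers.Real) -> None:
    ...


set_gamma(100)    # an int satisfies numbers.Real, so this now passes
set_gamma(100.0)  # a float still passes, as before
```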
~~Relatedly, would you happen to have a handy traceback for that unreadable `beartype.roar.BeartypeDecorHintNonpepException: @beartyped CEM.__init__() parameter "predict" wrapper parameter tensorflow.compat.v1.keras.Model not class.` exception? The most recent `pytest` log output doesn't yield that. Sadness ensues.~~ :crying_cat_face:

~~Commit 5e23081e on our end resolves the erroneous `"wrapper parameter"` substring embedded in that exception, but the resulting exception is still unreadable.~~ I think my eye's gonna start twitching again. EDIT: Pretty sure I got it! That last sentence still applies, though.
> @jklaise: You probably already know this, but all of the `beartype.roar.BeartypeCallHintPepParamException: @beartyped CounterfactualProto.__init__() parameter gamma="100" violates type hint <class 'float'>, as "100" not instance of float.` exceptions should be trivially resolvable by just replacing `float` with `numbers.Real` from Python's numeric tower.
Nice catch! I think this is an example where the parameter really should've been a float - `beartype` paying dividends already.
@leycec here's a stack trace from a local docs build (I realize it's not on the CI job as `beartype` is not a dependency of the job):
```
WARNING: autodoc: failed to import module 'alibi'; the following exception was raised:
Traceback (most recent call last):
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/sphinx/ext/autodoc/importer.py", line 70, in import_module
    return importlib.import_module(modname)
  File "/home/janis/.conda/envs/py38/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/janis/PycharmProjects/public-fork-alibi/alibi/__init__.py", line 1, in <module>
    from . import confidence, datasets, explainers, utils
  File "/home/janis/PycharmProjects/public-fork-alibi/alibi/explainers/__init__.py", line 9, in <module>
    from .cem import CEM
  File "/home/janis/PycharmProjects/public-fork-alibi/alibi/explainers/cem.py", line 14, in <module>
    class CEM(Explainer, FitMixin):
  File "/home/janis/PycharmProjects/public-fork-alibi/alibi/explainers/cem.py", line 17, in CEM
    def __init__(self,
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_decor/main.py", line 159, in beartype
    func_wrapper_code = generate_code(func_data)
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_decor/_code/codemain.py", line 166, in generate_code
    code_check_params = _code_check_params(data)
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_decor/_code/codemain.py", line 373, in _code_check_params
    func_wrapper_code_param = pep_code_check_param(
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_decor/_code/_pep/pepcode.py", line 149, in pep_code_check_param
    reraise_exception_cached(
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_util/cache/utilcacheerror.py", line 200, in reraise_exception_cached
    raise exception.with_traceback(exception.__traceback__)
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_decor/_code/_pep/pepcode.py", line 132, in pep_code_check_param
    ) = pep_code_check_hint(hint)
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_util/cache/utilcachecall.py", line 351, in _callable_cached
    raise exception
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_util/cache/utilcachecall.py", line 343, in _callable_cached
    return_value = params_flat_to_return_value[params_flat] = func(
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_decor/_code/_pep/_pephint.py", line 1300, in pep_code_check_hint
    hint_curr_expr=add_func_scope_types(
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_decor/_code/_pep/_pepscope.py", line 343, in add_func_scope_types
    die_unless_hint_nonpep_type(hint=cls, exception_prefix=exception_prefix)
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_util/hint/nonpep/utilnonpeptest.py", line 276, in die_unless_hint_nonpep_type
    die_unless_type_isinstanceable(
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_util/cls/pep/utilpep3119.py", line 112, in die_unless_type_isinstanceable
    die_unless_type(
  File "/home/janis/.conda/envs/py38/lib/python3.8/site-packages/beartype/_util/cls/utilclstest.py", line 59, in die_unless_type
    raise exception_cls(f'{exception_prefix}{repr(cls)} not class.')
beartype.roar.BeartypeDecorHintNonpepException: @beartyped CEM.__init__() parameter "predict" wrapper parameter tensorflow.compat.v1.keras.Model not class.
```
@leycec I can't really think of a use case for running `beartype` whilst building docs either, and even if we wanted to take advantage of the importing side-effect that `autodoc` does, we would have to deal with these arbitrary mocks, which makes it seem even more pointless (I could be wrong though).

The conditional disabling of `beartype` during a `sphinx` run sounds like a good solution to me, provided we're sure that we never want to run it during a `sphinx` run. I'm not sure I can offer any advice on unit-testing it on the `beartype` side though.
> `beartype` paying dividends already.
You are a gentleman and a scholar and I wish you all the academic citations.
> here's a stack trace from a local docs build
zomg
Well, that escalated quickly. @beartype once generated simple tracebacks. Now even I cannot say where this codebase will go next.
how did it come to this
> The conditional disabling of `beartype` during a `sphinx` run sounds like a good solution to me, provided we're sure that we never want to run it during a `sphinx` run.
Yes. That is the proviso, isn't it? In theory, `beartype` could additionally attempt to detect whether `autodoc` is currently running and only reduce to a noop in that specific case. In practice, that sounds scary and fragile. I briefly glimpsed into the special hell that is the Sphinx Events API at `sphinx.events`. Five minutes later, I fled with the tattered remnants of my sanity and promised myself never to do that again.
Perhaps we'll just brute-force it for now with the crude two-liner shipped above. If and when `alibi` (or someone else) justifiably complains, we'll then rethink our stance on dangerous monkey-patching and invent something more robust and less breaky.
Would that satisfy short-term integration concerns – or should I think much harder about this than my sleep deprivation currently warrants? I think when I must, Janis. But I prefer not to. :wink:
@leycec That sounds like a plan! I'm not averse to monkey-patching in client code, but handling this upstream seems like the best solution, as I imagine the overlap between projects wanting to use `beartype` and those using `autodoc` is significant.
Super-duper! We'll force this face-melting workaround into our upcoming `beartype 0.9.1` point release to be released in just a week or two.
I'll also submit a descriptive StackOverflow question querying the community for a better, stabler, stronger long-term solution that selectively targets only `autodoc` rather than the entire Sphinx stack. Until then, thanks again for all the phenomenal enthusiasm for well-typed Python – and don't hesitate to bang on our issue tracker if @beartype ever starts holding `alibi` back from achieving its projected world ML domination.
Sunny Sunday update: on reflection, none of the available approaches for detecting whether Sphinx and `autodoc` are currently running are satisfactory. They all invite non-portable, error-prone, and fragile edge cases that will blow up the @beartype codebase. We already have enough people shaking their clenched fists at us. Right? You know we do.
I've opened a Sphinx feature request kindly requesting a new utility function trivializing this work for us and everyone else. Come leap onto the friendly dogpile! :dog2:
We've now resolved `beartype`'s prior frustration with Sphinx's `autodoc` extension. Oh, yes; it's happening. @beartype now conditionally detects when it thinks `autodoc` is running and silently reduces itself to a noop. This has been extensively tested on our end with a surprisingly robust functional test that passes everywhere except Python ≥ 3.10, where Sphinx's continued ill-advised usage of `distutils` raises deprecation warnings. ...so not our problem.

I'll be officially releasing this tomorrow with `beartype 0.9.1`. This is critical stuff. It is time.
Thanks yet again for your generous consideration, @jklaise. Whether `alibi` elects to continue hardening itself with `beartype` or opt instead for its crass older brother `pydantic`, you've helped us patch up a critical chink in our breastplate. I wish everyone all the best with @SeldonIO... and beyond!
`beartype 0.9.1` has landed, and with it `autodoc` integration. Praise be to `alibi` and @jklaise.
@ascillitoe @arnaudvl the discussion regarding `float` support needs some more careful consideration.

Currently, Python treats `int` as an acceptable substitute for `float`, hence tools like `mypy` don't complain. This is not the case for e.g. `beartype`, which is strict, for good reason: https://github.com/beartype/beartype/issues/66 (WARNING: rabbit-hole). See also the test-case failures on this PR where we pass `int` to `float`-hinted functions en masse.
So we're faced with a choice:
1. Keep implicitly supporting `int` arguments to `float`-hinted functions/methods
2. Be strict and only support `float` arguments to `float`-hinted functions/methods
Some discussion on pros/cons:
1. Keep implicitly supporting `int` arguments to `float`-hinted functions/methods
    - If going with a strict checker like `beartype`, we would need to update type hints to something like `Union[int, float]` or `numbers.Real` (or, if going down the above-mentioned rabbit-hole, some other type monstrosity)
    - There are some considerations when attempting to mingle `int`s with `tensorflow`/`pytorch` computation graphs that expect `float` (in fact, they would expect `float32`). I believe at this point in time most such operations implicitly apply casting to the user input (e.g. `int`->`float32` or even `float`->`float32`) such that there are no errors, but there may be edge cases?
    - If the above point is not an issue anymore and similar libraries implicitly accept `int`s for `float`s, then this approach seems most sensible
2. Be strict and only support `float` arguments to `float`-hinted functions/methods
    - This has less potential for ambiguity, and maybe less concern for us in dealing with implicit/explicit casting of `int` to `float`
    - This is a breaking change in that every script/config file happily using `int` where `float` is expected would break (see also the test cases failing on this PR).
Personally I'm torn between the two approaches, although option 1 has a slight advantage, primarily because we wouldn't be breaking any existing code, and it also seems that accepting `int` for `float` is fairly standard in similar libraries.
My Fortran upbringing pushes me towards option 2; I still can't help writing `a=42.` etc. instead of `a=42`! I take your point though that breaking changes aren't ideal...
@ascillitoe I'm with you on that! However, I think I'm leaning towards Option 1 as it's more standard within Python (including scientific libraries!) and would also not break existing things or frustrate new users.
I had an idea which, it turns out, is actually very bad (so let's not do it), but for posterity: we could conditionally use `beartype` if it exists in the environment and even turn it on/off via an environment variable (https://github.com/posita/numerary does both things). That way `alibi` users could turn it on and off (off by default) whereas `mlserver` would turn it on by default. This sounds good initially, but it's actually bad because someone developing with `alibi` won't be able to deploy things using `mlserver` without changing their code/config if they've used the more lax types.
@jklaise agreed on all points above. As much as it would be nice to lead the way on being more strict on `float` vs `int` here, the "un-pythonic" nature of this would probably be overly frustrating for users. Perhaps going with option 1 and slightly more convoluted type hints isn't so bad... We already have complexity in type hints for other things anyway, for example `Union[str, os.PathLike]` for filepaths, and `tensorflow` vs `pytorch` model stuff etc.
There's a slight issue with `ForwardRef`s here. I had to adjust type hints like `encoder: "Union[tensorflow.keras.Model, torch.nn.Module]"` to `encoder: Union["tensorflow.keras.Model", "torch.nn.Module"]` to make them `beartype`-friendly; however, this messes up the API docs and these hints only show up as `ForwardRef`. This may be another reason for looking into #471.
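For concreteness, the two spellings side by side (a sketch; the function names are made up, and neither framework needs to be importable for these annotations to be declared):

```python
from typing import Union

# beartype-unfriendly: the whole union is one deferred string annotation
def encode_v1(encoder: "Union[tensorflow.keras.Model, torch.nn.Module]"): ...

# beartype-friendly, but rendered by autodoc as bare ForwardRefs
def encode_v2(encoder: Union["tensorflow.keras.Model", "torch.nn.Module"]): ...
```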
It appears that `beartype` also doesn't like non-strict optional handling, i.e. `k: int = None`. I guess it doesn't hurt to be explicit everywhere, i.e. `k: Optional[int] = None`, though it does add a fair bit of clutter.
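The two forms in question, as a sketch (the function names are invented for illustration):

```python
from typing import Optional

# implicitly optional: rejected by beartype (and discouraged by PEP 484)
def topk_implicit(k: int = None): ...

# explicit and beartype-friendly, at the cost of a little clutter
def topk_explicit(k: Optional[int] = None): ...
```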
There are more serious issues with handling test cases where we mock certain objects, e.g. see this: https://github.com/SeldonIO/alibi/blob/41d348e3b694fc29f72369f3bc8d91c035ffa3d4/alibi/explainers/tests/test_shap_wrappers.py#L1306-L1308 Currently I don't see how we can retain mocks in test code whilst making sure `beartype` runs too. There may be a way to do mocking in a way that doesn't violate return types that I'm not aware of.
Thought the fix might be as simple as passing `return_type=Explanation` to `patch.object`, but it is not that simple as it turns out.
Butterfingers, the correct kwarg is `return_value` not `return_type` (which unhelpfully gets swallowed). Now fixed via passing a dummy `Explanation(dict(), dict())` object as a return value to the patched method.
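A sketch of the resulting pattern (assuming `Explanation` lives in `alibi.api.interfaces`; the choice of `AnchorTabular.explain` as the patch target is illustrative, not the exact test in question):

```python
from unittest.mock import patch

from alibi.api.interfaces import Explanation
from alibi.explainers import AnchorTabular

# a genuine, if empty, Explanation object rather than a bare MagicMock
dummy = Explanation(dict(), dict())

# the patched method now returns a real Explanation, so beartype's
# return-type checks on callers are still satisfied
with patch.object(AnchorTabular, "explain", return_value=dummy):
    ...
```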
Codecov Report
Merging #511 (3ac9c81) into master (649e211) will increase coverage by `0.07%`. The diff coverage is `100.00%`.
```diff
@@ Coverage Diff @@
## master #511 +/- ##
==========================================
+ Coverage 82.60% 82.67% +0.07%
==========================================
  Files 77 77
  Lines 10507 10554 +47
==========================================
+ Hits 8679 8726 +47
  Misses 1828 1828
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| alibi/explainers/ale.py | 60.89% <100.00%> (+0.58%) | :arrow_up: |
| alibi/explainers/anchor_image.py | 93.29% <100.00%> (+0.12%) | :arrow_up: |
| alibi/explainers/anchor_tabular.py | 78.92% <100.00%> (+0.19%) | :arrow_up: |
| alibi/explainers/anchor_text.py | 87.11% <100.00%> (+0.09%) | :arrow_up: |
| alibi/explainers/cem.py | 79.48% <100.00%> (+0.23%) | :arrow_up: |
| alibi/explainers/cfproto.py | 75.68% <100.00%> (+0.14%) | :arrow_up: |
| alibi/explainers/cfrl_base.py | 88.21% <100.00%> (+0.12%) | :arrow_up: |
| alibi/explainers/cfrl_tabular.py | 60.60% <100.00%> (+1.23%) | :arrow_up: |
| alibi/explainers/counterfactual.py | 80.20% <100.00%> (+0.26%) | :arrow_up: |
| alibi/explainers/integrated_gradients.py | 89.13% <100.00%> (+0.08%) | :arrow_up: |
| ... and 3 more | | |
</phew>
Justifiably feels like I'm personally to blame for the migraine now debilitating Janis as we speak. I didn't actually realize that mypy permissively accepts:

- PEP 563-style deferred annotations like `"Union[tensorflow.keras.Model, torch.nn.Module]"` in the absence of PEP 563 (e.g., `from __future__ import annotations`). That's... fascinating. And horrifying! But we can probably do that too, because we already have a perfect PEP 563 resolver. Where mypy goes, `beartype` reluctantly follows. That said, does mypy actually support that? I couldn't find a discussion of that behaviour in PEP 484 – but that doesn't necessarily mean anything. Mypy does many non-standard things. I've learned to accept that.
- Implicitly optional hints (e.g., `k: int = None` rather than `k: Optional[int] = None`). We can definitely do that, too. Trivial, right? Sadly, Guido doesn't want us to:
> A past version of this PEP allowed type checkers to assume an optional type when the default value is None, as in this code:
>
> `def handle_employee(e: Employee = None): ...`
>
> This would have been treated as equivalent to:
>
> `def handle_employee(e: Optional[Employee] = None) -> None: ...`
>
> This is no longer the recommended behavior. Type checkers should move towards requiring the optional type to be made explicit.
We're shameless Guido sycophants over at @beartype. So, we'd better do what sempai suggests.
Thanks so much for sticking with this gruelling transition, Janis. If we can do anything to ease your considerable suffering here, please pound on our issue tracker and it shall be done.
The move to strict `Optional` checking as suggested by the ex-BDFL himself seems sensible, so I will prepare a commit (to be cherry-picked into master) that excises `strict_optional = False` from our codebase.
Now that the tests are all passing it's time to think about next steps. The following should be addressed still:
- [ ] Relax `Tuple` types to `Sequence` - this is essential to support specifying parameters from a config
- [ ] Decide if we want the type checking to be optional - I think we probably don't want this as it could lead to more headaches than necessary. The only reason we may consider it is to avoid breaking some existing application code, see next point
- [ ] Type strictness (somewhat related to points 1. and 2.), especially to do with `float`s. Previously I said that moving to strict `float` is in contrast with general Python practices and has some potential to break existing application code. Since then my opinion has changed somewhat, and now I think we should go ahead with the strict `float` requirement, for the main reason that it frees us from doing even more type casting (especially with `tensorflow` and `pytorch` sometimes being picky about `int` vs `float`). This would basically mean the following:
    - Any interaction with the `alibi` public API would require strict `float` (either coming from Python or read from a config file)
    - Any other code using `alibi` would need to ensure the type conversion is done on their side (e.g. imagine some 3rd-party processing function that returns integers which need to be interpreted as floats by `alibi`)
Keen to get your thoughts @ascillitoe @sakoush.
To put a long story even longer, here's a summary of some internal discussions as we wrestle with what we want to actually achieve here.
Lay of the land
Fundamentally, the usage of a runtime type-checker boils down to the following question: do we want alibi to

a. Define strict type requirements for its public interface and only accept those types?
b. Define type requirements, but cast provided arguments to the required types when possible?

As no surprise to anyone, `beartype` is perfect for (a) and `pydantic` (mostly) solves (b).

Now onto the considerations, and a toy example where we want `num: float` as an argument to the public interface.
Example 1
Using `beartype` and `num: float` typing. This is strict and non-permissive; if `num` is an `int` then that is not acceptable. The user/application developer is responsible for type correctness and parsing to the type expected by `alibi`. This is an example of (a) above.
Example 2
Using `pydantic` and `num: float` typing. This is non-strict and permissive; it encourages `float`, but it will attempt to cast non-floats to the expected type and succeed in many cases, e.g. if an `int` is passed. The user/application developer can be somewhat more lax about ensuring type correctness here (not desirable behaviour to encourage, but it may be necessary in some scenarios, e.g. `shape: Tuple` is not possible to define from within `json` as `json` has no concept of tuples, only lists, and to avoid an intermediate parsing layer in user space we would need to widen the alibi type to `shape: Sequence` and parse internally via `shape=tuple(shape)`). This is an example of (b) above.
Example 3
This one is interesting. Here we want to use `beartype` but accept broader types to get something similar to Example 2. So we say `num: numbers.Real`. This is still strict and non-permissive, but the type is broader, accepting both `int` and `float`. The important part: if the private API actually, truly requires a `float`, then `alibi` needs to cast it manually if we allow broader types in the public API. `beartype` is not a parsing library, unlike `pydantic`, so the onus is on us to cast. (This suggests that there are two categories of "broadened" type signatures - ones that truly can be either type internally, and ones that are only broadened for convenience on the user/application developer part, but where we still need to manually cast inside.) A sketch of this second category follows below.
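A sketch of the "broaden publicly, cast internally" pattern using the `shape` example from earlier (the class is illustrative, not actual alibi code):

```python
from typing import Sequence

from beartype import beartype


class ConfigurableExplainer:
    @beartype
    def __init__(self, shape: Sequence[int]):
        # the public API accepts any sequence of ints (json configs can only
        # express lists), but the internals truly require a tuple, so cast here
        self.shape = tuple(shape)


ConfigurableExplainer([1, 28, 28])  # a config-file list is accepted
ConfigurableExplainer((1, 28, 28))  # a tuple still works, of course
```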
Breaking existing user code now
It was already mentioned above, but just to reiterate: if we go with something like Example 1, this has some potential for breaking existing user flows, e.g. someone relying on passing `int`s when `float`s are expected due to the permissiveness of Python itself as well as our non-existent runtime type-checking. I've come around to thinking that this is not such a big deal™, as users should be using the types that are specified in the public API rather than relying on implicit type casting, even if it's Python-sponsored (note that this last sentence basically also dismisses solution (b) and the `pydantic` approach to implicit casting...).
On changing types
It was raised that being overly strict with types might lead to breaking changes if we want to change types in the future. This is interesting because there are multiple ways one could change types:
1. Broaden an existing type (e.g. `num: float` to `num: numbers.Real`). This is safe from breaking user code because we are just broadening a type. This also encourages a pattern of "start strict, broaden later", as it's much easier to broaden types than to narrow them.
2. Change an existing type (e.g. `num: float` to `num: int`). Now this is no longer safe from breaking user code, although in effect we've changed the parameter's meaning without changing its name, which is probably bad form. In any case, changing parameter names can be nicely deprecated without breakage by using some decorator magic; it remains to be seen if something similar can be achieved with changing types by some clever use of introspection. TODO
Other considerations
Other `pydantic` use cases
As already mentioned previously, `pydantic` is a nice tool for writing data models and parsing data that are guaranteed to satisfy them. We are very likely to use `pydantic` to formalize the return API of our `explain` methods in the future. Parsing here is especially useful because we might imagine a scenario where we read a serialized explanation `json` into its Python form, which would require, for example, parsing nested lists into `numpy` arrays, which `pydantic` can do.

There is also the principle of not introducing too many dependencies, but I don't think it's necessarily a good argument here; there's no particular reason why we couldn't use both `beartype` and `pydantic`, just for different reasons.
Performance and statistical type checking
Some questions were asked about `beartype`'s approach to statistical (read: non-deterministic) type checking for nested containers. As explained in their magnificent readme, this is due to the natural trade-offs of deep run-time type-checking in a dynamic language like Python on potentially the same object over and over again. That being said, maybe our arguments are never so complex, and it might be more important to validate these fully (this can also be possible with `beartype`, albeit at the cost of switching to validators, which is not as nice and standard as relying on PEP 484 type hints; native support may come in the future). As a result `beartype` prides itself on `O(1)` runtime type-checking.

On the other hand, `pydantic` does (afaik) deep type-checking. Again, this may not actually be an issue for the typical argument types we have.
Handling forward-refs and validators
As mentioned previously, the `pydantic` `validate_arguments` decorator currently supports neither of these.
More realistic types
The `num: float` example is good to illustrate the fundamental questions, but perhaps using more complex types from the public `alibi` API would reveal more. TODO
@jklaise: Wow! My profuse apologies for missing the entirety of your vigorous debate. I blame only my own laziness. Video games are the death of productivity yet I continue to play them. This must be what a vicious circle looks like.
So... I have possibly momentous things to say. Since this issue has been justifiably closed, I say these things into the illimitable void of GitHub with no expectation of any coherent reply by anyone – just as it should be:
- @beartype has explicitly supported the PEP 484-compliant implicit numeric tower (i.e., the implicit type expansion of `float` to `float | int` and of `complex` to `complex | float | int`) since @beartype 0.12.0 ...so, for a year or two, via our `beartype.BeartypeConf.is_pep484_tower` setting. Because I am lazy (see above), I let my docos do my talkin':
```python
# Import the requisite machinery.
from beartype import beartype, BeartypeConf

# Dynamically create a new @beartowertype decorator enabling the implicit numeric tower.
beartowertype = beartype(conf=BeartypeConf(is_pep484_tower=True))

# Decorate with this decorator rather than @beartype everywhere.
@beartowertype
def crunch_numbers(numbers: list[float]) -> float:
    return sum(numbers)

# This is now fine.
crunch_numbers([3, 1, 4, 1, 5, 9])

# This is still fine, too.
crunch_numbers([3.1, 4.1, 5.9])
```
- @beartype and Pydantic are BFFLs (Best Friends for Life)! Okay. We're not. We probably despise each other and fervently hope for an ignominious and flaming end to the other guy. Honestly, I'm more mystified by Pydantic than anything. I really liked where they were going, but the shift to Rust is an itchy head-scratcher. Nonetheless, @beartype and Pydantic are both PEP-compliant type checkers and thus mutually compatible. We happily coexist. We're so different from one another that we don't even have much to say to one another. In fact, this leads us directly to...
- `beartype.claw`. In our upcoming @beartype 0.15.0 release ...to be released shortly, for certain definitions of "shortly", @beartype will become the first usable hybrid runtime-static type-checker (i.e., something altogether more powerful than traditional runtime type-checkers like Pydantic or traditional static type-checkers like mypy and `pyright`). The hype train that will surely disappoint everyone is boarding.
In short, our new `beartype.claw` API will enable everyone to subject one or more arbitrary packages (including literally all packages) to both runtime and static type-checking with a single line of code at runtime. If none of that makes sense, it shouldn't, because none of that meaningfully existed before @beartype 0.15.0. Consider this `alibi`-friendly example:
# In your "alibi.__init__" submodule:
from beartype import BeartypeConf
from beartype.claw import beartype_this_package
# This single line of code installs an import hook subjecting the entirety of alibi to
# both runtime and static type-checking at runtime. Like standard static type-checkers,
# this type-checking has no associated runtime costs. Unlike standard static type-checkers,
# this type-checking is enforced at runtime. Mypy and pyright are ignorable. This isn't.
beartype_this_package(conf=BeartypeConf(is_pep484_tower=True))
# In another arbitrary "alibi.some_module" submodule:
good_int: int = 0.30 # <-- this is fine, because you've enabled the implicit numeric tower
bad_int: int = "Pretty sure this isn't an integer." # <-- this is definitely not fine
What will happen when the user imports the fake `alibi.some_module` submodule I just awkwardly made up? A runtime type-checking violation, because @beartype statically analyzed the value assigned to the `bad_int` global as violating its type hint `int`. @beartype does this at import time via the aforementioned import hook installed by the initial call to the `beartype_this_package()` function. Note also the distinct lack of any explicit calls to anything from @beartype other than that initial call to `beartype_this_package()`.
That's it. A single line of code just type-checked the entirety of alibi at runtime, which nobody can ignore.
All of everything above currently works as advertised. Admittedly, I should just release all that already with @beartype 0.15.0. The perfect became the enemy of the good and I can't stop hunting obscure bugs submitted by my favourite users. This must be what it feels like when bear-cats cry. :crying_cat_face:
@leycec great to hear from you! I've been following @beartype developments pretty closely and am becoming ever more impressed by what you've accomplished. I was aware of the support of the PEP484 implicit numerical tower (much appreciated as the word of Guido and thus user expectations appear to be law). I'm also very excited about the upcoming import hook and have been following some of the implementation discussion, here's hoping 0.15 lands soon™.
Whilst I have your attention, something I've been meaning to ask on @beartype, do you have some thoughts on the philosophy of the application of @beartype in applications vs libraries and in public APIs vs private ones? I've been thinking something along these lines:
- as an application developer things are quite straightforward, I can choose to implement @beartype gradually or all at once in my application to act as a sort of user input validation layer for public APIs and a sort of internal consistency layer for private APIs (if I'm feeling skittish, I might only implement @beartype in my test suite)
- as a library developer things are a bit less clear to me:
  - once `beartype_this_package` comes out, it seems a no-brainer to run it in a test suite to catch both incorrect tests and reveal some deficiencies in the implementation (as I had discovered further up this thread already)
  - but shipping @beartype to users is a big step up - using it on public APIs would make it a de-facto validation layer for user input, which is probably good in the wild west of runtime Python and means we don't have to maintain heaps of type validation code. But shipping it for private APIs too is less clear - if an error does occur, it would point to an implementation error that we missed in tests, and for the user the error might not be clear enough coming from library internals rather than the public API
- suppose we ship @beartype to users as our de-facto validation layer for user inputs. Now although I have nothing against documenting that our public API raises `BeartypeException` for misformatted input, might it be possible to wrap these errors in library-specific exceptions? I.e. we might want to raise something like `AlibiTypeError` from our public API whilst the originating `BeartypeException` is lurking beneath the surface as an implementation detail.
Aww! Thanks for such kind GitHub words and the good summertime vibes, @jklaise. You are following us closer than even I suspected. Unsurprisingly, you know everything already and are at least five steps ahead.
Let's see if I can answer anything to anyone's satisfaction:
> ...do you have some thoughts on the philosophy of the application of @beartype in applications vs libraries and in public APIs vs private ones?
Yes. Use @beartype everywhere. That is the philosophy. :smile:

Admittedly, that was a trash answer. Let's see if I can clarify what @leycec would do in these diverse use cases that demand that something useful be done:
> ...as an application developer things are quite straightforward, I can choose to implement @beartype gradually...
</nods>
Absolutely. The gradual way would probably be to incrementally, laboriously, and manually continue decorating all callables and classes of interest with the `@beartype` decorator. I myself no longer have time, money, or patience for the gradual way, because "gradual" really does mean graaaaaaaaadual. Manual decoration across million-line codebases is truly a Sisyphean chore that cripples my shoulders and wrists with arthritic RSI. `beartype.claw`: save my body now.
Admittedly, I don't have money either way. But at least I'm more likely to have money if I devote scarce resources to tasks other than `@beartype` boilerplate. Maybe. Hopefully. That's the idea, anyway. Don't fail me now, capitalism!
> ...or all at once in my application.
</nods_nods>
Absolutely. This is the `beartype.claw.beartype_all()` way – an import hook that subjects an entire app stack (i.e., all packages imported under the active Python interpreter, including both modules and packages in Python's standard library and third-party modules and packages) to hybrid runtime-static type-checking by @beartype. Let's pretend that works as advertised.
> once `beartype_this_package` comes out, it seems a no-brainer to run it in a test suite to catch both incorrect tests and reveal some deficiencies in the implementation
</nods_nods>
This is the way. When you're five steps ahead, you're five steps ahead.
> but shipping @beartype to users is a big step up - using it on public APIs would make it a de-facto validation layer for user input which is probably good in the wild west of runtime Python and means we don't have to maintain heaps of type validation code. But shipping it for private APIs too is less clear - if an error does occur, it would point to an implementation error that we missed in tests and for the user the error might not be clear enough coming from library internals rather than the public API
...heh. You are not wrong, of course. I'm pretty sure you rarely are.
Thankfully, since @beartype acts entirely at runtime, you have the full array of Python's might at your disposal. One sly approach to dealing with type validation woes (both expected and unexpected) might be to expose a simple public configuration API at the `alibi` package level, enabling users to conditionally enable or disable runtime type-checking across your entire package as they see fit. If you're so inclined, a package-specific environment variable would do the trick:
# In your "alibi.__init__" submodule:
from os import environ as _environ
# If the ${ALIBI_IS_BEARTYPED} shell environment variable is truthy...
if _environ.get('ALIBI_IS_BEARTYPED', True): # <-- True by default for maximal truth
# Try to @beartype yourself. Since @beartype would then probably be just an
# optional dependency of alibi, you'd rather silently ignore the lack of
# @beartype here (than raise a fatal exception that everyone would hate).
try:
from beartype import BeartypeConf
from beartype.claw import beartype_this_package
beartype_this_package(conf=BeartypeConf(is_pep484_tower=True))
except:
pass
Ever more elaborate schemes are also at your beck and call – from `alibi`-specific configuration file settings in `pyproject.toml` to a full-blown `alibi` plugin API implemented with a confusing and unmaintainable object-oriented hierarchy of mixins, metaclasses, and dangerously dynamic behaviour. Personally, I'm a fan.

Ultimately, the ultimate power is in your hands. Praise be to `alibi`.
> ...we might want to raise something like `AlibiTypeError` from our public API whilst the originating `BeartypeException` is lurking beneath the surface as an implementation detail.
Yes! So much yes! This is an excellent feature request. I've been intending to do that for a few years. Then someone submits a competing feature or stupefying issue, I'm suddenly off to the races, and I fail to do anything about this.
Ideally, @beartype should expose a new `BeartypeConf.violation_cls` setting enabling wonderful downstream consumers such as yourself to substitute your own package-specific exception class in lieu of our default `beartype.roar.BeartypeCallHintViolation` class: e.g.,
```python
beartype_this_package(conf=BeartypeConf(
    is_pep484_tower=True,
    violation_cls=AlibiTypeError,
))
```
Gently prod me with a grubby stick if I fail to do something about this, which I will. I'm just that kinda GitHubber. :face_exhaling: