First draft of an update to the Overloads chapter (DRAFT: DO NOT MERGE)
- Attempts to clearly define the algorithm for overload matching.
- Describes checks for overload consistency, overlapping overloads, and implementation consistency.
We typically wait for a proposed spec change to be accepted by the TC prior to writing conformance tests. In this case, I think it's advisable to write the conformance tests prior to acceptance. This will help us validate the proposed spec changes and tell us if (and to what extent) these changes will be disruptive for existing stubs and current type checker implementations.
I would normally volunteer to write the conformance tests, but in this case I think it would be preferable for someone else to write the tests based on their reading of the spec update. If I write the tests, there's a real possibility that they will match what's in my head but not accurately reflect the letter of the spec. There's also a possibility that I'll miss some important cases in the tests. If someone else writes the tests, they can help identify holes and ambiguities in the spec language.
Is there anyone willing to volunteer to write a draft set of conformance tests for this overload functionality? I'm thinking that there should be four new test files:
- overloads_definitions: Tests the rules defined in the "Invalid overload definitions" section
- overloads_consistency: Tests the rules defined in the "Implementation consistency" section
- overloads_overlap: Tests the rules defined in the "Overlapping overloads" section
- overloads_evaluation: Tests the rules defined in the "Overload call evaluation" section
If this is more work than any one person wants to volunteer for, we could split it up.
I am willing to work on conformance tests for this, but I probably can't get to it until the core dev sprint, Sept 23-27. I realize that implies a delay to moving forward with this PR. Happy for someone else to get to it first.
I am willing to work on conformance tests for this
Just wanted to note that I haven't forgotten this, it remains on my TODO list, I just wasn't able to get to it during the core dev sprint and haven't been able to prioritize it quite yet :/ But I still plan to do it. I also remain open to anyone else interested picking it up, just comment here first :)
I've completed the first set of tests (for the "Invalid overload definitions" section of the spec). I just realized that I named it overloads_invalid.py, where you had suggested overloads_definitions.py -- let me know if you feel strongly about this naming, and I can change it.
It's slow going adding these tests, because running python main.py is slow, and I typically want to check the type checker behavior against each test, then update each of the four result toml files, then sometimes tweak the expectations of the test and run it again...
One other limitation of the conformance suite that I've observed is that sometimes the rules differ for stub files vs normal files, but as far as I can see the conformance suite tooling doesn't support a stub file as an actual test file, only as an importable resource for a non-stub file. Has there been prior discussion of lifting this limitation?
the conformance suite tooling doesn't support a stub file as an actual test file
I don't think this has been a requirement yet. The test infrastructure can be updated if necessary.
I believe I've completed the test suite, with reasonably good coverage of everything specified as a "should". I intentionally avoided adding tests either way for behaviors specified as a "may".
I also added the capability to have stub test files in the conformance suite, and added overloads_definitions_stub.pyi, since the rules for valid overload definition are significantly different in a stub file.
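For illustration, here's a minimal sketch of the kind of difference involved (my own example, not copied from the test file): in a regular .py file an @overload series must be followed by an implementation, whereas in a .pyi stub the implementation is omitted and the @overload-decorated declarations alone form a complete definition.

from typing import overload

@overload
def f(x: int) -> int: ...
@overload
def f(x: str) -> str: ...
def f(x: int | str) -> int | str:  # required in a .py file; omitted entirely in a .pyi stub
    return x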
I aimed to write tests that reflect the specification as it currently exists in this PR, to help illuminate where type checkers currently do and don't conform to this spec. I commented inline on some points where I wonder if we should adjust the spec.
@carljm, thanks for doing this! I'll try to find time next week to review your test code and update the draft spec if your tests uncovered any areas of ambiguity or lack of clarity.
@carljm, thanks for writing the test suite. I reviewed it, and it looks good to me. One thought is that we may want to cover more cases for rule 6 in the overload evaluation algorithm. I know there are cases where mypy and pyright diverge here, and it would be good to suss out some of those divergences and determine the "correct" answer based on the wording in the proposed spec.
I didn't want to write these tests myself because I thought it would be beneficial for someone else to write the tests with a critical eye toward the proposed spec — to look for holes and areas of ambiguity. Other than the points you raised above, did you find any other areas of concern about the proposed spec? If not, would you be in favor of advancing it as a "final" draft for consideration by the full typing council?
I'm curious what specific additional scenarios for step 6 you'd want to test. I think I remember looking at it and not coming up with anything that wasn't already covered in the existing tests. But at that point I was probably fading, so I may well have missed something. Definitely not opposed to adding more tests; maybe you can add them if you have ideas?
I don't think I have areas of concern besides those mentioned above. I think the main thing I realized is that it would ideally be better for clarity of the overload spec if we had a clear spec for call-checking in the first place, since overloads depend on call-checking, and thus this spec currently implicitly depends on some assumptions about behavior of call-checking that aren't specified.
But I don't think that's a reason to delay landing this spec change; it's clearly an improvement on the status quo.
I also think that we should explicitly note in the spec text that the separation of step one (arity checking) and step two (full argument matching) into separate elimination passes is recommended for performance reasons and may make it easier to understand the algorithm, but makes no observable semantic difference (assuming that return types of erroring calls are treated as meaningless, as you advocated above). I think this clarification is useful for helping readers understand the spec. It was a point of confusion for me and I only got clarity on it by working with example cases for awhile.
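To make that concrete, here is a small example of what I mean (my own illustration, not proposed spec text):

from typing import overload, assert_type

@overload
def f(x: int) -> int: ...
@overload
def f(x: str, y: str) -> str: ...
def f(x, y=None):
    return x

# For the call below:
# Pass 1 (arity): the first overload is eliminated because it cannot accept
#   two positional arguments.
# Pass 2 (argument matching): the remaining overload accepts (str, str) and
#   is selected.
assert_type(f("hi", "there"), str)
# Performing both checks in a single pass would select the same overload,
# which is the point: the split is an implementation strategy, not an
# observable behavior.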
this spec currently implicitly depends on some assumptions about behavior of call-checking that aren't specified.
Yes, I agree. Definitely more work to do still.
I also think that we should explicitly note in the spec text that the separation of step one (arity checking) and step two (full argument matching) into separate elimination passes is recommended for performance reasons
I'd prefer not to do this. Such a statement seems out of place for a spec. As you've pointed out before, the spec should describe the behaviors, not the implementation. In this case, the behavior is sufficiently complex that it requires us to use a multi-step process. If a type checker implementer wants to combine some of the steps when implementing this support and can prove that the resulting behavior is identical to the behavior described in the spec, that is fine. That's always the case for any behaviors described in the spec.
What is the type of an overloaded function? When can it be assigned to a Callable, and the other way around?
This is already covered in the typing spec here.
Is a Protocol with an overloaded call method an overloaded function type, or does it follow different rules?
This is already covered in the typing spec here.
When assigning an overloaded function to some Callable[Tss, R], then what happens to the individual signatures? Do they live within the Tss paramspec, and does R become causally dependent on Tss, i.e. as a type-mapping? Or does this assignment cause the overloaded type to change into a different function type without overloads?
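A minimal sketch of the scenario I have in mind (the identity wrapper here is purely illustrative):

from typing import Callable, ParamSpec, TypeVar, overload

Tss = ParamSpec("Tss")
R = TypeVar("R")

def wrap(fn: Callable[Tss, R]) -> Callable[Tss, R]:  # illustrative helper, not from the spec
    return fn

@overload
def f(x: int) -> int: ...
@overload
def f(x: str) -> str: ...
def f(x: int | str) -> int | str:
    return x

g = wrap(f)  # what do Tss and R capture here: all of f's signatures, or only one of them?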
This is not currently specified. It's out of scope for this PR.
Do I understand correctly that this spec allows decorating an overloaded function in .pyi stubs, as well? Because this is currently not supported in (at least) pyright.
This is already covered in the typing spec here. The short answer is, no, arbitrary decorators should not be used in stub files.
When assigning an overloaded function to some Callable[Tss, R], then what happens to the individual signatures? Do they live within the Tss paramspec, and does R become causally dependent on Tss, i.e. as a type-mapping? Or does this assignment cause the overloaded type to change into a different function type without overloads?
This is not currently specified. It's out of scope for this PR.
Ah ok. The scope of this PR wasn't all that clear to me after reading the PR description. So before I ask anything else: what exactly is within the scope of this PR?
It touches on multiple topics, which are of course all related, but it isn't all that obvious to me why it isn't required to also include typing.Callable or callable Protocol types when specifying the precise mechanics of overloads. They will likely also be directly affected by the choices that are made here. So I don't see why we shouldn't at least talk about those, so that we can avoid any unwanted (and currently unknown) potential consequences there.
What is the type of an overloaded function? When can it be assigned to a Callable, and the other way around?
This is already covered in the typing spec here.
Is a Protocol with an overloaded call method an overloaded function type, or does it follow different rules?
This is already covered in the typing spec here.
Ok I wasn't aware of these sections. Maybe it could help to link to them from this spec?
The scope of this PR is to describe:
- The way to perform overload matching — how to evaluate call expressions that target an overloaded function
- The way to check for overload consistency, overlapping overloads, and implementation consistency
These new sections rely upon definitions and concepts described elsewhere in the typing spec including assignability, materialization, type equivalency, enums, tuples, etc.
We try hard not to duplicate concept definitions in the spec because duplicate definitions will inevitably get out of sync and cause confusion. This section shouldn't need to say anything about assignability rules for callables or protocols because those are discussed elsewhere in the spec.
We've also endeavored to make concepts in the typing spec as orthogonal and composable as possible. If you see cases where concepts are not composing, those are cases that we should discuss.
How should subtyping overloaded functions behave?
I think there is a simple intuition, which is that f is a subtype of g if every valid call to g is also a valid call to f. If we follow that intuition, then I would expect the following examples to pass:
Example 1: Subtyping between overloaded callback Protocol and Callable type
from typing import Protocol, Callable, overload, assert_type
class P(Protocol):
    @overload
    def __call__(self, x: int, y: str, z: int) -> str: ...
    @overload
    def __call__(self, x: int, y: int, z: int) -> int: ...

def check_expand_union_callable(v: int | str, f: P) -> None:
    g: Callable[[int, int | str, int], int | str] = f
    ret = g(1, v, 1)
    assert_type(ret, int | str)
Example 2: Subclass/sub-protocol consistency
from typing import Protocol, overload, assert_type
class Base(Protocol):
    def m(self, x: int, y: int | str, z: int) -> int | str: ...

class Derived(Base, Protocol):
    @overload
    def m(self, x: int, y: str, z: int) -> str: ...
    @overload
    def m(self, x: int, y: int, z: int) -> int: ...

def check_expand_protocol(v: int | str, base: Base, derived: Derived) -> None:
    ret_base = base.m(1, v, 1)
    assert_type(ret_base, int | str)
    ret_derived = derived.m(1, v, 1)
    assert_type(ret_derived, int | str)
For what it's worth, I think it's reasonable to restrict the union expansion to calls. Union expansion is complicated and has concerning performance implications. My understanding is that it is included in the spec because people need it, but if type checkers do not currently implement these rules for subtyping purposes, maybe that's evidence that this need does not extend to subtyping, and we can forgo the rule there entirely?
I agree that type expansion shouldn't affect (infect) subtyping rules. Assignability rules for overloaded callables are already defined in the spec here.
What you're pointing out here is a small inconsistency between call evaluation behavior and the subtyping behavior. This was also recently pointed out by @hauntsaninja in this pyright issue.
We could look at amending the assignability rules to eliminate this inconsistency, but this would come at a big price — in terms of complexity and performance. My sense is that it wouldn't be a good tradeoff.
I was careful in this PR to talk about type expansion only in the context of "argument types". Arguments are applicable only to calls.
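For example, here is a small sketch of argument type expansion operating only at a call site (my own example, relying on the union-expansion step proposed in this PR):

from typing import overload, assert_type

@overload
def f(x: int) -> int: ...
@overload
def f(x: str) -> str: ...
def f(x: int | str) -> int | str:
    return x

def use(v: int | str) -> None:
    # The union-typed argument is expanded into int and str; each element
    # matches one overload, and the call evaluates to the union of the
    # matched overloads' return types.
    assert_type(f(v), int | str)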
While looking at overload resolution for Pyre, I noticed an ambiguity in the spec for Step 2. Specifically, what should we do with call argument expressions which have "internal errors"? That is, evaluating the expression leads to an error, but the expression still results in a type. For example f(g(x)) where g(x) is an error, but returns a type compatible with f.
A worked example:
from typing import overload, assert_type
@overload
def f(x: int) -> int: ...
@overload
def f(x: str) -> str: ...
def f(x: int | str) -> int | str:
    return x

def h(x: str) -> str:
    return ""

def g(x: str) -> int:
    return 0

# Call with an inner error (0 is not assignable to str), g returns int,
# so the first overload is selected
assert_type(f(g(0)), int)
x = g(0)
assert_type(f(x), int)

# Call with an inner error, h returns str, so the second overload is selected
assert_type(f(h(0)), str)
y = h(0)
assert_type(f(y), str)
Should we specify that overload selection in Step 2 is determined by errors in between the arguments and parameters of the overload signature -- not simply the presence/absence of errors on the call overall?
I noticed an ambiguity in the spec
When evaluating a call to an overloaded function, I think the behavior for "internal errors" should be the same as with calls to non-overloaded functions. That is, if there are type errors detected when evaluating argument expressions, those errors shouldn't affect the evaluation of the call expression itself. I think that's consistent with what you mean by "errors in between the arguments and parameters".
I'm not sure this matters though. Perhaps we can just leave this unspecified. My view is that in cases where a type error is detected and reported by a type checker, any downstream type evaluations that depend on that error are not covered by the spec. Type checkers will generally want to "limit the collateral damage" and reduce downstream false positives once the first error is detected, but I don't think we should try to mandate specific behaviors here. Ultimately, it's up to the user to fix the "inner error" type violation; once that's fixed, then the type checker can guarantee conformant behavior for dependent type evaluations.
If you can think of a situation where an "inner error" is detected but not reported to the user during overloaded call evaluation, this would be a bigger concern because it would effectively change the results of the overloaded call without the user realizing it.