trio icon indicating copy to clipboard operation
trio copied to clipboard

Add `CancelScope.relative_deadline`. Change `fail_after`&`move_on_after` to set deadline relative to entering

Open jakkdl opened this issue 1 year ago • 7 comments

~~Partially deals with #2512 by documenting current behavior.~~

This PR now works to fully resolve #2512 by adding a relative_deadline attribute to CancelScope, which is then used by fail_after and move_on_after. Upon entering the CancelScope the absolute deadline is then calculated as min(deadline, current_time() + relative_deadline).

This PR also reverts an old workaround in timeouts.py due to pycharm being unable to resolve @contextmanager for type checking.

TODO:

  • [ ] undo changes to/fix docstrings of fail_after and move_on_after
  • [ ] write docstring for relative_deadline
  • [ ] resolve None vs math.inf for no deadline
  • [ ] settle on behavior of relative_deadline after entering the cm. See https://github.com/python-trio/trio/issues/2512#issuecomment-2180329727

jakkdl avatar Jun 06 '24 08:06 jakkdl

Codecov Report

Attention: Patch coverage is 96.87500% with 4 lines in your changes missing coverage. Please review.

Project coverage is 99.58%. Comparing base (b834f73) to head (9961abc). Report is 33 commits behind head on main.

Files with missing lines Patch % Lines
src/trio/_core/_run.py 92.59% 2 Missing and 2 partials :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3010      +/-   ##
==========================================
- Coverage   99.60%   99.58%   -0.02%     
==========================================
  Files         121      121              
  Lines       17882    17992     +110     
  Branches     3214     3245      +31     
==========================================
+ Hits        17811    17917     +106     
- Misses         50       52       +2     
- Partials       21       23       +2     
Files with missing lines Coverage Δ
src/trio/_core/_tests/test_run.py 100.00% <100.00%> (ø)
src/trio/_tests/test_timeouts.py 98.46% <100.00%> (+0.81%) :arrow_up:
src/trio/_timeouts.py 100.00% <100.00%> (ø)
src/trio/_core/_run.py 99.02% <92.59%> (-0.37%) :arrow_down:

codecov[bot] avatar Jun 06 '24 09:06 codecov[bot]

(ignore the commit history, I created the branch in a messy way. Will squash the PR anyway)

jakkdl avatar Jun 06 '24 09:06 jakkdl

Per my comment on the issue,

I think it would make sense for trio.fail_after(x) to be relative to entering the context manager; trio.fail_at(trio.current_time() + x) seems a more natural way to express the relative-to-construction-time semantics. (and the same for trio.move_on_{at,after})

Either way, we should document the chosen semantics.

While I think that documenting this is a net improvement, it might also lead people to rely on the creation-time semantics for *_after in a way that makes changing that to enter-time semantics harder later. Would you be up for making that change and documenting it as a one-step PR?

Zac-HD avatar Jun 07 '24 18:06 Zac-HD

Per my comment on the issue,

I think it would make sense for trio.fail_after(x) to be relative to entering the context manager; trio.fail_at(trio.current_time() + x) seems a more natural way to express the relative-to-construction-time semantics. (and the same for trio.move_on_{at,after}) Either way, we should document the chosen semantics.

While I think that documenting this is a net improvement, it might also lead people to rely on the creation-time semantics for *_after in a way that makes changing that to enter-time semantics harder later. Would you be up for making that change and documenting it as a one-step PR?

Sure. Started thinking about different ways of implementing it in the issue: https://github.com/python-trio/trio/issues/2512#issuecomment-2157955978

jakkdl avatar Jun 10 '24 10:06 jakkdl

  1. See https://github.com/python-trio/trio/issues/2512#issuecomment-2180329727 for discussion on the behavior of relative_deadline after entering.
  2. I personally prefer None over math.inf quite a bit, but changing the behavior of CancelScope.deadline would be breaking for anybody accessing the value of the property in the inf/None case - so I should probably just change relative_deadline to also use inf instead of None.
  3. Need to write docstring for relative_deadline, but can probably mostly just take the stuff I wrote in the newsfragment.
  4. Need to update docstrings for move_on_after and fail_after to be explicit about deadline being relative to entering. (i.e. undo/invert what was originally changed in this PR).

The pycharm bug referred to in timeouts seem to be fixed https://youtrack.jetbrains.com/issue/PY-36444/PyCharm-doesnt-infer-types-when-using-contextlib.contextmanager-decorator and released over a year ago, so I think we can drop the workaround... except mypy now behaves slightly different in one test (?).

jakkdl avatar Jun 20 '24 20:06 jakkdl

RTD is failing because towncrier appends link to the github issue at the end of the newsfragments, but since we're inside a code block that leads to invalid python syntax. Trying to append newlines to the rst file doesn't help, they get stripped. Not sure the code blocks are really needed, could probably just move that stuff into the docstrings, but lets deal with that when we've decided on the precise implementation.

jakkdl avatar Jun 24 '24 15:06 jakkdl

I gave it a shot to instead implement this as a transforming class. The basic implementation is way cleaner, but the big downside is that it will break code that currently does

cs = trio.move_on_after(5)
with cs:
  cs.deadline = 7  # AttributeError: _RelativeCancelScope does not have an attribute `deadline`
# you need to instead write
with cs as cs2:
  cs2.deadline = 7

There's a couple ways to fix that, though it's gonna be somewhat ugly. Either properties/methods that mirror the properties/methods of CancelScope, or dunder magic with __getattr__/__setattr__. We can make them invisible to type checkers and raise warnings though, and error if the user tries to access deadline before the scope is entered (if timeout_from_enter=True).

jakkdl avatar Jun 26 '24 13:06 jakkdl

Wrote a section in the documentation and added _RelativeCancelScope. We might want to consider re-export it as a non-hidden class for the sake of making type hints more ergonomic. Updated newsfragments and docstrings to match current implementation.

The docs and docstrings wouldn't complain about a second pass to improve phrasing, but otherwise I think this is mostly ready to be merged.

jakkdl avatar Sep 05 '24 09:09 jakkdl

I have no clue why verifytypes is complaining about fail_at, and cannot repro it locally. But might want to restore the previous workaround, also because the documentation now displays the return type of the function as Generator[trio.CancelScope, None, None]

jakkdl avatar Sep 05 '24 09:09 jakkdl

(working on fixing the TODO for accessing deadline&shield on _RelativeCancelScope in case somebody is quick on reviewing)

jakkdl avatar Sep 05 '24 09:09 jakkdl

Perhaps somewhat overkill on transitional functions, but this should be very robust.

  • I feel like supporting cancel() on unentered relative cancel scopes is sort of overkill, but think it can be implemented fine - unless the time at which it's actually cancelled matters.
  • should shield raise AttributeError after entering, if called on the _RelativeCancelScope?

Opened https://github.com/microsoft/pyright/discussions/8905, think we're gonna end up ignoring it unless there's a quick fix released.

jakkdl avatar Sep 05 '24 11:09 jakkdl

The pyright --verifytypes issue has been fixed in https://github.com/microsoft/pyright/pull/8924 so just need to wait for version 1.1.380

jakkdl avatar Sep 09 '24 12:09 jakkdl

I'm sorry this has taken me so long to get to; vacation & illness took their toll! I'm aiming for a full review over the weekend, but here are some initial thoughts while I have them:

  • I think that having two transitions (add deprecation, remove deprecation) is worse for users than just switching immediately to the new behavior. Anyone who wants the previous behavior can already switch fail_after(n) to fail_at(current_time() + n) and have that backwards-compatible.
  • I have a fairly strong aesthetic preference to only have a CancelScope class, adding new is_relative: bool | None and relative_deadline attributes.
    • deadline and relative_deadline become properties, with setters that adjust a shared internal state in the obvious way - once the scope has been entered this is easy.
    • before the scope is entered, I'd make it an error to access the relative-deadline on a non-relative scope. Accessing the deadline on a relative scope before entry can raise a deprecation warning; assigning is deprecated and converts to a non-relative scope.
    • better for introspection in IDEs, avoids the reassignment issues above, avoids duplicating some tricky code that might get out of sync later, etc.
  • Assigning an inspect.Signature() object to the __signature__ attribute of a callable does the trick for Sphinx, and can be guarded by an if "sphinx" in sys.modules: to save the important latency the rest of the time. Neat trick to keep typecheckers and docs users happy 😉 (edit: to address https://github.com/python-trio/trio/pull/3010#issuecomment-2331056189 )

Zac-HD avatar Sep 12 '24 04:09 Zac-HD

I just realized:

my_move_on_at = trio.move_on_at(5)
my_move_on_after = trio.move_on_after(5)
my_fail_at = trio.fail_at(5)
my_fail_after = trio.fail_after(5)
print(my_move_on_at.shield) # works
print(my_move_on_after.shield) # works
# these give AttributeError: _GeneratorContextManager object has no attribute shield
print(my_fail_at.shield)
print(my_fail_after.shield)

due to fail_at & fail_after being implemented as wrapping context managers. If people start passing around relative cancel scopes this might pop up as a pain point.

jakkdl avatar Sep 12 '24 11:09 jakkdl

I'm sorry this has taken me so long to get to; vacation & illness took their toll! I'm aiming for a full review over the weekend, but here are some initial thoughts while I have them:

:heart:

  • I think that having two transitions (add deprecation, remove deprecation) is worse for users than just switching immediately to the new behavior. Anyone who wants the previous behavior can already switch fail_after(n) to fail_at(current_time() + n) and have that backwards-compatible.

The downside to this is it being a silent breaking change, and if a library starts relying on it they will have differing behavior depending on what trio version it runs against. So will definitely need a "major" version bump.

  • before the scope is entered, I'd make it an error to access the relative-deadline on a non-relative scope. Accessing the deadline on a relative scope before entry can raise a deprecation warning; assigning is deprecated and converts to a non-relative scope. This does resolve some of the instances of silent breakage, though not all. I'm not sure I think a DeprecationWarning is
  • better for introspection in IDEs, avoids the reassignment issues above, avoids duplicating some tricky code that might get out of sync later, etc.

This seems fine, and if end users really do want different typing between absolute and relative scopes they can get it with a protocol with is_relative: Literal[True] (which we could even add ourselves)

  • Assigning an inspect.Signature() object to the __signature__ attribute of a callable does the trick for Sphinx, and can be guarded by an if "sphinx" in sys.modules: to save the important latency the rest of the time. Neat trick to keep typecheckers and docs users happy 😉

For which issue was this?

I pushed a commit that changes the implementation, I have not updated all docs/docstrings/newsfragments pending further changes.

jakkdl avatar Sep 12 '24 12:09 jakkdl

edited my comment above to point to the possible docs issue. I wouldn't include the protocol, if anyone wants that they can define it downstream.

Zac-HD avatar Sep 12 '24 22:09 Zac-HD

The downside to this is it being a silent breaking change, and if a library starts relying on it they will have differing behavior depending on what trio version it runs against. So will definitely need a "major" version bump.

Yeah, not sure this is great. Even across a large version bump. ... Is there any actual use case for constructing move_on_after before the async with? If not, we could just warn about the deprecation for that specifically (check whether CancelScope was constructed too long before __enter__, no way to opt out), and not have to un-deprecate anything! (after a version or two, we could remove the deprecation)

A5rocks avatar Sep 13 '24 08:09 A5rocks

Annoyingly, I think there's no good reason to construct a move_on_after in advance yet: with the current semantics you can more clearly express the behavior with a move_on_at(current_time() + n). Unfortunately the whole reason we're doing this is that you might want to express something like "when you eventually send this request, fail if it takes more than five seconds".

I guess we could make a deprecation-only release and then un-deprecate it when the semantics change, but I think it's less disruptive overall to just switch to the new behavior 😕

Zac-HD avatar Sep 13 '24 19:09 Zac-HD

I guess we could make a deprecation-only release and then un-deprecate it when the semantics change, but I think it's less disruptive overall to just switch to the new behavior 😕

That's actually a great way to do this. Though if users want relative timeouts and realize this upon seeing the deprecation message, then they have no real alternative (other than restructuring things) which does suck. (... which is better than just silently being incorrect? but not as good as silently becoming correct -- the deprecation is for people who don't want relative timeouts).

... Alternatively, we could mention passing relative_timeout to CancelScope in the deprecation message and just hope that having to turn trio.move_on_after(5) into trio.CancelScope(relative_timeout=5) isn't very common. (not great, though stylistic things like that can always be caught by flake8-async once deprecation period is over).

A5rocks avatar Sep 13 '24 19:09 A5rocks

It just seems extremely strange for someone who doesn't want a relative timeout to use move_on_after(), so I still favor a one-step update where we just make that do the right thing. I don't think this will affect many users at all.

Good call on the linter though: we could have a lint rule against calling *_after() without immediately entering it, which only fires if the Trio version is older than whatever this patch ends up being. And then add a config option for trio_version (and anyio_version?), which defaults to whatever is installed.

Zac-HD avatar Sep 13 '24 21:09 Zac-HD

Yeah ok I agree that it's fine to do this without deprecation. It's not that big of a behavior change (your code runs for just a bit longer) and deprecation strategies range from counter-productive to very weird (making people use the relative_timeout parameter). And if the linter could catch old usages that would be very nice.

A5rocks avatar Sep 13 '24 22:09 A5rocks

Good call on the linter though: we could have a lint rule against calling *_after() without immediately entering it, which only fires if the Trio version is older than whatever this patch ends up being. And then add a config option for trio_version (and anyio_version?), which defaults to whatever is installed.

I think gating it on Trio version isn't great, because it's fairly easy to get mismatched versions between what your linter/ci setup is running (very easy if you're using pre-commit), and if you're supporting a range of trio versions you almost surely won't have different environments that lints against different versions of dependencies. So I'd go with a default-enabled check, that mentions that you should disable it if you only support Trio>0.x (and at some point we might default-disable that check) Opened https://github.com/python-trio/flake8-async/issues/290, we can have further linter-specific discussions in there.

jakkdl avatar Sep 16 '24 10:09 jakkdl

* Assigning an `inspect.Signature()` object to the `__signature__` attribute of a callable does the trick for Sphinx, and can be guarded by an `if "sphinx" in sys.modules:` to save the important latency the rest of the time.  Neat trick to keep typecheckers _and_ docs users happy 😉
  (edit: to address [Add `_RelativeCancelScope`. Change `fail_after`&`move_on_after` to set deadline relative to entering #3010 (comment)](https://github.com/python-trio/trio/pull/3010#issuecomment-2331056189) )

Neat! But I now realized we in fact probably shouldn't do this because it does actually matter: https://github.com/python-trio/trio/pull/3010#issuecomment-2346000138

The only way (other than adding a wrapping _FailingCancelScope :upside_down_face:) I can see to fix it is adding a parameter to CancelScope that makes it raise TooSlowError when canceled. But that's perhaps for another issue

jakkdl avatar Sep 16 '24 11:09 jakkdl

Once https://github.com/python-trio/flake8-async/pull/292 is merged I'll update the newsfragment to refer to ASYNC122, otherwise all docs/docstrings/etc should be good now.

jakkdl avatar Sep 18 '24 12:09 jakkdl