trio Clarify docs on checkpoints in async iterators

https://github.com/python-trio/trio/blob/077e8fc3f1634b42ef5024814898439d8a8430d4/docs/source/reference-core.rst#L92-L95

Does this mean exactly one checkpoint after the last iteration, or at least one? I'm pretty sure it's the latter but that's not entirely obvious from reading, and adding one of those specifiers would remove any doubt🙂

Aug 02 '22 08:08 Zac-HD

I think what I was actually going for was

each pass through the loop has at least one checkpoint
if the iterator is empty, there's at least one checkpoint

b/c the idea is that if you see async for you should feel confident that some checkpoints are happening in this code.

We definitely don't put an upper bound on how many checkpoints can happen anywhere; that's not a user-observable fact about any function call.
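
A minimal sketch of that guarantee, not Trio's own implementation. To stay free of third-party dependencies it runs on asyncio, and the counting checkpoint() helper, iterate(), consume(), and run() are hypothetical names used only for illustration. In real Trio code the checkpoint would be trio.lowlevel.checkpoint() and the program would run under trio.run().

```python
import asyncio

checkpoint_count = 0  # hypothetical counter, for illustration only


async def checkpoint():
    """Stand-in for trio.lowlevel.checkpoint() that counts its calls."""
    global checkpoint_count
    checkpoint_count += 1
    await asyncio.sleep(0)  # yield control to the event loop once


async def iterate(values, extra=0):
    # Checkpoint up front, so even an empty iterable hits one.
    await checkpoint()
    for value in values:
        yield value
        # At least one checkpoint per pass; `extra` adds more on top.
        for _ in range(1 + extra):
            await checkpoint()


async def consume(values, extra=0):
    """Return the items seen and how many checkpoints ran."""
    global checkpoint_count
    checkpoint_count = 0
    items = [item async for item in iterate(values, extra)]
    return items, checkpoint_count


def run(values, extra=0):
    return asyncio.run(consume(values, extra))
```

Under these assumptions, run([]) reports one checkpoint, run(["a", "b"]) reports three, and run(["a", "b"], extra=2) reports seven while returning the same items. Only the lower bound can be relied on.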


Aug 02 '22 20:08 njsmith

What about the following example? It's OK according to your comment, but not according to the docs:

import trio

async def aiterable():
    yield "some value"
    await trio.sleep(0)  # the only checkpoint, hit after the value is yielded
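
To see why this example matches the comment but not the documented wording, here is a rough trace of the order of events. It is a sketch only: asyncio.sleep(0) stands in for trio.sleep(0) so that it runs without Trio installed, and the log parameter, trace(), and run_trace() are hypothetical additions.

```python
import asyncio


async def aiterable(log):
    yield "some value"
    log.append("checkpoint")  # recorded just before yielding to the event loop
    await asyncio.sleep(0)    # stand-in for trio.sleep(0)


async def trace():
    """Record the order of loop-body runs and checkpoints."""
    log = []
    async for value in aiterable(log):
        log.append(f"body: {value}")
    return log


def run_trace():
    return asyncio.run(trace())
```

The trace has the loop body first and the checkpoint second: the pass through the loop is followed by a checkpoint, but nothing precedes the first iteration.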

Aug 02 '22 20:08 Zac-HD

The docs are talking about code provided by trio. Obviously trio doesn't have a say about your own code having checkpoints or not.

Aug 04 '22 08:08 agronholm

Going by Nathaniel's comment, I think the docs should be changed to reflect the lack of an upper bound. I'm not sure how the number of checkpoints actually affects a piece of code [1], but the docs should be consistent.

edit: [1] What I mean is, as far as I know, the code can't tell whether there have been one or five checkpoints between when it was forcibly yielded by trio (checkpoint) and when it is run again.

Aug 04 '22 13:08 Sxderp

Could you clarify? What in the docs does not reflect the way it works? (I have no idea what you mean by "when it was forcibly yielded by trio".)

Aug 04 '22 15:08 agronholm

From the docs on checkpoints.

It’s a point where Trio checks for cancellation.
It’s a point where the Trio scheduler checks its scheduling policy to see if it’s a good time to switch to another task, and potentially does so.

I'm talking about point 2. When trio switches tasks it's the same as forcibly yielding CPU to another task. My wording was a poor way to say "schedule something else".

If your task hits a checkpoint, I do not see how it can tell the difference between hitting one checkpoint, with one other task scheduled / run, and hitting multiple checkpoints, with many tasks scheduled / run. To the current task this is invisible, is it not?

Whether one checkpoint is hit or several, the result is the same: the task is rescheduled to run at a later, indeterminate(ish) time.

Despite the number of checkpoints being invisible to any particular task I think the docs should reflect the lack of an upper bound (state "at least").

Aug 04 '22 17:08 Sxderp

I'm sorry but I still don't have the faintest clue on what you're talking about. What is this "upper bound" you're talking about? And why is point 2 wrong? It seems quite correct to me. Why would a task have to care about how many checkpoints it's hitting, so long as it's hitting at least one without consuming a lot of CPU time first?

Aug 04 '22 19:08 agronholm

Why would a task have to care about how many checkpoints it's hitting, so long as it's hitting at least one without consuming a lot of CPU time first?

Correct, and we agree about this (at first I thought you didn't, which is why I rambled on, sorry). However, even though the task doesn't care how many checkpoints occur, the documentation should still be clear about how many to expect.

My issue, and the OP's issue, is the semantics of the written documentation in regard to iterators. Re:

... then there will be at least one checkpoint before each iteration of the loop and one checkpoint after the last iteration.

As worded, this puts an upper bound of exactly one checkpoint after the last iteration. While a task shouldn't care whether there are one, two, five, or ten (we agree on this), the documentation should be consistent in what it says. In Nathaniel's initial response he specifically said "at least one", and I believe the documentation should be updated to say "at least one".

Aug 04 '22 19:08 Sxderp

Ok, so if I translate the documentation into an example, it should look like this:

from trio.lowlevel import checkpoint

async def iterate(values: list):
    await checkpoint()  # runs even if values is empty
    for value in values:
        yield value
        await checkpoint()  # or several of these

I think I finally understood this – a simple wording issue. Thanks for taking the time to explain 😄 The documentation was probably worded that way because trio's own async iterators don't have multiple checkpoints at the end (no, I haven't actually checked), but I agree that the wording should likely be updated.

Aug 04 '22 20:08 agronholm
