
`sleep()` hangs indefinitely when using a custom clock

Open fjarri opened this issue 1 month ago • 13 comments

Tested on 0.32 and the current master.

MRE:

import trio
import trio.testing


class FooMockClock(trio.abc.Clock):
    def __init__(self) -> None:
        self._clock = trio.testing.MockClock(autojump_threshold=0)

    def current_time(self) -> float:
        return self._clock.current_time()

    def deadline_to_sleep_time(self, deadline: float) -> float:
        return self._clock.deadline_to_sleep_time(deadline)

    def start_clock(self) -> None:
        self._clock.start_clock()


async def main():
    print(trio.current_time())
    await trio.sleep(2)
    print(trio.current_time())


if __name__ == '__main__':
    print("MockClock")
    trio.run(main, clock=trio.testing.MockClock(autojump_threshold=0))
    print("FooMockClock")
    trio.run(main, clock=FooMockClock())

Output:

MockClock
0.0
2.0
FooMockClock
0.0
< hangs indefinitely >

I expected the second run() to behave exactly the same as the first. What am I doing wrong?

Also, if I copy-paste trio.testing.MockClock verbatim to this script and use that, the execution crashes - may be related:

Traceback (most recent call last):
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 2813, in unrolled_run
    assert isinstance(runner.clock, _core.MockClock)
AssertionError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/bogdan/wb/github/nucypher-async/t_new.py", line 185, in <module>
    trio.run(main, clock=MockClock(autojump_threshold=0))
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 2538, in run
    timeout = gen.send(next_send)
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 2951, in unrolled_run
    raise TrioInternalError("internal error in Trio - please file a bug!") from exc
trio.TrioInternalError: internal error in Trio - please file a bug!
Exception ignored in: <coroutine object Runner.init at 0x106a26500>
Traceback (most recent call last):
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 2156, in init
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1112, in __aexit__
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1288, in _nested_child_finished
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1252, in _add_exc
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 933, in _cancel
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 501, in recalculate
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1671, in _attempt_delivery_of_any_pending_cancel
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1644, in _attempt_abort
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_io_kqueue.py", line 162, in abort
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_io_kqueue.py", line 184, in abort
ValueError: I/O operation on closed kqueue object
Exception ignored in: <function Nursery.__del__ at 0x105b208b0>
Traceback (most recent call last):
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1486, in __del__
AssertionError: 
Exception ignored in: <coroutine object main at 0x106a26570>
Traceback (most recent call last):
  File "/Users/bogdan/wb/github/nucypher-async/t_new.py", line 180, in main
  File "/Users/bogdan/wb/repos/trio/src/trio/_timeouts.py", line 111, in sleep
  File "/Users/bogdan/wb/repos/trio/src/trio/_timeouts.py", line 91, in sleep_until
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 714, in __exit__
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 640, in _close
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 2993, in current_task
RuntimeError: must be called from async context
Exception ignored in: <function Nursery.__del__ at 0x105b208b0>
Traceback (most recent call last):
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1486, in __del__
AssertionError: 
bogdan@lair ~/w/g/nucypher-async (todo-party) [1]> python t_new.py
0.0
Traceback (most recent call last):
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 2813, in unrolled_run
    assert isinstance(runner.clock, _core.MockClock)
AssertionError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/bogdan/wb/github/nucypher-async/t_new.py", line 186, in <module>
    trio.run(main, clock=MockClock(autojump_threshold=0))
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 2538, in run
    timeout = gen.send(next_send)
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 2951, in unrolled_run
    raise TrioInternalError("internal error in Trio - please file a bug!") from exc
trio.TrioInternalError: internal error in Trio - please file a bug!
Exception ignored in: <coroutine object Runner.init at 0x106ade500>
Traceback (most recent call last):
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 2156, in init
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1112, in __aexit__
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1288, in _nested_child_finished
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1252, in _add_exc
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 933, in _cancel
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 501, in recalculate
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1671, in _attempt_delivery_of_any_pending_cancel
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1644, in _attempt_abort
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_io_kqueue.py", line 162, in abort
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_io_kqueue.py", line 184, in abort
ValueError: I/O operation on closed kqueue object
Exception ignored in: <function Nursery.__del__ at 0x105bd8820>
Traceback (most recent call last):
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1486, in __del__
AssertionError: 
Exception ignored in: <coroutine object main at 0x106ade570>
Traceback (most recent call last):
  File "/Users/bogdan/wb/github/nucypher-async/t_new.py", line 181, in main
  File "/Users/bogdan/wb/repos/trio/src/trio/_timeouts.py", line 111, in sleep
  File "/Users/bogdan/wb/repos/trio/src/trio/_timeouts.py", line 91, in sleep_until
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 714, in __exit__
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 640, in _close
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 2993, in current_task
RuntimeError: must be called from async context
Exception ignored in: <function Nursery.__del__ at 0x105bd8820>
Traceback (most recent call last):
  File "/Users/bogdan/wb/repos/trio/src/trio/_core/_run.py", line 1486, in __del__
AssertionError: 

fjarri avatar Dec 08 '25 23:12 fjarri

Note while debugging, this trio.abc.Clock adapted from _run.py seems to work:

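import time

import trio


class SystemClock(trio.abc.Clock):
    def start_clock(self) -> None:
        # Nothing to set up: time.perf_counter() is always running.
        pass

    def current_time(self) -> float:
        return time.perf_counter()

    def deadline_to_sleep_time(self, deadline: float) -> float:
        # Real seconds left until the deadline.
        return deadline - self.current_time()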

It looks like if current_time raises an error, this turns into a TrioInternalError which is not ideal! But it shows the error too.

A5rocks avatar Dec 08 '25 23:12 A5rocks

I'm fairly certain the root cause is that we don't actually allow MockClock-like clocks, so to accomplish it we do some funny things. Specifically, we tightly couple it with the loop, i.e. we have MockClock set an attribute on the loop so the loop knows it has a MockClock and not anything else. So in this case, here's what I think is happening:

  1. the loop calls deadline_to_sleep_time
  2. this calls FooMockClock.deadline_to_sleep_time, which in turn calls MockClock.deadline_to_sleep_time
  3. since rate wasn't passed to MockClock's constructor, MockClock.deadline_to_sleep_time will return 999999999... so the loop will sleep 999999999 seconds
  4. this is normally fine because of the following code, but in this case the condition is false (runner.clock is the FooMockClock wrapper, not the inner MockClock), so the threshold is never set
            if runner.clock is self:
                runner.clock_autojump_threshold = self._autojump_threshold
  5. so the loop is just sleeping 999999999 seconds...

I haven't fully traced through everything so it's possible I missed something... but that seems reasonable to me. A fix is probably to check that runner.clock_autojump_threshold exists in deadline_to_sleep_time if rate == 0.0. It's not an actual error in Trio, but it's nice to avoid confusing people!

tl;dr Your custom clock is wrong which is causing long sleeps. However, Trio could and should help debug issues like this.
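
The steps above can be reproduced in a toy model. This is only a sketch: `ToyRunner`, `ToyMockClock` and `ToyWrapper` are hypothetical stand-ins written for illustration, not Trio's real classes, and the runner is passed to `start_clock()` explicitly just to keep the example self-contained.

```python
import math


class ToyRunner:
    """Hypothetical stand-in for the run loop; not Trio's real Runner."""

    def __init__(self, clock):
        self.clock = clock
        self.clock_autojump_threshold = math.inf  # inf: never autojump

    def next_sleep(self, deadline):
        timeout = self.clock.deadline_to_sleep_time(deadline)
        # The loop, not the clock, shortens the sleep when autojump is armed.
        if self.clock_autojump_threshold < timeout:
            timeout = self.clock_autojump_threshold
        return timeout


class ToyMockClock:
    """Hypothetical stand-in for a rate-0 mock clock."""

    def __init__(self, autojump_threshold=math.inf):
        self._autojump_threshold = autojump_threshold

    def start_clock(self, runner):
        # Same identity check as in the snippet quoted above.
        if runner.clock is self:
            runner.clock_autojump_threshold = self._autojump_threshold

    def deadline_to_sleep_time(self, deadline):
        # Time is frozen at 0.0, so any future deadline looks unreachable.
        return 0 if deadline <= 0.0 else 999999999


class ToyWrapper:
    """Forwards every call to an inner ToyMockClock, like FooMockClock."""

    def __init__(self):
        self._clock = ToyMockClock(autojump_threshold=0)

    def start_clock(self, runner):
        self._clock.start_clock(runner)

    def deadline_to_sleep_time(self, deadline):
        return self._clock.deadline_to_sleep_time(deadline)


def sleep_time_for(clock):
    runner = ToyRunner(clock)
    clock.start_clock(runner)
    return runner.next_sleep(deadline=2.0)


direct = sleep_time_for(ToyMockClock(autojump_threshold=0))
wrapped = sleep_time_for(ToyWrapper())
print(direct, wrapped)  # 0 999999999
```

When the clock is used directly, the identity check passes and the loop's sleep collapses to the autojump threshold. When it is wrapped, the check fails silently and the loop keeps the huge timeout.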

A5rocks avatar Dec 09 '25 00:12 A5rocks

tl;dr Your custom clock is wrong

I would disagree with this assessment - my clock correctly impls abc.Clock, so I've done my part. If there's something else the clock should do, it should be in the ABC as well.

This part

            if runner.clock is self:
                runner.clock_autojump_threshold = self._autojump_threshold

seems hacky to me. Why have a separate runner.clock_autojump_threshold when it's a property of the clock? Should there be instead an autojump_threshold property in the ABC which the runner will use?

fjarri avatar Dec 09 '25 00:12 fjarri

Because designing a public API to satisfy all MockClock-shaped usages is challenging (especially with n=1 examples to draw from)! Your clock is wrong in that you're using an API (MockClock) wrong within it, rather than the interface it exposes.

What specific case are you trying to extend MockClock to do? See https://github.com/python-trio/trio/issues/1587 for why runner.clock_autojump_threshold is a thing rather than something more public.

A5rocks avatar Dec 09 '25 00:12 A5rocks

Your clock is wrong in that you're using an API (MockClock) wrong within it

Well, that's technically true, but there's nothing in the documentation about deadline_to_sleep_time() needing to use some undocumented ways to get the current runner and then setting an attribute in it.

What specific case are you trying to extend MockClock to do?

I have code that works with real time extensively (creation/expiration dates, internal timeouts for recurring tasks etc). So I cannot use trio.current_time(). I also want to test it with an autojump clock, so I can't just query system time. So I pass a clock object as DI, which allows me to choose between the two. Now in tests, I don't want to have two fixtures, pytest_trio.autojump_clock, and my mock clock object, which essentially do the same thing and need to be always used together. So I want my mock clock object to impl trio.abc.Clock, so that pytest_trio could pick it up and pass to trio.run().
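
As a rough illustration of that setup (all names here, `TimeSource`, `WallClock`, `ManualClock` and this `Server`, are hypothetical and do not come from the actual project), the dependency-injected clock looks something like this:

```python
import time
from typing import Protocol


class TimeSource(Protocol):
    """Hypothetical interface: anything that can report the current time."""

    def current_time(self) -> float: ...


class WallClock:
    """Production time source: real absolute time (seconds since the epoch)."""

    def current_time(self) -> float:
        return time.time()


class ManualClock:
    """Test time source: only moves when the test advances it."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def current_time(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


class Server:
    """Toy server that tracks expiration dates via the injected clock."""

    def __init__(self, clock: TimeSource) -> None:
        self._clock = clock
        self._expires_at = {}  # key -> absolute expiration timestamp

    def put(self, key: str, ttl: float) -> None:
        self._expires_at[key] = self._clock.current_time() + ttl

    def is_expired(self, key: str) -> bool:
        return self._clock.current_time() >= self._expires_at[key]


clock = ManualClock(start=1_700_000_000.0)
server = Server(clock=clock)
server.put("session", ttl=60.0)
before = server.is_expired("session")  # still valid
clock.advance(61.0)
after = server.is_expired("session")  # now expired
print(before, after)  # False True
```

The sketch leaves out the part this issue is about: making the test clock also drive `trio.run()`, so that `trio.sleep()` and the server agree on what time it is.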

To summarize, there is a workaround for what I want to do, it'll just make the code more prone to future errors. But I do think that either the attribute setting hack should be fixed, or abc.Clock should be marked as private (if indeed the attribute setting is unavoidable), because as it stands it cannot be implemented by an external user.

See https://github.com/python-trio/trio/issues/1587 for why runner.clock_autojump_threshold is a thing rather than something more public.

It seems that the issue listed exposing autojump_threshold as one of the options. And it is not like it's some internal thing - there is a description in the docs about what it does, and the user is allowed to change it (as a constructor argument, and later in a setter). The attribute is already public in MockClock, why not make it public in abc.Clock?

fjarri avatar Dec 09 '25 01:12 fjarri

But I do think that either the attribute setting hack should be fixed, or abc.Clock should be marked as private (if indeed the attribute setting is unavoidable), because as it stands it cannot be implemented by an external user.

Hmm, I don't quite follow. It seems possible to implement the abc.Clock interface, just not to do something like MockClock. I guess the docs are a little confusing about this (they do say "feel the need to take direct control over the PASSAGE OF TIME ITSELF" but that's more a joke, IMO :-).

I also want to test it with an autojump clock, so I can't just query system time.

I think a rate=1.0 MockClock uses system time? I guess you mean that the current_time() will be wrong if an autojump happens? I don't think that would be fixable with any proposed API, as I assume the loop checks current_time on wakeup, rather than taking on faith that the deadline has expired. Otherwise, it sounds to me like you can use this clock at runtime:

class SystemClock(trio.abc.Clock):
    def start_clock(self) -> None:
        pass

    def current_time(self) -> float:
        return time.perf_counter()

    def deadline_to_sleep_time(self, deadline: float) -> float:
        return deadline - self.current_time()

And then use MockClock(rate=1.0, autojump_threshold=0) at test time...

Let me know if I'm misunderstanding your use case!

A5rocks avatar Dec 09 '25 01:12 A5rocks

It seems possible to implement the abc.Clock interface, just not to do something like MockClock.

Well, what is possible then? I find the behavior where I literally just pass through all the ABC calls to MockClock, and the result does not work, very surprising. It means the ABC is not actually A.

I think a rate=1.0 MockClock uses system time?

It uses the system time rate, but trio.current_time() still starts from 0. I don't need just real time intervals, I need real absolute dates in production.

Otherwise, it sounds to me like you can use this clock at runtime

The code you provided does not have the autojump behavior, that is if I do trio.sleep(), it actually waits the requested period in real time.

fjarri avatar Dec 09 '25 01:12 fjarri

It uses the system time rate, but trio.current_time() still starts from 0. I don't need just real time intervals, I need real absolute dates in production.

Ah, sorry, I misread the code. You can work around this with:

clock = trio.testing.MockClock(rate=1.0)
clock._real_base = 0
# alternative: `clock.jump(time.perf_counter())`, though this will be very slightly off...
trio.run(..., clock=clock)

but that's certainly annoying. I'd be happy to add a parameter to MockClock.
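
For context on why zeroing `_real_base` has that effect, here is a minimal sketch. It assumes the clock maps real time to virtual time linearly, as `virtual_base + rate * (real - real_base)`; `ToyLinearClock` and `fake_real_clock` are hypothetical names used for illustration, not Trio code.

```python
class ToyLinearClock:
    """Hypothetical clock using a linear real-to-virtual time mapping."""

    def __init__(self, real_clock, rate=0.0):
        self._real_clock = real_clock
        self._rate = rate
        self._real_base = real_clock()  # captured at construction time
        self._virtual_base = 0.0

    def current_time(self):
        elapsed_real = self._real_clock() - self._real_base
        return self._virtual_base + self._rate * elapsed_real


# A fake "real" clock so the example is deterministic.
real_now = [5000.0]


def fake_real_clock():
    return real_now[0]


clock = ToyLinearClock(fake_real_clock, rate=1.0)
real_now[0] += 3.0

relative = clock.current_time()  # counts from construction
clock._real_base = 0
absolute = clock.current_time()  # now equals the underlying clock's reading
print(relative, absolute)  # 3.0 5003.0
```

Under that assumption, resetting the base turns "seconds since the clock was created" into "whatever the underlying real clock reports".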


However, I still don't really get the use case here. So you can't use trio.current_time() because it's wrong w/r/t real time? If so, you can replace the clock such that it is real time! The only problem is, yeah, MockClock will be wrong. But you can use the above if that's the only issue. I don't really get the thing about passing around two things or whatever.

A5rocks avatar Dec 09 '25 01:12 A5rocks

I think I can be a bit clearer. Here's a table where what Trio provides works without workarounds:

| in which scenario | auto jump | accurate time |
|-------------------|-----------|---------------|
| test              | x         |               |
| production        |           | x             |

In this case, you would use SystemClock (above) in production and MockClock in test.

Here's a table where what Trio provides works, but in an annoying way (my latest comment is regarding this):

| in which scenario | auto jump | accurate time |
|-------------------|-----------|---------------|
| test              | x         | x             |
| production        |           | x             |

In this case, you would use SystemClock in production and MockClock with the above workaround in test. I think we can improve this by providing MockClock(based_on_real_time=True) or something.

A5rocks avatar Dec 09 '25 02:12 A5rocks

The ABC itself is A, the problem is that the specific concrete implementation in trio.testing.MockClock cheats and uses some internal private APIs beyond those provided by the ABC. And it turns out those weren't written carefully enough to let you use it inside the implementation of your own clock. Whoops.

It's actually possible to implement MockClock with public APIs (the original version worked that way), but as described in https://github.com/python-trio/trio/issues/1587 that had some tricky edge cases that were easiest to fix by cheating. That's one direction you could take.

Another option would be to fix the cheating APIs so that you can use them too without cheating.

Nathaniel J. Smith -- https://vorpus.org

njsmith avatar Dec 09 '25 05:12 njsmith

I don't really get the thing about passing around two things or whatever.

I've got a Server that takes a clock object on initialization (because as I explained above I can use neither trio.current_time() nor system time directly). So every test has to have two fixtures:

def test_something(autojump_clock, my_clock, ...):
    server = Server(clock=my_clock)

autojump_clock comes from pytest_trio and sets the clock of trio.run(). I want my_clock to be passed to trio.run() as well, so that I don't need to mention autojump_clock. For that it has to derive from trio.abc.Clock, which is currently impossible to implement correctly (using only public API).

Using trio.current_time() inside and passing a special clock to trio.run() in production is not a good solution, since it's easy to forget to do it - a user running a Server in their own event loop has to remember this caveat.

And I find that the conversation is being steered into people trying to explain to me how I don't really want to do what I want to do, whereas my point is that trio.abc.Clock is currently incomplete and dangerous to use, so it needs to be either amended, or marked as private. I say it with all possible respect (re-reading my previous messages, they are somewhat abrasive), as I have been enjoying using trio for years now, and hope to continue doing so.

Another option would be to fix the cheating APIs so that you can use them too without cheating.

That would be ideal, and I can try and come up with a PR.

Edit: see #3371 as a draft of the proposed solution

fjarri avatar Dec 09 '25 22:12 fjarri

For that it has to derive from trio.abc.Clock, which is currently impossible to implement correctly (using only public API).

I'm a bit confused about what you mean here. I assume you mean that trio.testing.MockClock's autojump functionality cannot be implemented correctly? (though it does sound like you are talking about how trio.abc.Clock cannot be done. I think you might be misunderstanding the [current] model for it...)

[re: your draft/PR]

Personally, in ranked order, I prefer:

  1. extend trio.testing.MockClock to be based on system time (i.e. add a kwarg)
  2. expose the attribute on the runner (actually, now I see the issue here... jump would have to be a thing :/)
  3. expose real_base on trio.testing.MockClock
  4. updating the trio.abc.Clock interface

A couple reasons against (4):

  • it's a breaking change (at least how you implemented it)
  • every clock will have to become aware of it
    • what does autojump even mean if I have a clock based on e.g. the computer's clock speed? (I can't come up with good custom clocks, but this is general)

EDIT: I thought more and realized everything I can think of kinda sucks. (except (1), since that lets us delay solving this problem)

a user running a Server in their own event loop has to remember this caveat.

Thanks, this is the bit I was missing.


I hadn't quite realized what you were doing... I suppose your Server is calling clock.current_time() instead of trio.current_time()?

A5rocks avatar Dec 10 '25 02:12 A5rocks

I'm a bit confused about what you mean here. I assume you mean that trio.testing.MockClock's autojump functionality cannot be implemented correctly?

Yes, you're right. MockClock is impossible to extend (or reimplement).

it's a breaking change (at least how you implemented it)

It doesn't have to be, autojump() and autojump_threshold() can have default implementations (do nothing, and return inf, respectively). Although seeing as how trio has breaking releases every few months, with no releases in-between, I didn't think it would be a big deal.
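
A minimal sketch of what that could look like. `ClockSketch` is a hypothetical stand-in for the ABC, and the exact signatures of `autojump()` and `autojump_threshold()` are assumptions made for illustration, not what the draft PR necessarily contains:

```python
import math
from abc import ABC, abstractmethod


class ClockSketch(ABC):
    """Hypothetical clock ABC with optional autojump hooks."""

    @abstractmethod
    def start_clock(self) -> None: ...

    @abstractmethod
    def current_time(self) -> float: ...

    @abstractmethod
    def deadline_to_sleep_time(self, deadline: float) -> float: ...

    # Optional hooks: the defaults keep existing subclasses working.
    def autojump(self) -> None:
        """Do nothing by default (a real clock cannot jump)."""

    def autojump_threshold(self) -> float:
        """Return inf by default, meaning: never autojump."""
        return math.inf


class ExistingClock(ClockSketch):
    """A clock written before the hooks existed; it overrides neither."""

    def start_clock(self) -> None:
        pass

    def current_time(self) -> float:
        return 0.0

    def deadline_to_sleep_time(self, deadline: float) -> float:
        return deadline - self.current_time()


legacy = ExistingClock()
threshold = legacy.autojump_threshold()
legacy.autojump()  # no-op
print(threshold)  # inf
```

Because the new methods are concrete, a subclass that only implements the three abstract methods still instantiates and gets "never autojump" behaviour.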

what does autojump even mean if I have a clock based on e.g. the computer's clock speed?

I think that mostly indicates a problem with the name... It's more of a deadline_to_jump(), and should take a deadline argument (that is, the runner will pass it, instead of MockClock querying it internally from _core.current_statistics()). Naturally, for a "real" clock it is a no-op. It also feels to me that three methods (deadline_to_sleep_time(), autojump(), autojump_threshold()) is too many, since their responsibilities intersect, but I need to think about it more. And we'd be getting into actual breaking change territory.

I hadn't quite realized what you were doing... I suppose your Server is calling clock.current_time() instead of trio.current_time()?

Yes, exactly.

fjarri avatar Dec 10 '25 07:12 fjarri