Async/Await for Signals
It should be possible to implement Future for signals, and add support for async functions in exported methods, with them returning a signal.
These two things combined should allow us to use async/await syntax when calling methods both in rust and gdscript.
This is going to require a lot of upfront work before it's feasible to start working on this feature. First an actual implementation of signals are needed, but we'll also likely need to create some godot-objects directly in our bindings that will handle the awaiting and execution of these futures, and that is not something we have planned/thought out how to do yet.
I implemented a POC for a SignalFuture and minimal async runtime that allows to await signals. https://github.com/godot-rust/gdext/commit/5891c5bb2a2ca15e9a1ba904797bda608dcc997d
It still has two issues which I did not bother to resolve yet:
- ~~it's currently not possible to spawn nested tasks. It should be possible to resolve that, but I'm not sure how important it actually is.~~ EDIT: has been resolved.
- Only Signal arguments that are
Sync + Sendcan currently be used in the async context. This makes it impossible to access anyGd<T>coming from a signal.
@TitanNano I think your implementation doesn't work correctly if the Callable is called on another thread than where the task has been created.
https://github.com/jrb0001/gdext/commit/74afb5b4b58a0c1cfee6e7f1d9a48a22dc57d9b0 is a variant without the thread-local, but it needs some unsafe and panics when called from another thread. It should also fix recursively creating tasks, but I didn't test it.
https://github.com/jrb0001/gdext/commit/a27d0bc384e27238410df7ac6200ed6874bfca0c is a third, more complex variant which has both !Send and Send futures. It can only be called by another thread if both the task and the future are Send. Otherwise it panics or doesn't even compile.
Sync isn't needed here because polling a Future needs a &mut so only one thread can access it at a time anyway.
@jrb0001 yes that is true. I ignored that so far. Have you encountered a case were the callable was called on an other thread?
https://github.com/jrb0001/gdext/commit/74afb5b4b58a0c1cfee6e7f1d9a48a22dc57d9b0 is a variant without the thread-local, but it needs some unsafe and panics when called from another thread. It should also fix recursively creating tasks, but I didn't test it.
I don't see the benefit of this if it still panics.
https://github.com/jrb0001/gdext/commit/a27d0bc384e27238410df7ac6200ed6874bfca0c is a third, more complex variant which has both !Send and Send futures. It can only be called by another thread if both the task and the future are Send. Otherwise it panics or doesn't even compile.
Sync isn't needed here because polling a Future needs a &mut so only one thread can access it at a time anyway.
Gd<T> is neither Send nor Sync so as soon as either of them is required the future becomes much less useful since you can't access any Gd<T> inside of it.
@jrb0001 yes that is true. I ignored that so far. Have you encountered a case were the callable was called on an other thread?
No, I spotted it while reading through it.
jrb0001@74afb5b is a variant without the thread-local, but it needs some unsafe and panics when called from another thread. It should also fix recursively creating tasks, but I didn't test it.
I don't see the benefit of this if it still panics.
The panic is guaranteed while your version can run another future which was assigned the same ID on the thread of the signal.
jrb0001@a27d0bc is a third, more complex variant which has both !Send and Send futures. It can only be called by another thread if both the task and the future are Send. Otherwise it panics or doesn't even compile.
Sync isn't needed here because polling a Future needs a &mut so only one thread can access it at a time anyway.
Gd<T>is neitherSendnorSyncso as soon as either of them is required the future becomes much less useful since you can't access anyGd<T>inside of it.
LocalSignalFuture of the third variant works with Gd<T> both as signal argument and when captured by a closure with the limitation that the future is !Send and panics if the signal happens on another thread. I think that solves the main usecase?
SendSignalFuture of the third variant is less useful, I agree. Maybe it makes sense to merge them and select the implementation based on the experimental-threads feature?
The panic is guaranteed while your version can run another future which was assigned the same ID on the thread of the signal.
This can easily be solved by adding the ThreadId into the Waker.
LocalSignalFuture of the third variant works with Gd<T> both as signal argument and when captured by a closure with the limitation that the future is !Send and panics if the signal happens on another thread. I think that solves the main usecase?
But it adds a lot of complexity without solving any problems (at least as far as I'm aware). Correctly, panicking can be solved like I mentioned above.
SendSignalFuture of the third variant is less useful, I agree. Maybe it makes sense to merge them and select the implementation based on the experimental-threads feature?
Even with extermental-threads Gd<T> is still not thread-safe. Without access to engine managed objects, I don't see much value for a Send + Sync future. But we can even offer a thread-safe task spawn function which either moves the future into the Waker or uses a static (i.e. not thread-local) ASYNC_RUNTIME.
I have been thinking about the stated concern of signals being emitted on other threads, and I believe the correct solution would be to redirect the wakers future polling into the correct thread. Unfortunately, this is not possible as far as I can see. Godot does not expose any way to queue callable for a specific thread. Generally, there is also too little control over where things get executed to come up with a reliable approach.
So the alternatives for now that I can see are:
a) limit async tasks to the main thread by panicking when trying to spawn a task on another thread. With this limitation in place, we can connect to signals in deferred mode inside the Signal Future and reliably run the callable on the main-thread.
b) properly panic when the callable runs on a different thread than the future and leave the rest up to the user. This is more flexible but much less predictable, so I think a) is probably preferred.
Once there is a thread-safe version of Gd<T> we can add a way to spawn Send + Sync futures and poll them on whatever thread the signal is emitted. This should be quite trivial.
There are seem to be some issues related to emitting a signal from a callable connected to itself. This will call the Callable a second time and its Mutex is already locked. From looking at Godot source, it seems that oneshot connections are disconnected after calling the Callable. So it needs to either ignore calls while the Mutex is locked or use deferred mode.
I also tried implementing futures_lite::Stream and the only way to avoid the issue there is to use deferred mode, as far as I can see.
So if deferred mode signals always run on the main thread, then that's probably the best option.
I also ran into an issue with the async-channel crate which deadlocks when the receiving Future is polled from inside the Waker (also caused by a Mutex). std::task documentation doesn't prohibit our behavior, but I wouldn't be surprised if other crates have similar issues. There can be quite some recursion if there is a chain of futures, each doing something that wakes up the next one. I think it would be better to simply avoid calling user code from inside the waker, if that's even possible.
I ended up using an Autoload with a smol::LocalExecutor, ~10 lines. Maybe there is some other way to execute code every frame, without adding a node or registering a class?
There are seem to be some issues related to emitting a signal from a callable connected to itself. This will call the Callable a second time and its Mutex is already locked. From looking at Godot source, it seems that oneshot connections are disconnected after calling the Callable. So it needs to either ignore calls while the Mutex is locked or use deferred mode.
I think this was changed in 4.3. In 4.2 it's disconnected after calling and in 4.3 it's disconnected first.
I think I found another two issues that affect all our implementations:
- Dropping the
Futuredoesn't disconnect theCallable. - Disconnecting the
Callablecauses theFutureto get stuck forever.
Disconnecting requires the Callable so we need to keep it around somewhere in the Future. Which means that its internal refcount will never drop to zero so the Callable is never dropped. I don't know any other way to reliably detect a disconnected signal.
Disconnecting requires the Callable so we need to keep it around somewhere in the Future. Which means that its internal refcount will never drop to zero so the Callable is never dropped. I don't know any other way to reliably detect a disconnected signal.
We can use a WeakRef I think.
I also ran into an issue with the async-channel crate which deadlocks when the receiving Future is polled from inside the Waker (also caused by a Mutex). std::task documentation doesn't prohibit our behavior, but I wouldn't be surprised if other crates have similar issues. There can be quite some recursion if there is a chain of futures, each doing something that wakes up the next one. I think it would be better to simply avoid calling user code from inside the waker, if that's even possible.
We might want to connect to the signal in the normal way and schedule a deferred callable to actually poll the future. But I haven't tried it yet.
Disconnecting requires the Callable so we need to keep it around somewhere in the Future. Which means that its internal refcount will never drop to zero so the Callable is never dropped. I don't know any other way to reliably detect a disconnected signal.
We can use a
WeakRefI think.
Isn't that only for RefCounted? Unfortunately Callable are their own type, not even an Object.
I also ran into an issue with the async-channel crate which deadlocks when the receiving Future is polled from inside the Waker (also caused by a Mutex). std::task documentation doesn't prohibit our behavior, but I wouldn't be surprised if other crates have similar issues. There can be quite some recursion if there is a chain of futures, each doing something that wakes up the next one. I think it would be better to simply avoid calling user code from inside the waker, if that's even possible.
We might want to connect to the signal in the normal way and schedule a deferred callable to actually poll the future. But I haven't tried it yet.
I had that idea too but didn't find a generic way to schedule it. There is Callable::call_deferred but that isn't exposed to gdextension as far as I can see. Maybe something like callable.to_variant().call("call_deferred", args) but not sure.
Isn't that only for RefCounted? Unfortunately Callable are their own type, not even an Object.
Yeah right I missed that.
Disconnecting requires the Callable so we need to keep it around somewhere in the Future. Which means that its internal refcount will never drop to zero so the Callable is never dropped. I don't know any other way to reliably detect a disconnected signal.
Why do you believe it will never be dropped? As soon as the future is dropped, the callable should be dropped as well, shouldn't it?
I had that idea too but didn't find a generic way to schedule it. There is Callable::call_deferred but that isn't exposed to gdextension as far as I can see. Maybe something like callable.to_variant().call("call_deferred", args) but not sure.
Looks like Callable::call_deferred is only available as a varcall and those are not implemented for built-in types. I will see if your suggestion work and we should see that Callable::call_deferred is properly implemented as a varcall.
I added the handling of the following cases to my implementation:
-
godot_taskshould panic if the current thread is not the main thread. -
Waker::wakeshould panic if the current thread ID doesn't match the original one. - The future is now polled inside a deferred callable, so waking and polling are no longer happening in the same call stack. This way, signals emitted on a different thread should also be redirected to poll the future on the main thread now.
-
SignalFuturedisconnects its callable if the signal hasn't been emitted when the future is dropped.
https://github.com/godot-rust/gdext/compare/7634fe769d1fcb66209586f0b6c06aac40978253...4c2d9a85a0ea29fc0ade525e8538c216cd792c34?expand=1
EDIT: added handing for nested godot_tasks and clean-up when deinitializing. https://github.com/godot-rust/gdext/compare/7634fe769d1fcb66209586f0b6c06aac40978253...341a40c0a144e57edc8392090a55881b2b15f50a
Disconnecting requires the Callable so we need to keep it around somewhere in the Future. Which means that its internal refcount will never drop to zero so the Callable is never dropped. I don't know any other way to reliably detect a disconnected signal.
Why do you believe it will never be dropped? As soon as the future is dropped, the callable should be dropped as well, shouldn't it?
If the Future is dropped, then yes, it will work as intended. But if the signal is disconnected (freed object / through code), then the Future will wait for a wake-up which will never come.
If the Future is dropped, then yes, it will work as intended. But if the signal is disconnected (freed object / through code), then the Future will wait for a wake-up which will never come.
Yes, if someone goes out of their way to disconnect the callable, then the future will never complete. I don't see how this can be avoided. I wonder how this behaves in GDScript.
If something intentionally disconnect a callable it didn't connect, then we can blame that. But can we expect every code that calls Node::queue_free(), decrements a RefCounted or otherwise frees an object to check if there is any callable connected to any signal on that object?
It's possible to detect the freed object case by polling Signal::is_null() but that's too heavy to do by default.
I did a quick test and it looks like gdscript await also gets stuck, or at least doesn't continue with the next statement.
@jrb0001 I pretty much had the same ideas and came to the same conclusion. Getting a callback on object deletion is just not possible, and checking all pending futures every frame is excessive.
A middle ground solution that came to me could be: If the AsyncRuntime Vec is filled to capacity (.len() == .capacity()) when adding a new task, then we could go through the pending futures and try to garbage collect some of them before extending the capacity.
Alternatively, we could always check for dead futures during insert, since we already scan the buffer for empty slots anyway.
I don't think we can know when a future is dead, though. If we want to keep this runtime agnostic (and I strongy believe we should) we have no information when a future will stop making progress.
Best we can do is to return a TaskHandle from godot_task and let the caller manually cancel the task.
It looks like it is possible to use a new Callable for disconnecting, as long as they are equal and hash() returns the same value.
Signal disconnected or object freed:
- Connected
Callableis dropped automatically. - Notify
Future.
Future dropped:
- Do nothing if
signal.object().map(|o| o.is_instance_valid()) != Some(true)(freed object). - Use second
Callableto disconnect the signal. - Connected
Callableis dropped automatically.
Notify
Future.
How do you do this?
Future dropped:
- Do nothing if signal.object().map(|o| o.is_instance_valid()) != Some(true) (freed object).
- Use second Callable to disconnect the signal.
- Connected Callable is dropped automatically.
I already implemented this.
Notify
Future.How do you do this?
You probably need to change the future to return Option<T>/Result<T, _>, unless you want to let it panic on disconnect. Then inside the Drop impl just write into your state and wake the Waker, just like you do when called from Godot.
You probably need to change the future to return Option<T>/Result<T, _>, unless you want to let it panic on disconnect. Then inside the Drop impl just write into your state and wake the Waker, just like you do when called from Godot.
Let's assume we can create a custom callable which wakes the future both on invocation and on Drop. The callable also can be cloned and maintains the same hash, so we don't have to reference the engine callable in the future.
New references to the callable can still be created via signal.get_connections(). So Object.free() != drop(callable), we can't rely on the callable being dropped when the object is freed.
Or do you see another way to detect when the object is freed?
True, as long as some other code keeps the reference it got from signal.get_connections(), the Callable isn't dropped. Such code could also call it directly or connect it to some other signal and I think that could do more damage than just a stuck Future.
We can't guarantee much in presence of malicious code (except the basic rust guarantees), but in my opinion best-effort detection of stuck futures is better than not having anything at all.
The guaranteed way to detect a freed object is through something like this:
let object: InstanceId = signal.object_id().unwrap();
...;
let is_freed: bool = Gd::try_from_instance_id(object).is_err();
It requires polling which can get expensive if there are hundreds/thousands of waiting futures and it doesn't help in the signal.disconnect() case. The Drop impl approach is much cheaper and should be good enough in most situations.
I have now added:
- A
TaskHandlethat can be used to cancel a task and check if the task is still pending. - A second
GuaranteedSignalFuturethat resolves to anOption<R>and wakes the future if theCallableis dropped.
https://github.com/godot-rust/gdext/compare/7634fe769d1fcb66209586f0b6c06aac40978253...a9d4f0e5bd3f7d4ba6231f1dcc00145d4ae3723e
Let's see how well this all works. I'm also open to suggestions for a better name, for the second future.
@Bromeon are you interested in including this into the project? I will start testing this in my project now, but I happy to promote it to a PR.
Very cool to see progress on this front! :+1:
@TitanNano I'm trying to catch up, can you maybe summarize how the user API of this might look? Any notable trade-offs/downsides? Can it be feature-gated?
@Bromeon sure, so far, we have the following:
User-facing APIs
-
godot_task(...)This should be used to create a new async task on our runtime. TheTaskHandlecan be used to check if the task is still in progress and cancel it if desired. -
TaskHandle::is_pending()returns aboolindicating whether the task is still in progress or has been completed. -
TaskHandle::cancel()allows canceling a pending task. -
ToSingalFuture::to_future()creates a newSignalFuturefrom an existing signal. -
ToGuaranteedSignalFuture::to_guaranteed_future()creates a newGuaranteedSignalFuture. -
SignalFutureresolves when the corresponding signal is emitted. -
GuaranteedSignalFutureresolves when the corresponding signal is emitted or the object of the signal is freed. This future resolves to an option.
Trade-offs
-
godot_task(...)accepts any future and does not require them to beSend + Sync. This has been chosen to allow nonSend + Syncvalues to be moved into the future, likeGd<T>. The consequence of this is that it can only be used on the main-thread and will panic on any other thread. Unfortunately, signals can be emitted from any thread and the engine only provides APIs to move execution back to the main-thread. Our non thread safe futures have to remain on their origin thread, and we can only do this on the main-thread. When thread-safe alternatives forGd<T>and other types exist, we can use thread-safe futures and have more flexibility in how and where we execute them. -
SignalFuturecould potentially never resolve if the object of the signal is freed while the future exists. The corresponding task has to be canceled manually. -
GuaranteedSignalFuturealso resolves when the object of the signal is freed. This is not 100% guaranteed, though. Users can acquire a reference to the signal connection of the future viaSignal::get_connections()and the future will not resolve until all references to the connectedCallableare dropped. - Both futures can only resolve to
Send + Syncvalues. Signal arguments have to be passed through aMutexbecause of theSend + Syncconstraint ofCallables. This makes it impossible to get aGd<T>,DictionaryorArrayfrom a signal.
Feature Gate
All changes are self-contained, so feature gating them should not be a problem.
@Bromeon let me know if you got any questions 😃