serenity icon indicating copy to clipboard operation
serenity copied to clipboard

`CacheRef` does not document how it should not be carried over an await point

Open Gaming32 opened this issue 2 months ago • 6 comments

If a channel is created using a GuildRef from a task other than the gateway task (e.g. an event handler task), a deadlock can ensue. This is because GuildRef wraps Ref. If the GuildRef is still being held when the Channel Create event is received, then serenity will attempt to update the Guild in the cache with new channels. This will call get_mut on the guilds DashMap, which may deadlock if a Ref is already held, thus causing a deadlock if the two tasks happen to be running on the same system thread at that particular time (which depends on tokio internals).

The root cause here is attempting to hold an equivalent of a sync lock across await boundaries. I don't really see a good fix for the root cause, but it may be good to keep an eye on https://github.com/xacrimon/dashmap/issues/150.

Gaming32 avatar Sep 30 '25 17:09 Gaming32

This is impossible to trigger in reality right, because CacheRef is !Send and therefore you cannot hold it over .await?

GnomedDev avatar Sep 30 '25 19:09 GnomedDev

Well I was holding it over await and this issue was occurring.

Gaming32 avatar Sep 30 '25 20:09 Gaming32

I'm also getting what appears to be a deadlock somewhere around the threads API, and I believe it's on get_archived_public_threads (judging by the logs of my app).

davidgomesdev avatar Oct 04 '25 19:10 davidgomesdev

@Gaming32 If I'm understanding from your code, you avoid the deadlock by always getting the guild channel? Rather than re-using one already gotten?

davidgomesdev avatar Oct 04 '25 20:10 davidgomesdev

I can see the problem here. Usually, CacheRef will prevent you holding any reference to the cache over an await point, which avoids any deadlocking... but you are running in an unusual way where you are performing this cache lookup in tokio::main, which is the one place tokio will let you run an async function that holds !Send over an await point (as you cannot spawn a task that holds a !Send value over an await point.

Since this is an unusual configuration, and CacheRef otherwise is such an improvement to the serenity API, I am going to say this is a documentation issue, not a bug in serenity. We need to document that CacheRef should not be held over an await, even though the compiler will stop you in 99% of times.

GnomedDev avatar Oct 04 '25 21:10 GnomedDev

@Gaming32 If I'm understanding from your code, you avoid the deadlock by always getting the guild channel? Rather than re-using one already gotten?

Correct. This is fairly cheap since it's just a cache lookup.

I can see the problem here. Usually, CacheRef will prevent you holding any reference to the cache over an await point, which avoids any deadlocking... but you are running in an unusual way where you are performing this cache lookup in tokio::main, which is the one place tokio will let you run an async function that holds !Send over an await point (as you cannot spawn a task that holds a !Send value over an await point.

Since this is an unusual configuration, and CacheRef otherwise is such an improvement to the serenity API, I am going to say this is a documentation issue, not a bug in serenity. We need to document that CacheRef should not be held over an await, even though the compiler will stop you in 99% of times.

Sounds reasonable.

Gaming32 avatar Oct 04 '25 21:10 Gaming32