osu icon indicating copy to clipboard operation
osu copied to clipboard

API request failures caused by flushing queue on logout don't have the same thread safety guarantees as normal failures

Open ppy-sentryintegration[bot] opened this issue 1 year ago • 1 comments

Originally spotted in https://github.com/ppy/osu/pull/28991.

If the user logs out with several API requests still in queue (still not yet started), these web requests will not receive an APIAccess instance to schedule callbacks back to:

https://github.com/ppy/osu/blob/4ee5a1225e6e6578b717a5a64987130042b1ef19/osu.Game/Online/API/APIRequest.cs#L112

This means that failure callbacks from the flush-cancellation will not be scheduled back to the update thread:

https://github.com/ppy/osu/blob/4ee5a1225e6e6578b717a5a64987130042b1ef19/osu.Game/Online/API/APIRequest.cs#L180-L183

which makes essentially all Failure callbacks essentially not safe if they do anything like transform mutation, because the callback will not be automagically scheduled onto the update thread like it normally is.

@ppy/team-client Would appreciate thoughts on how to handle.

Sentry Issue: OSU-1J8M

osu.Framework.Graphics.Drawable+InvalidThreadForMutationException: Cannot mutate the Transforms on a Loaded Drawable while not on the update thread. Consider using Schedule to schedule the mutation operation.
  ?, in void Drawable.EnsureMutationAllowed(string action)
  ?, in void Transformable.AddTransform(Transform transform, ulong? customTransformID)
  ?, in TransformSequence<TThis> TransformableExtensions.TransformTo<TThis>(TThis t, Transform transform)
  ?, in TransformSequence<T> TransformableExtensions.FadeTo<T, TEasing>(T drawable, float newAlpha, double duration, in TEasing easing)
  ?, in TransformSequence<T> TransformableExtensions.FadeOut<T, TEasing>(T drawable, double duration, in TEasing easing)
...
(14 additional frame(s) were not displayed)

The sentry issue creation thing conveniently omitted these key stack frames from the trace in OP:

void APIRequest.Fail(Exception e)
void APIAccess.flushQueue(bool failOldRequests)
void APIAccess.handleFailure()
void APIAccess.handleWebException(WebException we)
bool APIAccess.handleRequest(APIRequest req)
void APIAccess.processQueuedRequests()
void APIAccess.run()

bdach avatar Aug 21 '24 10:08 bdach

I would say that the callbacks should always be scheduled locally in general, because the most that can be guaranteed (assuming this issue is fixed) is that the request is scheduled somewhere on the update thread, but if that is in OsuGame then whatever component is responding to the callback could already be disposed.

smoogipoo avatar Aug 29 '24 14:08 smoogipoo

I would say that the callbacks should always be scheduled locally in general

I can't say that I agree with that, in fact I'm almost willing to wager that it will cause silent breakage in places that cannot be easily resolved due to presence hijinks or whatever. Also I'm pretty sure that APIAccess was at one point very raw (callbacks executed on API thread by default) and that was changed for reasons I can't recall but wouldn't be surprised to be the exact same as above.

If we're worried about scheduling callbacks to disposed objects then I'd either add local IsDisposed checks to the callbacks or try some higher-level solutions wherein APIAccess can know to skip those.

bdach avatar Aug 30 '24 08:08 bdach

I think we want to guarantee that failures run on APIAccess' schedule, and we should just get the API to the request earlier in the process (ie. at the point of queueing, rather than Performing).

I don't think this touches on the contentious part discussed above so shall I go ahead with this?

peppy avatar Aug 30 '24 08:08 peppy

No objections here

bdach avatar Aug 30 '24 08:08 bdach