
Nearly impossible to guarantee plugin messages will be observed in the correct order

Open lax1dude opened this issue 7 months ago • 8 comments

Requested Feature

The event bus needs a way to register a handler that is guaranteed to be executed on the calling thread, so that plugin messages and other packet-related events can be handled in the Netty event loop and be observed in the same order the packets are received. The async boolean in the @Subscribe annotation is not good enough since any async handler of a higher priority will cause the synchronous handler to be called after it in the async task thread pool instead of the thread that called the event, causing the order to be undefined.

My proposed solution, which is backwards compatible, is to add an additional flag to @Subscribe that forces the handler to run synchronously and puts it in a separate list that is sorted independently from the normal handlers; the event system would then call that list of special synchronous handlers before calling any of the regular handlers.
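As a sketch, registration under this proposal might look something like the following (the `forceSync` flag is hypothetical and does not exist in the current Velocity API; `PluginMessageEvent` is a real Velocity event type):

```java
// Hypothetical flag on Velocity's @Subscribe annotation; "forceSync"
// and the guaranteed dispatch order are part of this proposal, not
// the existing API.
@Subscribe(forceSync = true)
public void onPluginMessage(PluginMessageEvent event) {
    // Would run on the Netty event loop that fired the event, before
    // any regular (possibly async) handlers, so messages are observed
    // in the order the packets arrived.
}
```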

Why is this needed?

Velocity's event bus defaults to async for all event handlers, which means the order events are observed in is basically undefined due to preemptive multitasking. The issue is that events are dispatched across multiple threads in a thread pool, so it's essentially up to the OS to decide which ones actually get processed first. I've personally struggled with this, and unfortunately setting async to false in @Subscribe is not a solution unless you also set the priority to the absolute maximum. Even max priority won't really guarantee that events are observed in the order they are dispatched, though, unless you can guarantee that no other plugin author has also set the absolute maximum priority on an async handler, because Velocity might call that async handler first and process my synchronous one in the thread pool after the async one completes.

Alternative Solutions

The Velocity API provides no solution; people must hack into the Netty pipeline to handle this properly.

Additional Information

No response

lax1dude avatar Apr 25 '25 21:04 lax1dude

The event bus needs a way to register a handler that is guaranteed to be executed on the calling thread, so that plugin messages and other packet-related events can be handled in the Netty event loop and be observed in the same order the packets are received.

You've written a lot of text but in all of this, you didn't articulate a reason for why you need this.

The async boolean in the @Subscribe annotation is not good enough since any async handler of a higher priority will cause the synchronous handler to be called after it in the async task thread pool instead of the thread that called the event, causing the order to be undefined

That's not the intended purpose of async, though. The purpose of async = true is to tell Velocity that you're a bad citizen who wants to do blocking things and to put you in your plugin's own thread pool so you can do blocking operations without abandon, and async = false tells us that you promise not to block in exchange for potentially being permitted to run on the event loop. But because there are a lot of "bad citizens", you have to be prepared to not be running in the event loop.
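In other words, under the current contract the two flavors might be written like this (a sketch; `SomeEvent`, `database`, and `counter` are stand-ins, not real Velocity types):

```java
// "Bad citizen": may block, so Velocity runs it in the plugin's own pool.
@Subscribe(async = true)
public void blockingHandler(SomeEvent e) {
    database.save(e); // blocking I/O is tolerated here
}

// Promises not to block and in exchange may run on the event loop,
// but is not guaranteed to, so it must not rely on that.
@Subscribe(async = false)
public void fastHandler(SomeEvent e) {
    counter.increment(); // cheap, non-blocking work only
}
```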

If the API does not provide a way to do this, then the documentation should be amended to say that the order events will be observed is undefined and impossible to guarantee when using Velocity.

We have never made guarantees about the total ordering of events. For that matter, neither does BungeeCord - it just so happens that as designed, you get total ordering of events within the context of a single network connection. Velocity has never made this guarantee in the first place. In fact, old versions of Velocity didn't even support running event handlers within the event loop.

I also am not a big fan of your proposal to split "must run in the event loop" from "could run in the event loop or run async", because it has one major flaw. Take this hypothetical, contrived example:

@Subscribe(mustRunInEventLoop = true)
public EventTask someEventHandler(SomeEvent e) {
    return EventTask.async(() -> logger.info("hi"));
}

What's the proper outcome of this?

  1. Disallow this.
  2. Permit it by deferring the async task execution for later.
  3. Add logic to allow "jumping" from a plugin thread pool back to the event loop.

(1) removes a major advantage of the Velocity event system. (2) breaks semantics. (3) would require some very unwelcome coupling between the event system and player connection handling.

Could we do this? Maybe. But I think it would be a pretty bad design choice for what I judge is likely to be a minority use case. (Not providing a use case where this is important behavior is also factoring into my decision making. I haven't operated a Minecraft server in several years at this point, but when I've done so, I've never had a case where we needed to do proxy-level packet manipulation.)

I'm currently leaning towards "won't fix". If you have a particularly compelling use case for this, we may consider it.

astei avatar Apr 25 '25 21:04 astei

You've written a lot of text but in all of this, you didn't articulate a reason for why you need this.

@astei Well, if you must know exactly what I was doing: it got me when I tried to communicate with a Paper server in a case where the order of the plugin messages matters. I develop AOT-compiled HTML5 ports of old Java Edition versions, and they currently use plugin messages for skins, voice chat, and a number of other features, making it necessary for certain plugin messages to be ordered correctly. I was trying to relay the voice chat packets through the backend paper server to allow multi-proxy setups to be created and the latency for that over localhost is low enough for preemptive multitasking to cause the events coming from the backend to be observed in the wrong order. I was not fully aware that async is true by default until I debugged it. I'm going to move away from plugin messages for communication with the client, but communication with the backend will probably continue to use plugin messages to avoid breaking the protocol or having to open some kind of side channel.

That's not the intended purpose of async, though. The purpose of async = true is to tell Velocity that you're a bad citizen who wants to do blocking things and to put you in your plugin's own thread pool so you can do blocking operations without abandon, and async = false tells us that you promise not to block in exchange for potentially being permitted to run on the event loop. But because there are a lot of "bad citizens", you have to be prepared to not be running in the event loop.

Okay, noted.

We have never made guarantees about the total ordering of events. For that matter, neither does BungeeCord - it just so happens that as designed, you get total ordering of events within the context of a single network connection. Velocity has never made this guarantee in the first place. In fact, old versions of Velocity didn't even support running event handlers within the event loop.

Yeah, exactly: my issue is marked as a feature request, not a bug. I'm proposing a feature to allow for total ordering in cases like this, since the only workaround I can think of is hacking into the Netty pipeline. I would've filed a bug report instead of a feature request if I'd seen anything that says total ordering is guaranteed.

I also am not a big fan of your proposal to split "must run in the event loop" from "could run in the event loop or run async", because it has one major flaw. Take this hypothetical, contrived example:

I was not aware of that feature, although I'm not exactly sure what's flawed with the "Disallow this" option for handlers annotated with the flag. A theoretical mustRunInEventLoop would be false by default and only meant to be enabled in special cases where the entire point is for the event to be handled synchronously, so the advantages of the existing event handling behavior would not be lost anywhere they are currently desired.

lax1dude avatar Apr 25 '25 22:04 lax1dude

A mustRunInEventLoop sounds super side-effecty; the behavior of the entire event chain would be changed by some plugin you as a dev don't even care about, and you would suddenly need to care about it. I think that events are a fairly bad way of going about plugin messaging, and I wonder if we should adopt an API that allows a plugin to exclusively handle plugin messages on its channel (1:1, no multi-registration, etc.), which would potentially disable the event from firing for registered channels (or fire it as handled), akin to how we handle the bungee messaging one in a sense.
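One possible shape for such an API, purely as a sketch (every name here is hypothetical; only `ChannelIdentifier` and `Player` are existing Velocity types):

```java
// Hypothetical exclusive plugin-message handler; registering it would
// claim the channel 1:1 and suppress PluginMessageEvent for it (or
// fire the event pre-handled), per the idea above.
public interface PluginMessageSink {
    /** Invoked on the connection's event loop, in receive order. */
    void onPluginMessage(ChannelIdentifier channel, Player source, byte[] payload);
}

// A hypothetical registrar.registerExclusive(channel, sink) would fail
// if another plugin had already claimed the channel.
```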

electronicboy avatar Apr 25 '25 23:04 electronicboy

I think that events are a fairly bad way of going about plugin messaging, and I wonder if we should adopt an API that allows a plugin to exclusively handle plugin messages on its channel (1:1, no multi-registration, etc.), which would potentially disable the event from firing for registered channels (or fire it as handled)

That would probably be a better solution

lax1dude avatar Apr 25 '25 23:04 lax1dude

[...] I was trying to relay the voice chat packets through the backend paper server to allow multi-proxy setups to be created and the latency for that over localhost is low enough for preemptive multitasking to cause the events coming from the backend to be observed in the wrong order. [...]

As an example of how to handle this correctly, take WebRTC. It comprises a bunch of protocols, largely UDP-based (SCTP, RTP), to handle data ordering. You could copy this model. If you have a packet that's out of order, you wait until you get a packet that completes the gap, and then reassemble the stream.
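A minimal sketch of that reassembly idea in plain Java, assuming each message carries a monotonically increasing sequence number (the class and its names are illustrative, not an existing API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Buffers out-of-order messages and delivers them in sequence order,
 *  waiting whenever there is a gap until the missing message arrives. */
class ReorderBuffer<T> {
    private final Map<Long, T> pending = new HashMap<>();
    private final Consumer<T> delivery;
    private long nextExpected = 0;

    ReorderBuffer(Consumer<T> delivery) {
        this.delivery = delivery;
    }

    void accept(long sequence, T message) {
        pending.put(sequence, message);
        // Flush every consecutive message starting at the next expected
        // sequence number; stop at the first gap.
        T next;
        while ((next = pending.remove(nextExpected)) != null) {
            delivery.accept(next);
            nextExpected++;
        }
    }
}
```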

I asked for an example precisely because this sounded like a fairly niche use case: plugin messages are not often used for tunneling streams of data!

I'm positive on the concept of implementing a dedicated API just for plugin messages. It could resolve this issue by removing the Velocity event system from the equation. I'm not much of a fan of the PluginMessageEvent and friends anyway.

astei avatar Apr 25 '25 23:04 astei

You could copy this model. If you have a packet that's out of order, you wait until you get a packet that completes the gap, and then reassemble the stream.

I'm probably not going to do all that. Unless Java Edition adopts UDP, the TCP Minecraft protocol will guarantee the plugin messages are delivered in the correct order, so I'd much rather do some Netty hacks than implement something like that.

plugin messages are not often used for tunneling streams of data!

Yep. I'm not tunneling the voice data through the backend, I'm using the backend as a WebRTC signaling server. I hate using plugin messages for all this for a lot of different reasons and am working on moving away from them in the client; I'm just keeping them for communication with the backend so I don't have to open side channels or break the protocol.

I'm positive on the concept of implementing a dedicated API just for plugin messages. It could resolve this issue by removing the Velocity event system from the equation. I'm not much of a fan of the PluginMessageEvent and friends anyway.

I could not agree more

lax1dude avatar Apr 25 '25 23:04 lax1dude

Do you have opinions about providing a handler type alongside classic byte arrays to give more direct access to the pooled ByteBuf for experienced users? The ByteBuf could be wrapped in some interface that allows it to be read/written by plugins, with the contract that attempting to access it after the serialization/deserialization handler has returned is not safe. A minimal implementation could serialize the plugin message by sending a lambda through the Netty channel and then invoking it in the encoder; deserializing could be handled by registering a deserialization lambda that would be invoked in one of the decoders. Allocators and reference counting would not be exposed. This way it's using pooled buffers end-to-end, doesn't copy them, and only allocates and releases those pooled buffers from within the event loop the allocator belongs to. It also wouldn't leak memory unless someone deliberately uses reflection to get the actual internal ByteBuf and manipulate the reference count.
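To make the idea concrete, the wrapper might look something like this (every name is hypothetical; only Netty's `ByteBuf` itself is a real type, and it would stay hidden behind the interface):

```java
// Hypothetical restricted view over the pooled ByteBuf. By contract it
// is only valid inside the handler callback; no allocator or reference
// counting is exposed, and the underlying buffer never escapes the
// event loop that owns it.
public interface PluginMessageView {
    int readableBytes();
    byte readByte();
    void readBytes(byte[] destination);
    void writeByte(int value);
    void writeBytes(byte[] source);
}

// A deserialization lambda, invoked in the decoder on the event loop:
// registrar.registerReceiver(channel, (PluginMessageView view) -> {
//     // read the message here; touching "view" after returning is unsafe
// });
```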

lax1dude avatar Apr 26 '25 01:04 lax1dude
