How to ensure that messages are processed in order
What do you mean? Inside the same session, it's expected that messages will be processed in order because we're using either TCP or WS as the acceptor. Unless you're trying to implement a UDP acceptor, I don't see why this would be a concern for you.
Data received from the same player may be processed in different coroutines, so the processing order can differ from the receiving order.
> Data received from the same player may be processed in different coroutines
This should not be the case, since we process all messages on the same goroutine: https://github.com/topfreegames/pitaya/blob/177cdf3913e6bd7fa5d0f0ac01dca4c7400a3985/service/handler.go#L183
Are you getting a case where this is happening? Or is it just a speculation?
https://github.com/topfreegames/pitaya/blob/177cdf3913e6bd7fa5d0f0ac01dca4c7400a3985/service/handler.go#L119 Messages of the same user may be executed concurrently in different coroutines
Oh, so you're talking about clients as in other servers, right? Because indeed, messages are not ordered between communicating servers.
However, when I mentioned client, I was talking about the device itself, and not another server.
The behavior of the same user should be ordered, so messages from the same user should be processed in the same coroutine to avoid concurrent processing. The common approach is to pick the player's coroutine based on sessionid % threads.
EDIT:
Ok, now I see what you're saying: we have a single coroutine for receiving the messages from each client, but we do have multiple threads for processing them, so it is possible that we have a race condition.
- Because of the mutual-exclusion nature of channels and the properties of TCP, we know that a newer message can never be received here before an older one. TCP ensures ordering at the networking layer, and because the "agent" is a single coroutine per user, we are fine. However, in this method:
`func (h *HandlerService) Dispatch(thread int)`
Even though it should be rare, we might in fact end up processing two messages from the same user concurrently and run into a race condition... Good catch!
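The race described above is easy to reproduce with a toy model (this is a sketch, not pitaya's actual dispatch code): several worker goroutines all read from one shared channel, so two messages from the same session can be picked up concurrently and finish in a different order than they arrived.

```go
package main

import (
	"fmt"
	"sync"
)

// run sends n messages into one shared channel consumed by `workers`
// goroutines, mimicking a multi-threaded Dispatch. Every message is
// handled exactly once, but nothing forces the completion order to
// match the send order.
func run(workers, n int) []int {
	msgs := make(chan int)
	var (
		mu        sync.Mutex
		processed []int
		wg        sync.WaitGroup
	)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for m := range msgs {
				// With several consumers, m may be appended out of
				// the order it was sent in.
				mu.Lock()
				processed = append(processed, m)
				mu.Unlock()
			}
		}()
	}
	for i := 0; i < n; i++ {
		msgs <- i
	}
	close(msgs)
	wg.Wait()
	return processed
}

func main() {
	out := run(4, 1000)
	fmt.Println(len(out)) // all messages processed, order not guaranteed
}
```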
So you have 2 options then:
- Set `pitaya.concurrency.handler.dispatch` to 1
- Implement logic to stick player sessions to thread IDs; I think we must implement this one
Best
Indeed, what I said before was wrong. The dispatch code also processes messages from the client. Really good find. IMO, having only one coroutine for processing player messages is enough (not one goroutine for all players, though). The low message count means a single coroutine would not be a performance issue.
update: I'm assuming the use case is that the clients are not sending too many messages, but this could be wrong depending on the use case.
Another potential option would be to have a concept of "ordered packets" from the client. We would then have two channels: one for multi-coroutine processing, and another only for messages that need to be processed in order. This is a larger change to the codebase, however, and I'm not sure this use case is actually needed.
There is a solution: each coroutine has an independent processing channel, and each user's messages are posted only to the channel of the coroutine at sessionid % threads. In theory all users are distributed evenly across coroutines, but it cannot be ruled out that one coroutine blocks or exits abnormally, which would block every user assigned to it, so more safeguards are needed. Do you have a better way to deal with it? Or could this approach have other problems?
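The per-coroutine channel idea can be sketched like this (names are invented for illustration, not pitaya's actual API): each worker owns a private queue, and a message is always routed to queue sessionID % workers, so messages from one session are consumed by exactly one goroutine, in the order they were enqueued.

```go
package main

import (
	"fmt"
	"sync"
)

type msg struct{ sessionID, seq int }

// dispatch routes every message to the queue owned by its session
// (sessionID % workers), so per-session ordering is preserved even
// though different sessions are processed concurrently.
func dispatch(sessions, msgsPerSession, workers int) map[int][]int {
	queues := make([]chan msg, workers)
	got := make(map[int][]int)
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	for i := range queues {
		queues[i] = make(chan msg, sessions*msgsPerSession)
		wg.Add(1)
		go func(q chan msg) {
			defer wg.Done()
			for m := range q {
				mu.Lock()
				got[m.sessionID] = append(got[m.sessionID], m.seq)
				mu.Unlock()
			}
		}(queues[i])
	}
	for s := 0; s < sessions; s++ {
		for seq := 0; seq < msgsPerSession; seq++ {
			queues[s%workers] <- msg{s, seq} // sticky routing per session
		}
	}
	for _, q := range queues {
		close(q)
	}
	wg.Wait()
	return got
}

func main() {
	got := dispatch(10, 5, 4)
	fmt.Println(got[7]) // one session's messages arrive in send order
}
```

The drawback is exactly the one discussed: a slow or blocked worker stalls every session hashed to it.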
This seems to be a good solution overall. I don't see many issues here apart from the potential client-starving issue you mentioned. I assume the consuming threads already have a recover-from-panics mechanism for handling these issues, but I'm not sure.
My only concern with this solution is that the distribution of clients to channels could be an issue if clients have different connection times and message volumes. For example, assuming all clients send an equal number of messages and stay connected for the same amount of time, we would not have an issue with "hot" coroutines. However, clients may differ in message frequency and connection time, which could create hot coroutines and therefore slow down consumption for some clients. That said, the variance in connections should be more or less random, so we would likely never concentrate all of the long-lived clients onto a single coroutine 🤔
Yes, this is what we are worried about: after a period of time some coroutines may be heavily loaded while others sit idle. But it is also possible that, probabilistically, the distribution will not be too concentrated because of the even assignment. For now we don't have a better solution, but we can evaluate its pros and cons based on test data in the future. Of course, it would be better if you have a better plan. Thanks for the answer!
@edisonwsk I have implemented this solution, but I gave up in the end because it is imperfect. It only works in limited scenarios: when all RPCs use sticky routing, or when the target server is standalone.
e.g. when a player client sends requests msg1, msg2, ... to a backend server (serviceTypeA, suppose there are 3 nodes), all messages arrive at node1 (up to this point the order is correct), but the handler on serviceTypeA needs to call an RPC on serviceTypeB (and the call chain may be longer) before serviceTypeA finally responds to the client. In that case it cannot guarantee the processing order of the requests.
Thinking about it in a simpler way, did you consider just guaranteeing the ordering on client-side? For example, you can wait for the response on the client (libpitaya) before sending the next request.
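The client-side approach can be sketched as follows (the "server" here is a stand-in goroutine for illustration; a real client such as libpitaya would block on the network response instead): the client waits for each response before sending the next request, so the server can never observe B before A.

```go
package main

import "fmt"

// sendInOrder sends each request only after the previous response has
// arrived, serializing requests from the client side.
func sendInOrder(reqs []string) []string {
	requests := make(chan string)
	responses := make(chan string)
	var served []string

	go func() { // fake server: records requests in arrival order
		for req := range requests {
			served = append(served, req)
			responses <- "ack:" + req
		}
	}()

	for _, req := range reqs {
		requests <- req
		<-responses // do not send the next request until this one is answered
	}
	close(requests)
	return served
}

func main() {
	fmt.Println(sendInOrder([]string{"A", "B", "C"}))
}
```

The cost, as noted in the reply below in the thread, is that this behaves like blocking I/O: throughput is limited to one in-flight request per client.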
It is similar to the BIO (blocking I/O) model; in some scenarios it's not enough.
In fact, not all client messages require sequencing. In my projects, the most common case is combat messages, but their target is unique, so this is not a troublesome problem for me.
@chgz Sorry, I did not understand what you said. I think that if a player's messages are pinned to one coroutine for execution, then, given that RPCs block and wait for their return, the messages must execute in order.
https://github.com/topfreegames/pitaya/blob/177cdf3913e6bd7fa5d0f0ac01dca4c7400a3985/service/handler.go#L132 In some cases, player actions need to be ordered. For example, action A is a precondition for action B, and the player quickly presses A and then B. On a bad network, the two packets for A and B may be merged, so the server may push both messages into the queue back to back. If they are processed in two coroutines, B may start executing before A has run at all, which does not match the action the player expects.
Sent by the same dispatcher, they are in order, but the receiver is not necessarily the same. So, in the case of multiple nodes and long call chains, we can't expect strong ordering.
We don't need to care about the order of subsequent actions; we only need the actions of a single player to be in order. If every node processes them in order, concurrent execution is impossible.
My project has been working with pitaya for a while. I think pitaya is a better fit for hall-room games than for MMO games. MMO game state synchronization is a big problem, and I think it needs to process a single player's messages in order. So I am studying another message processing mechanism, like protoactor-go, which uses the actor model.
PS. I love the route/metrics/cluster/span implementations in pitaya.
I agree with your opinion. @tutumagi
But Go's concurrency model relies on CSP; if you want to use the actor model, the complexity will increase a lot. I tend toward replacing the random routing algorithm with a consistent hash ring.
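A minimal consistent-hash ring could look like this (a sketch, not pitaya's router API): each node gets several virtual points on a ring, a session ID is hashed onto the ring, and the request is routed to the next point clockwise. The same session always lands on the same node, and adding or removing a node only remaps a fraction of sessions.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

type ring struct {
	points []uint32          // sorted virtual points on the ring
	owner  map[uint32]string // point -> node name
}

func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// newRing places `replicas` virtual points per node on the ring to
// smooth out the load distribution.
func newRing(nodes []string, replicas int) *ring {
	r := &ring{owner: make(map[uint32]string)}
	for _, n := range nodes {
		for i := 0; i < replicas; i++ {
			p := hashKey(fmt.Sprintf("%s#%d", n, i))
			r.points = append(r.points, p)
			r.owner[p] = n
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// route finds the first virtual point at or after the session's hash,
// wrapping around the ring if necessary.
func (r *ring) route(sessionID string) string {
	h := hashKey(sessionID)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0 // wrap around
	}
	return r.owner[r.points[i]]
}

func main() {
	r := newRing([]string{"node1", "node2", "node3"}, 50)
	// The same session always routes to the same node.
	fmt.Println(r.route("session-42") == r.route("session-42"))
}
```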
To hash the messages, pitaya needs to know which "actor" a message is bound to, not only for client messages but also for remote messages.