Mirror icon indicating copy to clipboard operation
Mirror copied to clipboard

Starting server/host after cancelling client connection attempt results in errors.

Open ArjanSioux opened this issue 4 years ago • 6 comments

Describe the bug Starting a server or host after cancelling a client connection attempt results in errors.

[IMPORTANT] How can we reproduce the issue, step by step: Either open the Tanks example, or set up a minimal project that includes NetworkManagerHUD.

Reproducing the error for host:

  1. Start the game and press the "Client" button.
  2. Before the transport-level warning about connection timeout shows up (about 10 seconds), do the following: 2.a. Press the "Cancel Connection Attempt" button. 2.b. Press the "Host (Client + Server)" button.
  3. Observe the error: "NetworkClient.AddPlayer: a PlayerController was already added. Did you call AddPlayer twice?"

Reproducing the error for server:

  1. Start the game and press the "Client" button.
  2. Before the transport-level warning about connection timeout shows up (about 10 seconds), do the following: 2.a. Press the "Cancel Connection Attempt" button. 2.b. Press the "Server Only" button.
  3. Observe the error: "Skipped Connect message handling because connection is null."

Expected behavior I'd expect the "Cancel Connection Attempt" button, which calls NetworkManager.StopClient(), to cancel the transport-level connection attempt (see Additional context!). As a result, I'd expect to see no errors and to have a started server or host.

Screenshots https://s1.gifyu.com/images/ClientCancelledStartServerError.gif

Desktop (please complete the following information):

  • OS: Windows
  • Build target: standalone
  • Unity version: 2020.3.10f1
  • Mirror branch: 53.0.0
  • Transport: KCP (Telepathy results in null dereference errors in these cases.)

Additional context

  • If you wait (about 10 seconds after pressing "Cancel Connection Attempt", before pressing "Host (Server + Client)"/"Server Only") for the transport-level warning about connection timeout to appear, everything works as expected.
  • Transport-level warning message: "KCP: Connection timed out after not receiving any message for 10000ms. Disconnecting."

ArjanSioux avatar Oct 25 '21 12:10 ArjanSioux

true story

MrGadget1024 avatar Nov 20 '21 14:11 MrGadget1024

true story

Current findings: When you cancel connection attempt, StopClient is called, which flows down to KcpClient.Disconnect(). However, the client never connected, so the connected bool is false, which means connection?.Disconnection is never called. The client is also observed to continue to try to connect despite cancellation -> if you cancel quickly and wait a while, the kcp time out message shows up anyway. So I am deducing when you do the above and start host, the earlier client connection attempt connects to this Host session (this time successfully) -> I can see it clearly, if I do the above steps and debug in NetworkManager OnServerConnect(), the function is called twice, thus triggering the "Did you add 2nd player" error.

ninjakickja avatar Dec 13 '21 02:12 ninjakickja

Which layer is this an issue in? The transport wrapper to KCP or the KCP logic itself?

SoftwareGuy avatar Dec 13 '21 03:12 SoftwareGuy

Which layer is this an issue in? The transport wrapper to KCP or the KCP logic itself?

I think the wrapper. KCPClient.cs

ninjakickja avatar Dec 14 '21 01:12 ninjakickja

in KCPClient.cs: adding else // if not connected { connection = null; } seems to solve the issue, but causes a new NRE error because connection became null. This is due to ClientDisconnect actually being called twice. Once by NetworkClient.Disconnect, and 2nd by NetworkClient.Shutdown. Any idea why it should be called twice, or why these 2 have to call disconnect separately?

ninjakickja avatar Dec 15 '21 14:12 ninjakickja

in KCPClient.cs: adding else // if not connected { connection = null; } seems to solve the issue, but causes a new NRE error because connection became null. This is due to ClientDisconnect actually being called twice. Once by NetworkClient.Disconnect, and 2nd by NetworkClient.Shutdown. Any idea why it should be called twice, or why these 2 have to call disconnect separately?

If I recall correctly, that double NetworkClient.Disconnect call is a bug that keeps rearing its head. I remember there was patches and refactoring to prevent this, but it seems it might be a (design?) issue that needs to be addressed again.

IMO ideally when NetworkClient.Shutdown is called and the client is active, then NetworkClient.Disconnect should be called and then continue with the Shutdown - don't try to disconnect twice. The easiest way of doing that would be to set a flag/boolean internally that gets checked in Shutdown - if the connection is already disconnected, there is no need to call NetworkClient.Disconnect. Although does Disconnect do anything for the host client? I can't remember...

Psuedocode:

Shutdown request
1) Is network client connected? 
If yes - call network client disconnect routine, wait until disconnected. Continue with shutdown.
If no - continue with shutdown.
2) Shutdown process
Cleanup, reset, free resources, blah blah.
3) Go home. We're done here.

SoftwareGuy avatar Dec 16 '21 07:12 SoftwareGuy

Closing as fixed (can no longer reproduce).

MrGadget1024 avatar Nov 24 '22 12:11 MrGadget1024

I can still reproduce this on Tanks and in my own project. My current workaround is to just wait for the timeout before letting the player try to host. Is there another solution that works better?

bjornthefire avatar Jul 22 '23 03:07 bjornthefire