com.unity.netcode.gameobjects

Jitter causes immediate closed connection

Open daalen opened this issue 3 years ago • 5 comments

When setting a small jitter in a Unity Transport component (say 3 ms), a connecting client is disconnected almost immediately with the following error: "Couldn't add payload of size [some small number] to reliable send queue. Closing connection 1 as reliability guarantees can't be maintained. Perhaps 'Max Send Queue Size' (98304) is too small for workload." The small number is typically of the order of 100. This did not happen prior to 1.0.0-pre6.

daalen avatar May 04 '22 20:05 daalen

I'm noticing a similar thing with any packet loss simulation: my client just disconnects immediately. And with any jitter, my unreliable RPCs just never arrive most of the time.

SlimeQ avatar Jul 06 '23 23:07 SlimeQ

Using the built-in jitter and packet loss tools it is very easy to recreate this. I independently ran into this doing network condition testing... introducing small jitter or sufficient packet loss would lead to my clients disconnecting almost immediately.

This doesn't seem production-ready. There needs to be more leeway for a client to recover from a brief (albeit severe) drop in network quality.

zachstronaut avatar Sep 13 '23 19:09 zachstronaut

@zachstronaut Would you mind sharing the UnityTransport settings (with the debug simulator settings expanded) you used to get clients to drop almost immediately? I used 200ms jitter and a 20% packet drop rate (well beyond what you would see in a real-world scenario) and had no issues with clients disconnecting (I even let several clients run for almost 15 minutes with those settings).

I then adjusted this to 50% packet loss at 200ms jitter and let that run for a while. Of course it was very laggy and would pause for a few seconds, but 50% packet loss is a very unrealistic setting.

Some things to remember when testing poor network conditions:

  • Typically it isn't all instances that have continually poor network latency conditions (i.e. jitter) coupled with a very high packet loss %.
    • A more realistic scenario is one or two instances.
      • One scenario worth testing is where the host or server has the poor conditions with clients that have "mild-medium" conditions (i.e. latency of 120ms with a jitter of 30ms will yield a latency that ranges from 90ms to 150ms over time).
      • Another scenario worth testing is where the host and most clients have "mild-medium" conditions and one or two clients have poor conditions.
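To make the "120ms with a jitter of 30ms" example above concrete, here is a minimal Python sketch of the range it describes. It assumes the jitter is drawn uniformly around the base latency, which is only a guess for illustration; the thread does not say how the debug simulator actually distributes it. The function name and parameters are hypothetical.

```python
import random

def simulated_latency_ms(base_ms, jitter_ms, rng):
    """Return one latency sample in [base_ms - jitter_ms, base_ms + jitter_ms].

    Assumption: jitter is uniform around the base latency. The real
    simulator may use a different distribution.
    """
    return base_ms + rng.uniform(-jitter_ms, jitter_ms)

rng = random.Random(42)  # fixed seed so repeated runs give the same samples
samples = [simulated_latency_ms(120, 30, rng) for _ in range(10_000)]

# Every sample falls between 90 ms and 150 ms, matching the range quoted above.
print(f"min={min(samples):.1f} ms, max={max(samples):.1f} ms")
```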

Remember that the debug simulator packet loss is bi-directional (inbound and outbound). If you make a single development build that has packet loss with jitter and then use it for all instances, you are effectively doubling the values: a host and client that both have 10% packet loss (inbound and outbound) can yield roughly 20% packet loss, since both sides are dropping packets outbound and inbound.
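The doubling can be checked with a little arithmetic. The sketch below assumes each drop is independent and that every instance applies its configured rate to both its outbound and its inbound packets, as described above; the function names are made up for illustration and are not part of any Unity API.

```python
def one_way_loss(sender_drop, receiver_drop):
    """Chance a packet is lost when the sender drops it outbound with
    probability sender_drop and the receiver drops it inbound with
    probability receiver_drop (independent drops assumed)."""
    return 1 - (1 - sender_drop) * (1 - receiver_drop)

def round_trip_loss(a_drop, b_drop):
    """Chance that a request/response exchange fails in either direction."""
    one_way = one_way_loss(a_drop, b_drop)
    return 1 - (1 - one_way) ** 2

# Host and client both configured with 10% packet loss:
print(f"{one_way_loss(0.10, 0.10):.1%}")     # 19.0%
print(f"{round_trip_loss(0.10, 0.10):.1%}")  # 34.4%
```

So a build configured for 10% behaves like roughly 19% loss in each direction when both ends use it, and anything that needs a reply (such as a reliable message and its acknowledgement) fails about a third of the time under these assumptions.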

Typically, something like a 7% packet loss would be the equivalent of a 2G or 2.5G cellular connection. The majority of the time, if someone is consistently experiencing more than 2% packet loss, then there is very likely some form of hardware-related issue:

  • A cable can be "nicked" (slightly cut so the protective shielding is removed/damaged) somewhere between the source and destination ISP POP.
  • Cabling in the home, or the wall-to-modem cable, is bad.
  • The modem itself could be failing (i.e. power surges can sometimes cause this).
  • I have heard of scenarios in apartments where several competing Wi-Fi devices in close proximity can cause issues.
  • Cellular devices at the very edge of a cellular tower for an extended period of time can lead to high latency and higher than normal packet loss.
    • Scenarios where a cellular device switches cell towers (i.e. driving on the highway or the like) can cause temporarily higher than normal latency and can potentially lead to a slight spike in packet loss (i.e. from 0.75%-1.25% up to 2.5%-4%), but that typically lasts only a brief period of time.
  • Then there is network congestion, where several connections share the same internet connection and their sum exceeds the maximum inbound or outbound bandwidth of the router/modem and/or the ISP-provided connection (i.e. getting bandwidth throttled).
    • The packet loss and latency incurred varies depending on many factors here, but it shouldn't exceed 10-20% and definitely not exceed 50%.

Unless you are making a product specifically for rural areas or areas where there is a lack of "high speed" internet, you are pretty safe to test between a 2.5% and 5% continual packet loss range...and anything beyond 10% is very abnormal (at least if you want to be playing any kind of online game that isn't slow paced like a turn based game or the like).
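As a rough way to sanity-check simulator settings against the ranges above, the thresholds can be written down as a small helper. This is only a sketch: the cut-offs (2.5%, 5%, 10%) come straight from the paragraph above, and the function name and labels are invented for illustration.

```python
def describe_packet_loss(percent):
    """Label a continual packet-loss percentage using the rough
    ranges suggested in this thread (not an official guideline)."""
    if percent > 10:
        return "very abnormal"
    if percent > 5:
        return "above the suggested test range"
    if percent >= 2.5:
        return "within the suggested test range"
    return "below the suggested test range"

for value in (1, 3, 7, 20):
    print(f"{value}% -> {describe_packet_loss(value)}")
```

Keep in mind that the value to check is the effective loss between two instances, not just the number typed into one instance's simulator settings.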

In either case, if you could provide those settings (or a screenshot) it would be greatly appreciated, as I do want to try to replicate the dropped client scenario in the event there is something we could adjust to make NGO more resilient. Of course, if you were testing anything above 50% packet loss, then it very likely could be the UnityTransport Heart Beat Timeout that is causing the clients to disconnect. If you want to account for such a high percentage of packet loss, you would just need to increase that property until the clients stop disconnecting, but again, 50% or higher packet loss is a very unlikely scenario.

NoelStephensUnity avatar Oct 03 '23 17:10 NoelStephensUnity

I should add that when I had this problem I was on 1.5.0. I haven't tried on the later versions.

It was happening even at 1% packet loss, though, which was the lowest I could go (because the setting is an int).

SlimeQ avatar Oct 03 '23 18:10 SlimeQ

@NoelStephensUnity Hey Noel! I did realize that packet delay would be doubled, but I don't think I made the kind of obvious mental connection that jitter and drop rate were also doubled for the same reason. I'm guessing I had some pretty outrageous values for these because of that! Also, fwiw, we're only just going to 2022 LTS now, so we had been on Unity Transport 1.4.0.

If I'm able to reproduce this dropping problem again, I'll get back to you here. If I don't get back to you, let's assume I used crazy test values because I forgot about the doubling.

zachstronaut avatar Oct 03 '23 18:10 zachstronaut