rmnp icon indicating copy to clipboard operation
rmnp copied to clipboard

Switch to xxh

Open Flameborn opened this issue 4 years ago • 4 comments

Hello,

I'm not sure if you're interested in this, but I recently investigated some ways of speeding up RMNP.

One way of doing this is via switching to a more modern, and faster hashing library from CRC32, which is currently XXH. It's incredibly fast for both small and large data, and it indeed makes a huge difference with a large amount of connections/packets, such as in the case of a MMORPG.

The current pure go implementation I use almost matches the reference C implementation in speed.

There are a few drawbacks:

  • Due to the current XXH32 implementation's less optimized state, I opted for XXH64. This resulted in doubling the hash field in the packet headers, from uint32 to uint64, so the max packet header size now is 19 instead of 15.
  • This change will make the C# port incompatible, however, I believe that there are C# implementations of XXH as well.

I've also added a go.mod file, as well as changed the package imports for the example server and client to the current one.

All tests pass, and the examples work as expected.

I'm also looking into switching away from Golang's net package, as the performance of the current UDP implementation could be better.

Flameborn avatar Oct 23 '21 18:10 Flameborn

Hi, thanks for taking the time and suggesting changes! (The mod file was longer overdue :sweat_smile:)

I have never heard of XXH before but it seems like a very good solution. There seem to be two major libraries for go that implement the algorithm: https://github.com/cespare/xxhash & https://github.com/OneOfOne/xxhash Do you have a particular reason for choosing the latter over the first one?

As you mentioned already, this change would definitely break compatibility with the C# version of RMNP. Right now, I do not have the time to update it, so I would suggest leaving this PR open until I have some spare time implementing this change over there as well.

The additional 4 bytes in the header should not pose a problem in my opinion. Especially, if, in theory, we can switch to the 32 bit version later on.

I never benchmarked the UDP performance of the std lib but it definitely is not the nicest API to work with, so I am open for changes requests/drafts there as well if you should get around finding something!

tim-oster avatar Oct 27 '21 04:10 tim-oster

Thanks, feel free to leave the PR open and merge when you have some time.

Regarding the various XXH implementations, I wanted to use a memory-efficient pure Go implementation. This one performs better than CGo versions, and while it’s slightly slower than the optimized assembly version you suggested, it supports seeds (which might be useful on a per user/socket basis). Currently, seeds are not used.

The other implementation is also slower when running without the optimizations via pure Go, which could be an issue on embedded devices/board computers, such as the Raspberry Pi on ARM.

Currently, my biggest issue with the standard library is packet fragmentation, any buffer sent above the specified MTU is discarded, rather than sent as fragments, so I’m looking into possibly extending the RMNP header to add a fragment bit, but this will definitely add more overhead, especially when packets are reliable and ordered.

It currently seems that the upper packet limit is around 200000 packets per socket, I’d like to double this at least, as handling 2000 users could hit this limit for an MMORPG server, I'll see if I can find (or come up with) a solution when I have some more time.

On 2021. Oct 27., at 6:22, Tim Oster @.***> wrote:

Hi, thanks for taking the time and suggesting changes! (The mod file was longer overdue 😅)

I have never heard of XXH before but it seems like a very good solution. There seem to be two major libraries for go that implement the algorithm: https://github.com/cespare/xxhash https://github.com/cespare/xxhash & https://github.com/OneOfOne/xxhash https://github.com/OneOfOne/xxhash Do you have a particular reason for choosing the latter over the first one?

As you mentioned already, this change would definitely break compatibility with the C# version of RMNP. Right now, I do not have the time to update it, so I would suggest leaving this PR open until I have some spare time implementing this change over there as well.

The additional 4 bytes in the header should not pose a problem in my opinion. Especially, if, in theory, we can switch to the 32 bit version later on.

I never benchmarked the UDP performance of the std lib but it definitely is not the nicest API to work with, so I am open for changes requests/drafts there as well if you should get around finding something!

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/tim-oster/rmnp/pull/4#issuecomment-952526048, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHLD4E3CUWI6SCMXVHQ26DUI6ECXANCNFSM5GSRRRCA. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

Flameborn avatar Oct 27 '21 14:10 Flameborn

I totally agree with using the non cgo version as it could introduce issues otherwise, depending on where folks want to run their servers. If this should ever cause actual performance implications, an abstraction layer for hashing could be added that supports switching out the hashing implementation.

Regarding the MTU: when initially writing RMNP, fragments were intentionally dropped because they were not required for the project at that time. Adding this however, is probably not that much work. If I am not mistaken, there are some bits unused in the descriptor flag of a packet, so the additional overhead would only impact the actual fragmented packets. The send function could take over splitting large buffers into multiple packages. I am uncertain if fragmentation should only be supported for reliable ordered packages because otherwise it could be rather difficult to make sure that all fragments have arrived before processing.

Regarding your last point: 200k packets per socket seem like a lot of packets. In which time frame does this limit exist? The question is if that problem is resolved by changing the std lib or if network hardware also enforces similar limits which could bottleneck the application. For such high-load scenarios it could be worth considering implementing a batching system into RMNP that buffers packages and sends them as one package. Of course this introduces a lot of new problems but might be a smart solution to reduce network latency as well.

tim-oster avatar Oct 29 '21 08:10 tim-oster

I agree, adding an abstraction layer could be done quite easily, if desired. Though I don’t think this should cause issues in most cases.

Regarding fragmentation, splitting the buffer up would definitely work. Unless something’s wrong with my logic, only packet reliability matters, since when one fragment arrives, we can check if the others are there too and if so, join them, otherwise store the packet for later. Of course, we’d need to define a fragment limit, and possibly a timeout as well, so that if the rest of the split packet does not arrive, we can delete, or possibly ask for a resend. I personally like this approach a lot better than what the net library does at the moment, because you have control over the amount of received data.

Regarding my connection tests, my timeframe is one second, and about 2000 users on a game server, handling about 10 packets from each. There was a test done earlier by a Stack Overflow user as well (https://stackoverflow.com/questions/64056674/bad-udp-performance-in-golang-vs-python https://stackoverflow.com/questions/64056674/bad-udp-performance-in-golang-vs-python), although using only one main Go routine to broadcast. The reason I mentioned performance is because I was wondering if I could get some more performance out of the standard library if I switched to something else, unfortunately I did not have enough time to examine other approaches just yet.

On 2021. Oct 29., at 10:15, Tim Oster @.***> wrote:

I totally agree with using the non cgo version as it could introduce issues otherwise, depending on where folks want to run their servers. If this should ever cause actual performance implications, an abstraction layer for hashing could be added that supports switching out the hashing implementation.

Regarding the MTU: when initially writing RMNP, fragments were intentionally dropped because they were not required for the project at that time. Adding this however, is probably not that much work. If I am not mistaken, there are some bits unused in the descriptor flag of a packet, so the additional overhead would only impact the actual fragmented packets. The send function could take over splitting large buffers into multiple packages. I am uncertain if fragmentation should only be supported for reliable ordered packages because otherwise it could be rather difficult to make sure that all fragments have arrived before processing.

Regarding your last point: 200k packets per socket seem like a lot of packets. In which time frame does this limit exist? The question is if that problem is resolved by changing the std lib or if network hardware also enforces similar limits which could bottleneck the application. For such high-load scenarios it could be worth considering implementing a batching system into RMNP that buffers packages and sends them as one package. Of course this introduces a lot of new problems but might be a smart solution to reduce network latency as well.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/tim-oster/rmnp/pull/4#issuecomment-954543478, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHLD4ADRKW5H6RCJ6DZY6TUJJQ3ZANCNFSM5GSRRRCA. Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

Flameborn avatar Nov 08 '21 19:11 Flameborn