
u16 serialized to 3 bytes <or lose trade of> enum serialized to 4 bytes (allow #[repr(u8)])

Open cybersoulK opened this issue 2 years ago • 20 comments

cybersoulK avatar Aug 07 '22 01:08 cybersoulK

i am reading here: https://github.com/bincode-org/bincode/blob/trunk/src/varint/encode_unsigned.rs

it says that it should be 3 bytes? but that's a huge downside, because most of the time, the value will be above 255, more exactly 99.6% of the time, rather than below 255, and that means about 50% more data on average.

cybersoulK avatar Aug 07 '22 01:08 cybersoulK
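For reference, the size rule being discussed can be sketched in a few lines. This is a std-only approximation of the unsigned varint scheme as the bincode 2 documentation describes it (a single byte below 251, otherwise a one-byte marker followed by a fixed-width integer); the function name `bincode_varint_len` is made up for illustration and is not part of bincode. Note that under this scheme the one-byte cutoff is 251 rather than 255.

```rust
/// Number of bytes the documented bincode 2 varint scheme would use for an
/// unsigned value (hypothetical helper, not a bincode API).
pub fn bincode_varint_len(value: u64) -> usize {
    if value < 251 {
        1 // the value itself fits in a single byte
    } else if value <= u16::MAX as u64 {
        3 // marker byte + u16
    } else if value <= u32::MAX as u64 {
        5 // marker byte + u32
    } else {
        9 // marker byte + u64
    }
}
```

Under these assumptions a value such as 270 costs three bytes, which is the overhead the comment above is objecting to.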

for example. in my engine implementation, f32 is squeezed into a u16 which is the perfect size, so the data that is handled over the network is 50% more.

and entity ids are u32, but more likely to be in the 255 to 3000 range, and serialized into 3 bytes, instead of the ideal 2

no matter what, 255 to 65000 must be 2 bytes, this is the most important value in networking and games to have right

cybersoulK avatar Aug 07 '22 02:08 cybersoulK

We can use a better method "if the MSB is set, there's more data"

U16

0 to 128 - 1 byte,     
128 to 32768 - 2 bytes,    
32768 to u16::Max - 3 bytes,  

 Instead of the inefficient  

0 to 255 - 1 byte,    
255 to U16::Max - 3 bytes,  

U32

0 to 128 - 1 byte,     
128 to 32768 - 2 bytes,    
32768 to 2147483648 - 4 bytes
2147483648 to U32::Max - 5 bytes

 Instead of the inefficient  

0 to 255 - 1 byte,    
255 to 65536 - 3 bytes,  
 65536 to U32::Max - 5 bytes

because it's only benefiting the ranges of 128 to 255, and 32768 to 65536, which are much less probable

cybersoulK avatar Aug 07 '22 14:08 cybersoulK
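The "if the MSB is set, there's more data" idea is close to the LEB128-style varint used by several other formats. A minimal std-only sketch follows; the function names are hypothetical. One caveat: with a continuation bit on every byte, each byte carries 7 payload bits, so two bytes cover values up to 16383 rather than 32767, and the ranges in the table above would need a different layout to be reached exactly.

```rust
/// Encode `value` with 7 payload bits per byte, least significant group first.
/// The MSB is set on every byte except the last one.
pub fn encode_leb128(mut value: u32) -> Vec<u8> {
    let mut out = Vec::with_capacity(5);
    loop {
        let byte = (value & 0x7F) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte); // last byte: MSB clear
            return out;
        }
        out.push(byte | 0x80); // more bytes follow: MSB set
    }
}

/// Decode one value from the front of `bytes`.
/// Returns the value and the number of bytes consumed, or `None` if the
/// input is truncated or does not fit in a u32.
pub fn decode_leb128(bytes: &[u8]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    for (i, &byte) in bytes.iter().enumerate().take(5) {
        let payload = (byte & 0x7F) as u32;
        // The fifth byte may only contribute the top 4 bits of a u32.
        if i == 4 && payload > 0x0F {
            return None;
        }
        value |= payload << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

With this layout a u16 takes 1 byte up to 127, 2 bytes up to 16383, and 3 bytes above that.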

If you don't want to use the varint encoding, you can use either config::legacy() (compatible with bincode 1), or config::standard().with_fixed_int_encoding().

If you want to control this on a case-by-case basis, I can recommend making your own struct U16(u16); type, that always encodes/decodes as a fixed int by calling self.0.to_le_bytes() directly.

VictorKoenders avatar Aug 07 '22 15:08 VictorKoenders
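The byte-level part of that wrapper suggestion might look like the sketch below. It uses only the standard library; the method names `to_wire` and `from_wire` are invented for illustration, and hooking them into bincode's `Encode` and `Decode` traits is left out because the exact trait signatures depend on the bincode version.

```rust
/// Wrapper that always serializes as exactly two little-endian bytes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct U16(pub u16);

impl U16 {
    /// Fixed-width encoding: no marker byte, always 2 bytes.
    pub fn to_wire(self) -> [u8; 2] {
        self.0.to_le_bytes()
    }

    /// Inverse of `to_wire`.
    pub fn from_wire(bytes: [u8; 2]) -> Self {
        U16(u16::from_le_bytes(bytes))
    }
}
```

A manual `Encode` implementation would write these two bytes to the encoder instead of delegating to the derived integer encoding.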

@VictorKoenders but what do you think of
0 to 128 - 1 byte,
128 to 32768 - 2 bytes,
32768 to u16::Max - 3 bytes,

i am not an expert, but it seems possible

because i would prefer cutting precision in half, to save bytes

cybersoulK avatar Aug 07 '22 15:08 cybersoulK

It sounds like a niche optimization that we probably won't support in bincode. We can't make breaking changes in the data format by changing how u16 gets (de)serialized, and I don't think this case warrants introducing a whole new integer encoding config.

VictorKoenders avatar Aug 07 '22 15:08 VictorKoenders

also in the case of ids, where they are generalized by a u32, but could be in the range u8, u16 and u32, to try and stay as small as possible, it will give an extra byte, even if it's u8::Max + 1

cybersoulK avatar Aug 07 '22 15:08 cybersoulK

@VictorKoenders it's not niche lol. this is fundamental in game networking

cybersoulK avatar Aug 07 '22 15:08 cybersoulK

We can leave this issue open as a tracking issue to see if other developers are interested for a new integer encoding

VictorKoenders avatar Aug 07 '22 15:08 VictorKoenders

@VictorKoenders what do you recommend for an entity id where you want to have

0 to u8::Max / 2 - 1 byte,
u8::Max / 2 to u16::Max / 2 - 2 bytes,
u16::Max / 2 to u32::Max / 2 - 4 bytes,
u32::Max / 2 to u32::Max - 5 bytes

because the only option i have is to wrap it into an enum: enum ID { u8(u8), u16(u16), u32(u32) }

but then that adds 1 byte of enum anyways... i don't see any options here for me. but it seems more of a fundamental algorithm change rather than just a niche thing, because i have thought of 2 ways where it would improve my own software, and none where it would hurt someone else's

cybersoulK avatar Aug 07 '22 15:08 cybersoulK

You can implement a custom Encode and Decode and do all the optimizations your use case needs

VictorKoenders avatar Aug 07 '22 15:08 VictorKoenders
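As an illustration of the kind of custom encoding meant here, the sketch below packs a u32 entity id into 1, 2, 4, or 5 bytes by storing a length prefix in the top bits of the first byte. It is a hypothetical layout, not something bincode provides, and its thresholds (2^7, 2^14, 2^29) differ slightly from the ones asked for above because the prefix bits take space. The functions `encode_id` and `decode_id` are made-up names; in a real project their bodies would sit inside manual `Encode` and `Decode` implementations.

```rust
/// Prefix layout of the first byte (big-endian payload):
///   0xxxxxxx                 -> 1 byte,  7-bit value
///   10xxxxxx + 1 byte        -> 2 bytes, 14-bit value
///   110xxxxx + 3 bytes       -> 4 bytes, 29-bit value
///   11100000 + 4 bytes       -> 5 bytes, full u32
pub fn encode_id(id: u32) -> Vec<u8> {
    if id < (1 << 7) {
        vec![id as u8]
    } else if id < (1 << 14) {
        vec![0x80 | (id >> 8) as u8, id as u8]
    } else if id < (1 << 29) {
        vec![0xC0 | (id >> 24) as u8, (id >> 16) as u8, (id >> 8) as u8, id as u8]
    } else {
        let b = id.to_be_bytes();
        vec![0xE0, b[0], b[1], b[2], b[3]]
    }
}

/// Decode one id from the front of `bytes`.
/// Returns the id and the number of bytes consumed, or `None` if truncated.
pub fn decode_id(bytes: &[u8]) -> Option<(u32, usize)> {
    let first = *bytes.first()?;
    if first & 0x80 == 0 {
        Some((first as u32, 1))
    } else if first & 0xC0 == 0x80 {
        let b = bytes.get(..2)?;
        Some(((((b[0] & 0x3F) as u32) << 8) | b[1] as u32, 2))
    } else if first & 0xE0 == 0xC0 {
        let b = bytes.get(..4)?;
        Some((u32::from_be_bytes([b[0] & 0x1F, b[1], b[2], b[3]]), 4))
    } else {
        let b = bytes.get(..5)?;
        Some((u32::from_be_bytes([b[1], b[2], b[3], b[4]]), 5))
    }
}
```

With this layout, ids in the 255 to 3000 range mentioned earlier in the thread fit in two bytes.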

i will try.

but this case still gives 3 bytes

@VictorKoenders

use bincode::{Decode, Encode};

#[derive(Encode, Decode, Debug, Clone)]
struct U16(u16);

fn main() {
    let test = U16(270);
    // the derived impl still goes through the varint integer encoding
    let m = bincode::encode_to_vec(&test, bincode::config::standard()).unwrap();
}

cybersoulK avatar Aug 07 '22 15:08 cybersoulK

I said "custom Encode and Decode", not the automatic one. See the following documentation for more info:

  • Encode: https://docs.rs/bincode/2.0.0-rc.1/bincode/enc/trait.Encode.html#implementing-this-trait-manually
  • Decode: https://docs.rs/bincode/2.0.0-rc.1/bincode/de/trait.Decode.html#implementing-this-trait-manually

VictorKoenders avatar Aug 07 '22 15:08 VictorKoenders

@VictorKoenders i just realized that in my engine, using with_fixed_int_encoding will double the serialization performance, and keep very similar size, so having with_fixed_int_encoding is very important for my project.

but it makes it critical to be able to define an enum with size of u8, because as you can see, and i tried multiple crates, the enum size will increase data output by 20%.

bincode len: 550208
postcard len: 474178
borsh len: 459004

would you consider allowing #[repr(u8)]? this crate would be excellent then

cybersoulK avatar Aug 08 '22 15:08 cybersoulK
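Without `#[repr(u8)]` support, a one-byte enum tag can still be produced by writing the discriminant by hand. The sketch below shows only the byte layout, using the standard library; the enum `Position` and its variants are invented for illustration, and in practice this logic would go inside manual `Encode` and `Decode` implementations.

```rust
/// Hypothetical two-layer position component: a u16 value tagged by range.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Position {
    Near(u16),
    Far(u16),
}

impl Position {
    /// One tag byte followed by the u16 payload in little-endian order.
    pub fn to_bytes(self) -> [u8; 3] {
        let (tag, value) = match self {
            Position::Near(v) => (0u8, v),
            Position::Far(v) => (1u8, v),
        };
        let le = value.to_le_bytes();
        [tag, le[0], le[1]]
    }

    /// Inverse of `to_bytes`; unknown tags are rejected.
    pub fn from_bytes(bytes: [u8; 3]) -> Option<Self> {
        let value = u16::from_le_bytes([bytes[1], bytes[2]]);
        match bytes[0] {
            0 => Some(Position::Near(value)),
            1 => Some(Position::Far(value)),
            _ => None,
        }
    }
}
```

The trade-off is that the tag values become part of the wire format and have to be kept stable by hand when variants are added or reordered.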

No, see the readme for more information: https://github.com/bincode-org/bincode#why-does-bincode-not-respect-repru8

VictorKoenders avatar Aug 08 '22 15:08 VictorKoenders

i read it before. "Currently we have not found a compelling case to respect #[repr(...)]" that's why i am providing a case

cybersoulK avatar Aug 08 '22 15:08 cybersoulK

@VictorKoenders New compelling case: this is a practical size benchmark for a udp game server:

bincode len: 526367 (with varint)
bincode len: 550208 (without varint & +100% performance)
bincode len: 459004 (without varint & with respected #[repr(u8)] & +100% performance)

i hope you consider this case, and not as a niche, since there are many people developing multiplayer games with bincode in rust discord server.

cybersoulK avatar Aug 08 '22 16:08 cybersoulK

actually i meant 100%, twice benchmarked in mac m1 and linux x64 with criterion

cybersoulK avatar Aug 08 '22 16:08 cybersoulK

Given the complexity and ongoing maintenance burden, as well as the footgun-prone nature of #[repr(u8)], we have a pretty high bar that a use case has to meet for its inclusion. Bincode was not intended for minimizing the number of bytes used for encoding, or for being the fastest encoding; instead, it's intended to be a happy medium while keeping the library simple and lightweight.

We ask that the following be demonstrated, preferably with sufficiently non-synthetic benchmarks, before we can open a proper discussion on inclusion:

  1. That bincode is actually the bottleneck

    Do those extra bytes actually make the game less playable over a realistic network connection? Is the performance hit of varint encoding actually causing bincode to encode/decode slower than a realistic network connection?

  2. That adding on fast compression would result in unacceptable performance

    Does simply using some fast compression, such as lzo, actually result in unacceptable performance? Does the game become measurably less playable with a compression step?

  3. That no other well-maintained libraries are suitable for the given purpose

    Why would postcard or borsh not work for your usecase? Bincode doesn't have to and shouldn't adapt to every usecase, if there is a library that better suits your usecase, by all means, use it instead.

nmccarty avatar Aug 08 '22 16:08 nmccarty

@nmccarty my project involves in taking everything to the limits, so it's natural that i am pushing you as well. i want all the crates linked to my project to be better too.

allowing #[repr(u8)] seemed like just switching a few buttons, but maybe it's naive thinking.

i will test with compression and report back. yes i have been investigating borsh as an alternative

cybersoulK avatar Aug 08 '22 18:08 cybersoulK

@nmccarty

  1. i tested 4 different compressors, and the fastest compressor adds 1000% of time compared to just serialization alone.
  2. postcard varint is always enabled, and u16 above 20000 will be 3 bytes. also varint slows down serialization by 2 times. borsh does not support serde, and makes it more challenging to serialize things like glam crate and other external types

(1) i calculated, and serialization time can affect the frame-rate significantly in my project

i am trying to work with borsh now, but will revert back to bincode if enums can be 1 byte

cybersoulK avatar Aug 12 '22 22:08 cybersoulK

Re the last point: Is this actually supported by benchmarks? Can I see those benchmarks?

Are you making use of optimizations like streaming to the socket and/or running the serialization in a different thread? Have you looked into ways to reduce the amount of data you need to serialize with each frame?

We need some more context here to determine that bincode is being used in a way consistent with its design philosophy before we can consider making changes to the design to support this use case. If you've found yourself in a situation where serialization is so critical to performance and such fine-grained control is necessary, then you should be looking at formats that focus on those points, like abomonation, protobuf, or Cap'n Proto. Bincode is not designed to be inside a hot loop, and it's not intended to be a limit-pushing protocol; something like abomonation, however, is.

nmccarty avatar Aug 13 '22 01:08 nmccarty

it's not even about performance benchmarks, it's a fundamental feature. 1 byte is enough to describe an enum. nobody will make 255 variants. that case is the actual niche.

my game state data is compact, and that's why i am running against problems with serialization. positions are u16, rotations are u8, i wrap these in an enum to be able to layer multiple distances of these u16 positions.

(position(2 + 2 + 2) + rotation(1 + 1 + 1) + enum(4)) * 100 = 1300 bytes (no varint)
(position(3 + 3 + 3) + rotation(1 + 1 + 1) + enum(1)) * 100 = 1300 bytes (varint and 200% slower)

what it must be:
(position(2 + 2 + 2) + rotation(1 + 1 + 1) + enum(1)) * 100 = 1000 bytes

as a result, bincode artificially increases my data by 22% because of either the enum being 4 bytes or the u16 being 3 bytes, so 22% less throughput.

for the compressor to be able to clear those 22%, it will have to add artificial load of 4 milliseconds per player, instead of 0.4 milliseconds per player with serialization only

adding compression and creating threads makes it complicated and not scalable, and the 4 milliseconds of compression will choke the thread anyways at around 200 players. All of these downsides, just because of an enum being 4 bytes while it CAN be, and fundamentally is, 1 byte (255 variants)

cybersoulK avatar Aug 13 '22 17:08 cybersoulK
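The per-snapshot totals in the comment above can be reproduced with a small helper. This is only a restatement of that arithmetic, assuming three position fields, three rotation fields, and one enum tag per entity; the function name `snapshot_bytes` is made up.

```rust
/// Bytes for `entities` entities, each with 3 position fields, 3 rotation
/// fields, and one enum tag, given the encoded size of each part.
pub fn snapshot_bytes(
    position_field: usize,
    rotation_field: usize,
    enum_tag: usize,
    entities: usize,
) -> usize {
    (3 * position_field + 3 * rotation_field + enum_tag) * entities
}
```

Calling `snapshot_bytes(2, 1, 4, 100)` and `snapshot_bytes(3, 1, 1, 100)` both give 1300, while `snapshot_bytes(2, 1, 1, 100)` gives 1000. By this arithmetic 1300 bytes is 30% more than 1000, or equivalently 1000 is about 23% less than 1300.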

@nmccarty I will close the issue if this seems like a niche optimization

cybersoulK avatar Aug 13 '22 18:08 cybersoulK

As @nmccarty said, this issue is outside the scope of what bincode is intended for. It sounds like in your case you are serializing and transmitting the game state every frame, something far outside bincode's intended area. I recommend looking into another format such as flatbuffers or abomonation if you don't care about safety and just want raw speed.

Since this is outside the scope of the project, I am closing this issue.

ZoeyR avatar Aug 13 '22 19:08 ZoeyR