Improving `Decoder` performance using `Read` trait.

Open lemunozm opened this issue 3 years ago • 6 comments

The Decoder is used by the FramedTcp transport to transform a stream-based protocol (TCP) into a packet-based protocol that fits the concept of a message well.

The Decoder collects data from the stream until it can be considered a message. In that process, each chunk of data received from the network is written to a temporary buffer. If that data is not yet a complete message, the Decoder copies it from that buffer into its internal buffer to wait for more chunks.

This last copy can be avoided if we are able to read directly into the decoder. To achieve this, the decoder could expose its buffer so that stream.read() dumps its data directly into the decoder, or even better, the Decoder could receive a Read trait object (that would be the socket) from which to extract the data. Something similar to:

Decoder::decode_from(&mut self, reader: &mut dyn Read, decoded_callback: impl FnMut(&[u8])) -> Result<()>

Note that since it works in a non-blocking way, read must be called repeatedly inside this function until it returns a WouldBlock I/O error.
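
As a rough sketch of the idea, and not the library's actual implementation, the function could look something like the following. Several things here are assumptions made only for illustration: the 4-byte big-endian length prefix (the real framing may differ), the READ_CHUNK size, the `stored` field and the `drain_messages` helper, and the choice to report a closed connection as an UnexpectedEof error.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical framing: a 4-byte big-endian payload length before each message.
const HEADER_LEN: usize = 4;
/// Hypothetical amount of spare room offered to each `read` call.
const READ_CHUNK: usize = 1024;

/// Simplified stand-in for the Decoder (illustration only).
pub struct Decoder {
    /// Bytes received so far that do not yet form a complete message.
    stored: Vec<u8>,
}

impl Decoder {
    pub fn new() -> Self {
        Self { stored: Vec::new() }
    }

    /// Reads from `reader` straight into the internal buffer until it reports
    /// `WouldBlock`, calling `decoded_callback` once per complete message.
    pub fn decode_from(
        &mut self,
        reader: &mut dyn Read,
        mut decoded_callback: impl FnMut(&[u8]),
    ) -> io::Result<()> {
        loop {
            let filled = self.stored.len();
            // Expose spare room at the tail of the internal buffer.
            self.stored.resize(filled + READ_CHUNK, 0);
            let result = reader.read(&mut self.stored[filled..]);
            // Drop the unused tail so `stored` only holds received bytes.
            let received = result.as_ref().map_or(0, |n| *n);
            self.stored.truncate(filled + received);

            match result {
                // Assumption: a closed connection is surfaced as an error.
                Ok(0) => return Err(ErrorKind::UnexpectedEof.into()),
                Ok(_) => self.drain_messages(&mut decoded_callback),
                // No more data available right now: the caller waits for the next event.
                Err(e) if e.kind() == ErrorKind::WouldBlock => return Ok(()),
                Err(e) if e.kind() == ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
    }

    /// Emits every complete message in `stored` and keeps the incomplete rest.
    fn drain_messages(&mut self, decoded_callback: &mut impl FnMut(&[u8])) {
        let mut consumed = 0;
        loop {
            let pending = &self.stored[consumed..];
            if pending.len() < HEADER_LEN {
                break;
            }
            let mut header = [0u8; HEADER_LEN];
            header.copy_from_slice(&pending[..HEADER_LEN]);
            let end = HEADER_LEN + u32::from_be_bytes(header) as usize;
            if pending.len() < end {
                break;
            }
            decoded_callback(&pending[HEADER_LEN..end]);
            consumed += end;
        }
        self.stored.drain(..consumed);
    }
}
```

In this sketch the socket writes directly into the decoder's own buffer, so the intermediate copy disappears. The leftover bytes of an incomplete message are still shifted to the front of the buffer after each batch, which a more careful version might avoid.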

lemunozm avatar Apr 14 '21 11:04 lemunozm

Hi, I'm new to the code base and I don't have any experience contributing to open source but I want to start and offer my help. Would you be able to give me some pointers on how I can get familiar with the code base enough to tackle this issue?

hasanhaja avatar Apr 17 '21 20:04 hasanhaja

Hi @hasanhaja, thanks for your help!

This improvement is quite localized in the library and only two files should be updated:

  • src/util/encoding.rs, which contains the Decoder that should be modified by adding the decode_from method (some unit tests should be added here too, to check this new method). The Decoder is in charge of accumulating the incoming data until that data represents a message.
  • src/adapters/framed_tcp.rs, where the FramedTcp transport is implemented and uses the Decoder (only the receive() method should be modified). Currently, it makes use of the decode() method, which should be replaced by the new decode_from(). This update will also remove the need for the buffer used in the receive() method.

To make this change, it is important to be familiar with the Read trait.
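
One way to get a feel for the trait, and something that might also help with the unit tests, is a small test double that plays the role of the non-blocking socket. The sketch below is hypothetical (the ScriptedReader name and its fields are made up for illustration): it hands out pre-scripted chunks and then reports WouldBlock, which is how a non-blocking socket behaves when no more data is ready.

```rust
use std::collections::VecDeque;
use std::io::{self, ErrorKind, Read};

/// Hypothetical test double: plays back scripted chunks, then `WouldBlock`.
pub struct ScriptedReader {
    chunks: VecDeque<Vec<u8>>,
}

impl ScriptedReader {
    pub fn new(chunks: Vec<Vec<u8>>) -> Self {
        Self { chunks: chunks.into() }
    }
}

impl Read for ScriptedReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let mut chunk = match self.chunks.pop_front() {
            Some(chunk) => chunk,
            // Nothing left: behave like a non-blocking socket with no data ready.
            None => return Err(ErrorKind::WouldBlock.into()),
        };
        // `read` may fill less than the whole buffer and reports how much it wrote.
        let n = chunk.len().min(buf.len());
        buf[..n].copy_from_slice(&chunk[..n]);
        if n < chunk.len() {
            // The caller's buffer was too small: keep the rest for the next call.
            self.chunks.push_front(chunk.split_off(n));
        }
        Ok(n)
    }
}
```

Splitting a message across several scripted chunks makes it possible to check that the decoder waits for the remaining bytes instead of emitting a partial message.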

Do not hesitate to ask any questions or suggest new ideas to tackle the problem. 😃

lemunozm avatar Apr 18 '21 19:04 lemunozm

Hi @lemunozm, thank you for the pointers! I'm looking into the code and getting a feel for what's going on now, and I'll circle back with questions soon.

hasanhaja avatar Apr 20 '21 15:04 hasanhaja

Task management

@hasanhaja Hi @lemunozm, thank you for the pointers! I'm looking into the code and getting a feel for what's going on now, and I'll circle back with questions soon.

Todo

NOTE: I will add more todos as my exploration develops.

  • [ ] Familiarize myself with Read Trait
  • [ ] Understand how the loop in the try_decode() method works or even what it's for

hasanhaja avatar May 06 '21 15:05 hasanhaja

Hi @lemunozm, is this still a relevant issue? 🤔

seonWKim avatar Feb 17 '24 13:02 seonWKim

Hi @seonwoo960000,

It's just a matter of performance improvement. I'm not sure how much it would improve performance in real scenarios, to be honest. As far as I know, there is no work being done on it right now.

lemunozm avatar Feb 19 '24 07:02 lemunozm