message-io
message-io copied to clipboard
Improving `Decoder` performance using `Read` trait.
The Decoder
is used by the FramedTcp
transport to transform a stream-based protocol (TCP) into a packet-based protocol that fits really well with the concept of message.
The Decoder
collects data from the stream until it can be considered a message. In that process, each chunk of data received from the network is written in a temporal buffer. If that data is not yet a message, then, de Decoder
copies from that buffer to its internal buffer in order to wait for more chunks.
This last copy can be avoided if we are able to read directly into the decoder. To get this, the decoder could expose its buffer in order to allow the stream.read()
dumping its data directly into the decoder, or even better, the Decoder
can receive a Read
trait object (that would be the socket) from which extract the data. Something similar to:
Decoder::decode_from(&self mut, reader: &dyn Read, impl decoded_callback: impl FnMut(&[u8]) -> Result<()>
Note that since it works in a non-blocking way, several calls to read must be performed inside this function until receiving a WouldBlock
io error.
Hi, I'm new to the code base and I don't have any experience contributing to open source but I want to start and offer my help. Would you be able to give me some pointers on how I can get familiar with the code base enough to tackle this issue?
Hi @hasanhaja, thanks for your help!
This improvement is quite localized in the library and only two files should be updated:
-
src/util/encoding.rs which contains the
Decoder
that should be modified adding thedecode_from
method. (some unit tests should be added too here to check this new method). TheDecoder
is in charge of accumulating the incoming data until that data represents a message. -
src/adapters/framed_tcp.rs where the
FramedTcp
transport is implemented and uses theDecoder
(Only thereceive()
method should be modified). Currently, it makes use of thedecode()
method, and it should be changed by the newdecode_from()
. Also, this update will avoid thebuffer
usage of thereceive()
method.
To make this change. it is important to be familiar with the Read
trait.
Do not hesitate to ask any doubt or any new ideas to tackle the problem. 😃
Hi @lemunozm, thank you for the pointers! I'm looking into the code and getting a feel for what's going on now, and I'll circle back with questions soon.
Task management
@hasanhaja Hi @lemunozm, thank you for the pointers! I'm looking into the code and getting a feel for what's going on now, and I'll circle back with questions soon.
Todo
NOTE: I will add more todos as my exploration develops.
- [ ] Familiarize myself with Read Trait
- [ ] Understand how the loop in the
try_decode()
method works or even what it's for
Hi @lemunozm , is this still relevant issue? 🤔
Hi @seonwoo960000,
It's just a matter of performance improvement. I'm not sure how much it can pump up the performance in real scenarios, to be honest. As far as I know, there is no work done on it right now