PyBitmessage
Network I/O buffer operations slow
Operations like `slice_write_buf`, `slice_read_buf`, ~~`append_write_buf`~~, ~~`read_buf.extend`~~ are slow because they copy data around and reallocate memory. There are several things which can be done to improve this without major refactoring:
- preallocate the bytearray, `buffer = bytearray(max_size)`. This has some tradeoffs, e.g. more memory is required, and removing data from the buffer then takes more time (so we need to avoid removing data from it)
- use `recv_into`/`recvfrom_into` instead of `recv`/`recvfrom`. These put data directly into the buffer rather than allocating new strings
- use `memoryview` instead of array slices when parsing data
- instead of slicing the front of the buffer, use some other method, e.g. using `memoryview` as well, see here: https://stackoverflow.com/questions/15962119/using-bytearray-with-socket-recv-into#15964489
- if we can somehow use a separate buffer for each command, we can avoid locking, allowing more CPU to be used on multi-core systems.
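A minimal sketch of the first two ideas combined, preallocating the buffer once and filling it with `recv_into` (the buffer size, variable names, and `fill_from_socket` helper are illustrative, not from the PyBitmessage code):

```python
import socket

MAX_BUF = 1 << 20  # illustrative maximum buffer size (1 MB), not from PyBitmessage

# Preallocated once; receiving never reallocates it.
read_buf = bytearray(MAX_BUF)
read_buf_len = 0  # number of valid bytes currently in read_buf


def fill_from_socket(sock):
    """Append incoming data to read_buf using recv_into, which writes
    directly into the buffer instead of allocating a new bytes object."""
    global read_buf_len
    with memoryview(read_buf) as view:
        # A memoryview slice gives recv_into a writable window at the
        # current end of the valid data, without copying anything.
        n = sock.recv_into(view[read_buf_len:])
    read_buf_len += n
    return n
```

The tradeoff mentioned above applies here: `read_buf` always occupies `MAX_BUF` bytes, and consuming data means advancing an offset rather than shrinking the buffer.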
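Parsing via `memoryview` instead of buffer slices can be sketched like this; the 4-byte length-prefixed wire format here is a hypothetical stand-in for the real protocol:

```python
import struct

# Hypothetical wire format: 4-byte big-endian length prefix, then payload.
buf = bytearray(b"\x00\x00\x00\x04ABCD\x00\x00\x00\x02EF")
view = memoryview(buf)
offset = 0
messages = []

while offset + 4 <= len(buf):
    # unpack_from reads straight out of the buffer, no intermediate copy
    (length,) = struct.unpack_from(">I", view, offset)
    payload = view[offset + 4:offset + 4 + length]  # zero-copy slice
    messages.append(bytes(payload))  # copy only where an immutable object is needed
    offset += 4 + length
```

Instead of `del buf[:n]` after each message, the parser just advances `offset`, so the bytes in front are never shifted.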
I ran some benchmarks: slicing from and appending to a bytearray achieves about 1 MB/s. Preallocating buffers can do about 20 GB/s (20k times better), and using a slice of a memoryview about 6 GB/s (6k times better). Obviously it depends on other criteria; I was using 1 kB chunks of data within the buffer.
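A rough way to reproduce the comparison (the function names are made up for the benchmark, the 1 kB chunk size matches the setup above, and absolute numbers will vary by machine):

```python
import time

CHUNK = 1024  # 1 kB chunks, as in the benchmark above


def consume_by_slicing(data):
    """Remove CHUNK bytes from the front each iteration.
    Each del shifts the whole remainder left, so this is O(n) per step."""
    buf = bytearray(data)
    chunks = 0
    while buf:
        chunk = buf[:CHUNK]
        del buf[:CHUNK]  # forces the rest of the buffer to move
        chunks += 1
    return chunks


def consume_by_memoryview(data):
    """Walk a memoryview with an offset: no bytes are ever moved."""
    view = memoryview(data)
    offset = 0
    chunks = 0
    while offset < len(data):
        chunk = view[offset:offset + CHUNK]  # zero-copy slice
        offset += CHUNK
        chunks += 1
    return chunks


if __name__ == "__main__":
    data = bytes(1024 * 1024)  # 1 MB consumed in 1 kB chunks
    for fn in (consume_by_slicing, consume_by_memoryview):
        start = time.perf_counter()
        n = fn(data)
        elapsed = time.perf_counter() - start
        print("%s: %d chunks in %.4fs" % (fn.__name__, n, elapsed))
```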
Some operations don't work on buffers, e.g. you can't use a buffer slice as a dict key, but I think most of these have already been addressed earlier.
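For instance, a bytearray slice is itself a bytearray and therefore unhashable, so a copy to immutable `bytes` is needed at the exact point where a dict key is required (the buffer contents here are just an example):

```python
buf = bytearray(b"inventory-hash")
chunk = buf[:9]  # a bytearray slice is itself a bytearray: unhashable

try:
    seen = {chunk: True}  # raises TypeError: unhashable type: 'bytearray'
except TypeError:
    # Copy to immutable bytes only where hashability is actually needed.
    seen = {bytes(chunk): True}
```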
Edit: Appending to bytearrays doesn't seem to cause performance problems, only slicing from the beginning.