neofs-node
Reuse object/data memory for replication workers
Is your feature request related to a problem? Please describe.
I'm always frustrated when I'm looking at the replication worker code. It takes an object from the storage, decompresses it, unmarshals it and then pushes the result to some other node. It allocates for the raw data, for the decompressed data, and for the object and all of its fields. For big objects this means a heck of a lot of allocations.
Describe the solution you'd like
Have one per-replicator buffer for raw data, one for decompressed data, and one object. Reuse them. Likely this is not supported by our APIs at the moment, but this can probably be changed.
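A minimal sketch of what per-worker buffer reuse could look like, assuming a worker handles one object at a time; the names (`workerScratch`, `bytesFor`) are illustrative, not existing neofs-node APIs:

```go
package main

import "fmt"

// workerScratch holds buffers owned by a single replication worker and
// reused across the objects it replicates.
type workerScratch struct {
	raw          []byte // compressed data read from the storage engine
	decompressed []byte // decompression target, grown only when needed
}

// bytesFor returns a slice of length n backed by buf, reallocating only
// when the existing capacity is too small.
func bytesFor(buf []byte, n int) []byte {
	if cap(buf) < n {
		return make([]byte, n)
	}
	return buf[:n]
}

func main() {
	s := &workerScratch{}
	// Three consecutive objects of different sizes reuse the same backing
	// arrays once the capacity is large enough.
	for _, size := range []int{256 << 10, 1 << 20, 64 << 10} {
		s.raw = bytesFor(s.raw, size)
		s.decompressed = bytesFor(s.decompressed, 2*size)
		fmt.Printf("raw cap=%d, decompressed cap=%d\n", cap(s.raw), cap(s.decompressed))
	}
}
```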
Additional context
#2300/#2178
I suggest starting with a size-segmented sync pool as a simpler approach.
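A minimal sketch of the size-segmented `sync.Pool` idea: one pool per power-of-two size class, so a rare large object does not pin an oversized buffer. The class boundaries (256 KiB to 64 MiB) and helper names are assumptions:

```go
package main

import (
	"fmt"
	"math/bits"
	"sync"
)

const (
	minClass = 18 // 256 KiB
	maxClass = 26 // 64 MiB
)

// pools holds one sync.Pool per power-of-two size class.
var pools [maxClass - minClass + 1]sync.Pool

// classFor maps a requested size to its power-of-two class (ceil(log2(n))),
// never below minClass.
func classFor(n int) int {
	if n <= 1<<minClass {
		return minClass
	}
	return bits.Len(uint(n - 1))
}

// getBuf returns a buffer of length n taken from the smallest fitting class.
func getBuf(n int) []byte {
	if n > 1<<maxClass {
		return make([]byte, n) // bigger than the largest class: allocate directly
	}
	c := classFor(n)
	if b, ok := pools[c-minClass].Get().([]byte); ok {
		return b[:n]
	}
	return make([]byte, n, 1<<c)
}

// putBuf returns a buffer to the pool of its capacity class; buffers outside
// the managed range are simply dropped.
func putBuf(b []byte) {
	c := bits.Len(uint(cap(b))) - 1 // floor(log2(cap))
	if c < minClass || c > maxClass {
		return
	}
	pools[c-minClass].Put(b[:0])
}

func main() {
	b := getBuf(300 << 10) // a 300 KiB request lands in the 512 KiB class
	fmt.Println("len:", len(b), "cap:", cap(b))
	putBuf(b)
}
```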
In the current situation, where each worker handles one individual object, a per-worker static buffer could lead to bad memory utilization given the large dispersion of object sizes (up to 64M). It is worth thinking about an adaptive size for this buffer, for example (sketched after this list):
- start with a relatively small size `S` (e.g. 256K)
- as large objects are processed, allocate buffers dynamically while counting `MISS` (quantity or excess over the average)
- when `MISS` exceeds the specified limit, grow the buffer (e.g. double it or add the average)
- shrinking can be done in the reverse way
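A rough sketch of the adaptive-size idea above, with illustrative constants and a simple doubling policy; shrinking is omitted:

```go
package main

import "fmt"

// adaptiveBuf is a per-worker buffer that starts small and grows only after
// repeated oversized requests.
type adaptiveBuf struct {
	buf       []byte
	misses    int // requests that did not fit the current buffer
	missLimit int // how many misses are tolerated before growing
}

func newAdaptiveBuf() *adaptiveBuf {
	return &adaptiveBuf{buf: make([]byte, 256<<10), missLimit: 8} // start at 256K
}

// take returns a buffer of length n; oversized requests are served with
// one-off allocations until missLimit is reached, then the buffer grows.
func (a *adaptiveBuf) take(n int) []byte {
	if n <= cap(a.buf) {
		return a.buf[:n]
	}
	a.misses++
	if a.misses >= a.missLimit {
		a.misses = 0
		newCap := cap(a.buf) * 2 // grow, e.g. double
		if newCap < n {
			newCap = n
		}
		a.buf = make([]byte, newCap)
		return a.buf[:n]
	}
	return make([]byte, n) // temporary allocation, buffer unchanged
}

func main() {
	a := newAdaptiveBuf()
	for i := 0; i < 10; i++ {
		b := a.take(1 << 20) // repeated 1M requests against a 256K buffer
		fmt.Printf("request %d: served with cap %d (misses=%d)\n", i, cap(b), a.misses)
	}
}
```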
For this approach to be effective, a more adaptive work queue may be needed. To me, the complexity of this approach could outweigh its efficiency gains.
An alternative approach could be a replication batch: all Policer workers pack to-be-replicated objects into a single limited buffer, which is flushed when full or by a timer. IMO managing the batch size will be simpler and more efficient than the current tuning of worker pool capacity, which relies only on incoming traffic and not outgoing. Per-node batches would allow background replication to be correlated with external traffic (PUT) and simplify the prioritization model.
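A rough sketch of the batch alternative, with placeholder types (string object IDs, a flush callback, hard-coded limits) instead of real neofs-node/Policer types:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// batch accumulates to-be-replicated object references and flushes them
// either when the size limit is reached or when the timer fires.
type batch struct {
	mu       sync.Mutex
	items    []string
	limit    int
	interval time.Duration
	flush    func([]string)
}

func newBatch(limit int, interval time.Duration, flush func([]string)) *batch {
	b := &batch{limit: limit, interval: interval, flush: flush}
	go b.loop()
	return b
}

// add appends an object reference and flushes synchronously when the batch is full.
func (b *batch) add(id string) {
	b.mu.Lock()
	b.items = append(b.items, id)
	var out []string
	if len(b.items) >= b.limit {
		out, b.items = b.items, nil
	}
	b.mu.Unlock()
	if out != nil {
		b.flush(out)
	}
}

// loop flushes whatever has accumulated when the timer fires.
func (b *batch) loop() {
	for range time.Tick(b.interval) {
		b.mu.Lock()
		out := b.items
		b.items = nil
		b.mu.Unlock()
		if len(out) > 0 {
			b.flush(out)
		}
	}
}

func main() {
	b := newBatch(3, 200*time.Millisecond, func(ids []string) {
		fmt.Println("replicate batch:", ids)
	})
	for i := 0; i < 5; i++ {
		b.add(fmt.Sprintf("object-%d", i))
	}
	time.Sleep(300 * time.Millisecond) // let the timer flush the remainder
}
```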