Add serializedSize() for calculating the size of an object as the number of bytes serialized

Open HenrikBengtsson opened this issue 10 months ago • 0 comments

Background

The future package uses a complex, inefficient, ad-hoc approach, implemented in pure R, to estimate the size of an R object. It is used to estimate the total size of all "global" variables that are to be sent to the parallel worker:

  • https://github.com/HenrikBengtsson/future/blob/a83c9b8d279d68c29d3773054e28bbb80f393cde/R/globals.R#L357-L359
  • https://github.com/HenrikBengtsson/future/blob/a83c9b8d279d68c29d3773054e28bbb80f393cde/R/utils.R#L326-L455

Task

Implement parallelly::serializedSize() based on serializer::calc_serialized_size() (https://github.com/coolbutuseless/serializer), which is very fast and has a very low memory footprint ($O(1)$). /ht @coolbutuseless.
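For reference, here is a minimal base-R sketch of the quantity in question. The helper name `serialized_size_baseline()` is hypothetical and not part of parallelly or serializer. It serializes the object to an in-memory raw vector and counts the bytes, so it allocates a full serialized copy. That allocation is presumably what a counting approach such as `serializer::calc_serialized_size()` avoids. Whether the two would agree byte for byte is an assumption that depends on the serialization settings used.

```r
# Hypothetical baseline, not the proposed implementation:
# serialize to a raw vector in memory and count its bytes.
# Memory use grows with the size of the object.
serialized_size_baseline <- function(x) {
  length(serialize(x, connection = NULL))
}

# Example "globals" whose serialized size we want to know
x <- list(a = 1:10, b = letters)
size <- serialized_size_baseline(x)
print(size)
```

A constant-memory version would return the same kind of count without ever materializing the raw vector.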

The reason for implementing it in parallelly and not future is that the latter is still a pure R package, whereas parallelly already has one C-based function. Also, future already imports parallelly, so this adds no new dependency there.

HenrikBengtsson, Apr 10 '24 03:04