parallelly
Add serializedSize() for calculating the size of an object as the number of bytes serialized
Background
The future package uses a complex, inefficient, ad hoc approach, implemented in pure R, to estimate the size of an R object. It is used to estimate the total size of all "global" variables that are to be sent to the parallel worker:
- https://github.com/HenrikBengtsson/future/blob/a83c9b8d279d68c29d3773054e28bbb80f393cde/R/globals.R#L357-L359
- https://github.com/HenrikBengtsson/future/blob/a83c9b8d279d68c29d3773054e28bbb80f393cde/R/utils.R#L326-L455
Task
Implement parallelly::serializedSize() based on serializer::calc_serialized_size() (https://github.com/coolbutuseless/serializer), which is very fast and has a very low memory footprint ($O(1)$). /ht @coolbutuseless.
The reason for implementing it in parallelly and not in future is that the latter is still a pure R package, whereas parallelly already has one C-based function. Also, future already imports parallelly, so this would not add a dependency there.
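
For reference, here is a minimal pure-R sketch of the quantity that serializedSize() is meant to report: the number of bytes produced by serializing an object. The helper name serialized_size_naive() is hypothetical and only for illustration. Unlike the proposed C-based approach, it materializes the full serialized raw vector in memory before counting, so its memory use grows with the object size.

```r
# Hypothetical baseline, not the proposed implementation:
# serialize the object to an in-memory raw vector and count its bytes.
# Memory use is O(n) in the serialized size, which is what a
# byte-counting C implementation would avoid.
serialized_size_naive <- function(x) {
  length(serialize(x, connection = NULL))
}

x <- rnorm(1000)
size <- serialized_size_naive(x)
print(size)
```

Assuming parallelly::serializedSize() uses the same serialization settings, it should return the same count for any object, which would make this baseline a convenient check in unit tests.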