
optimize serialization size for primitive arrays

Open jhsenjaliya opened this issue 3 years ago • 3 comments

Is your feature request related to a problem? Please describe.

Primitive arrays occupy space according to their declared length, but it is entirely possible that most of that space is unused at serialization time.

Describe the solution you'd like

Instead of writing the size followed by the data from 0 to size, the proposal is to serialize the array as follows:

    size -- defined size for the array to be serialized, as it is today
    if (primitive_array_optimization_configured) {
        actual_size -- the actual size of the array, up to where the user has set any value
        data -- from 0 to actual_size
    } else {
        data -- from 0 to size
    }

where actual_size is calculated by decrementing from size until a non-default value is found (for example, any value other than 0 in an int array).
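The layout described above can be sketched roughly as follows. This is only an illustration of the idea, not Kryo's wire format: it uses plain `java.io` streams instead of Kryo's `Output`/`Input`, writes fixed 4-byte ints, and the class name `TrimmedIntArrayCodec` and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper, not part of Kryo: trims trailing default (0) values.
class TrimmedIntArrayCodec {

    // One backward scan: index one past the last non-zero element (0 if all zeros).
    static int actualSize(int[] array) {
        int n = array.length;
        while (n > 0 && array[n - 1] == 0) {
            n--;
        }
        return n;
    }

    // Writes: size, actual_size, then only elements [0, actual_size).
    static byte[] write(int[] array) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            int actual = actualSize(array);
            out.writeInt(array.length);
            out.writeInt(actual);
            for (int i = 0; i < actual; i++) {
                out.writeInt(array[i]);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the layout back; the trimmed tail stays at the default value 0.
    static int[] read(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int size = in.readInt();
            int actual = in.readInt();
            int[] array = new int[size];
            for (int i = 0; i < actual; i++) {
                array[i] = in.readInt();
            }
            return array;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] original = new int[1000];
        original[0] = 7;
        original[2] = 42;
        byte[] encoded = write(original);
        System.out.println(encoded.length);                        // 20
        System.out.println(Arrays.equals(original, read(encoded))); // true
    }
}
```

In this sketch a mostly empty 1000-element array takes 20 bytes (two 4-byte headers plus three elements) instead of 4004. The cost is the extra actual_size field, which makes fully populated arrays slightly larger.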

Describe alternatives you've considered

None available to extend the array serializer.

Additional context

This feature could greatly reduce the size of serialized objects that contain primitive arrays.

Happy to hear comments and suggestions.

jhsenjaliya, Sep 07 '22 00:09

The disadvantage is that you will have to traverse the array multiple times, or write the data in reverse order. I'm not sure this makes sense as a general addition to Kryo, but you should have no problem writing such custom SparseArraySerializers yourself and using them instead of the default implementation.

theigl, Sep 08 '22 14:09

Actually, you would only need to traverse the array once (from the end until you find a non-default value), and this would only be activated through configuration, so the default behavior won't change.

jhsenjaliya, Sep 08 '22 20:09

Please create a PR with an implementation for one of the default array serializers. I'm still not sure this should be provided by Kryo out of the box, but we can discuss it further.

theigl, Sep 08 '22 21:09