[Java] fast serialization path for String KV map
Feature Request
Map<String, String> is very common in Java, so we can provide a fast serialization path for it to improve performance.
Is your feature request related to a problem? Please describe
No response
Describe the solution you'd like
Add a StringMapSerialization utility class to the org.apache.fury.serializer.collection package:
class StringMapSerialization {
  /**
   * Write string chunks until there are no entries left.
   */
  public static void writeStringChunks(
      MemoryBuffer buffer,
      Entry<String, String> entry,
      Iterator<Entry<String, String>> iterator) {
  }

  /**
   * Write a string chunk until there are no entries left or the chunk size reaches its max value.
   */
  public static Entry<String, String> writeStringChunk(
      MemoryBuffer buffer,
      Entry<String, String> entry,
      Iterator<Entry<String, String>> iterator) {
  }

  /**
   * Write a string chunk until the next entry is not of string type.
   */
  public static Entry writeChunk(
      MemoryBuffer buffer,
      Entry<String, String> entry,
      Iterator<Entry> iterator) {
  }

  /**
   * Read all string kv chunks and put their entries into the map until all chunks are read.
   */
  public static void readChunks(
      MemoryBuffer buffer, Map<String, String> map, long size, int chunkHeader) {
  }

  public static int readChunk(
      MemoryBuffer buffer, Map<String, String> map, long size, int chunkHeader) {
  }
}
Add a fast path in AbstractMapSerializer that forwards to StringMapSerialization.
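To make the chunk loop concrete, here is a rough sketch of how writeStringChunks and writeStringChunk could cooperate. It is only an illustration under stated assumptions: it uses java.nio.ByteBuffer instead of MemoryBuffer, and the chunk layout (a one-byte entry count followed by length-prefixed UTF-8 strings), MAX_CHUNK_SIZE, writeString, readString and readStringChunks are all hypothetical, not the actual wire format. Null keys and values are not handled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;

public class StringChunkSketch {
  // Assumed limit so that the entry count fits into a single header byte.
  static final int MAX_CHUNK_SIZE = 255;

  // Hypothetical string encoding: 4-byte length followed by UTF-8 bytes.
  static void writeString(ByteBuffer buffer, String s) {
    byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
    buffer.putInt(bytes.length);
    buffer.put(bytes);
  }

  static String readString(ByteBuffer buffer) {
    byte[] bytes = new byte[buffer.getInt()];
    buffer.get(bytes);
    return new String(bytes, StandardCharsets.UTF_8);
  }

  /**
   * Writes one chunk starting at entry. Returns the first entry that was not
   * written, or null when the iterator is exhausted.
   */
  static Entry<String, String> writeStringChunk(
      ByteBuffer buffer,
      Entry<String, String> entry,
      Iterator<Entry<String, String>> iterator) {
    int headerPos = buffer.position();
    buffer.put((byte) 0); // placeholder for the entry count, patched below
    int count = 0;
    while (entry != null && count < MAX_CHUNK_SIZE) {
      writeString(buffer, entry.getKey());
      writeString(buffer, entry.getValue());
      count++;
      entry = iterator.hasNext() ? iterator.next() : null;
    }
    buffer.put(headerPos, (byte) count);
    return entry;
  }

  /** Writes chunks until there are no entries left. */
  static void writeStringChunks(
      ByteBuffer buffer,
      Entry<String, String> entry,
      Iterator<Entry<String, String>> iterator) {
    while (entry != null) {
      entry = writeStringChunk(buffer, entry, iterator);
    }
  }

  /** Reads chunks until size entries have been put into the map. */
  static void readStringChunks(ByteBuffer buffer, Map<String, String> map, int size) {
    int remaining = size;
    while (remaining > 0) {
      int count = buffer.get() & 0xFF; // chunk header: number of entries
      for (int i = 0; i < count; i++) {
        String key = readString(buffer);
        map.put(key, readString(buffer));
      }
      remaining -= count;
    }
  }
}
```

The caller is expected to pull the first entry from the iterator itself and pass it in, which mirrors the (entry, iterator) parameter pair in the proposed signatures.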
Describe alternatives you've considered
No response
Additional context
#2025
Hi @chaokunyang, I'm interested in this issue and would like to try it. However, I can't tell what the key difference is between the StringMapSerialization path and the AbstractMapSerializer path. If we could declare the key type and value type, would it be as fast as StringMapSerialization?
@jayhan94 The AbstractMapSerializer handles serialization for different types of maps, where key and value types can vary from one map to another. This variability prevents the JVM's Just-In-Time (JIT) compiler from effectively inlining the read and write methods for key and value serializers. However, maps with string keys and values are so prevalent that they merit a special code path to facilitate JIT inlining.
An alternative approach is to add a fast path in AbstractMapSerializer before the read and write methods of the key and value serializers are invoked. This optimization could also improve performance. Maybe we could take this approach.
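A minimal sketch of that alternative, assuming simplified stand-in types rather than the real Fury classes: Serializer, StringSerializer, write and writeGeneric below are hypothetical, and the one-byte length prefix is a toy encoding. The point is only the shape of the check: when both serializers are the final StringSerializer, the loop calls through a StringSerializer-typed field, so each call site has a single concrete target instead of a virtual call on an abstract type.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class StringFastPathSketch {
  /** Hypothetical stand-in for a serializer base type; not Fury's actual API. */
  abstract static class Serializer<T> {
    abstract void write(ByteArrayOutputStream out, T value);
  }

  /** Final class: a call through a StringSerializer reference has one known target. */
  static final class StringSerializer extends Serializer<String> {
    @Override
    void write(ByteArrayOutputStream out, String value) {
      byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
      // Toy encoding: one length byte, then UTF-8 bytes (assumes fewer than 256 bytes).
      out.write(bytes.length);
      out.write(bytes, 0, bytes.length);
    }
  }

  // Declared with the concrete type so the fast path never dispatches on Serializer.
  private static final StringSerializer STRING_SERIALIZER = new StringSerializer();

  /** Generic path: the call targets depend on the runtime types of ks and vs. */
  static <K, V> void writeGeneric(
      ByteArrayOutputStream out, Map<K, V> map, Serializer<K> ks, Serializer<V> vs) {
    for (Map.Entry<K, V> e : map.entrySet()) {
      ks.write(out, e.getKey());
      vs.write(out, e.getValue());
    }
  }

  /** Checks for the string/string case first, then falls back to the generic path. */
  @SuppressWarnings("unchecked")
  static <K, V> void write(
      ByteArrayOutputStream out, Map<K, V> map, Serializer<K> ks, Serializer<V> vs) {
    if (ks instanceof StringSerializer && vs instanceof StringSerializer) {
      // Fast path: both serializers handle String, so the entries are String pairs.
      Map<String, String> stringMap = (Map<String, String>) (Map<?, ?>) map;
      for (Map.Entry<String, String> e : stringMap.entrySet()) {
        STRING_SERIALIZER.write(out, e.getKey());
        STRING_SERIALIZER.write(out, e.getValue());
      }
    } else {
      writeGeneric(out, map, ks, vs);
    }
  }
}
```

Both paths produce the same bytes. Only the call sites differ, so any speedup would have to be confirmed with a benchmark.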
As I understand it, the virtual function call prevents JIT from inlining the serializer.write/read methods for the key and value types. If we were able to declare the key/value serializers like StringSerializer stringSerializer, JIT would be able to inline them. Is that correct? @chaokunyang
Yes, you are right.