
[Java] native object copy support

Open DemonJun opened this issue 2 years ago • 3 comments

During data stream processing, it is often necessary to deep copy data multiple times. When a class has many attributes and subclasses, writing the deep copy by hand is too much trouble. For example, we currently clone through a serialization round trip:

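@SuppressWarnings("unchecked")
public static <T> T clone(T ob) {
  // getAndClearMb() and get() are helpers defined elsewhere (not shown here);
  // they supply a cleared MemoryBuffer and a Fury instance.
  MemoryBuffer memoryBuffer = getAndClearMb();
  Fury fury = get();

  // Deep copy via a round trip: serialize into the buffer, then read it back.
  fury.serializeJavaObject(memoryBuffer, ob);
  return (T) fury.deserializeJavaObject(memoryBuffer, ob.getClass());
}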

It would be nice to be able to use zero-copy to further improve performance.

DemonJun avatar Oct 19 '23 06:10 DemonJun

One solution is to add a copy interface to io.fury.Serializer:

  public T copy(T value) {
    throw new UnsupportedOperationException(String.format("Copy for %s is not supported", value.getClass()));
  }

For immutable types such as String and java.time.*, the implementation just returns the passed object. For mutable objects such as ArrayList, HashMap, or a POJO, the copy implementation copies the data recursively.
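To make the immutable/mutable split concrete, here is a minimal sketch. The class names CopySerializer, StringCopySerializer, and ListCopySerializer are hypothetical stand-ins written for illustration and are not part of Fury; only the default copy method mirrors the snippet above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposed copy hook; NOT Fury's actual Serializer class.
abstract class CopySerializer<T> {
  // Default behaviour: copy is unsupported unless a subclass overrides it.
  public T copy(T value) {
    throw new UnsupportedOperationException(
        String.format("Copy for %s is not supported", value.getClass()));
  }
}

// Immutable type: sharing the instance is safe, so return it unchanged.
final class StringCopySerializer extends CopySerializer<String> {
  @Override
  public String copy(String value) {
    return value;
  }
}

// Mutable container: create a new list and copy every element recursively.
final class ListCopySerializer<E> extends CopySerializer<List<E>> {
  private final CopySerializer<E> elementSerializer;

  ListCopySerializer(CopySerializer<E> elementSerializer) {
    this.elementSerializer = elementSerializer;
  }

  @Override
  public List<E> copy(List<E> value) {
    List<E> result = new ArrayList<>(value.size());
    for (E element : value) {
      result.add(elementSerializer.copy(element));
    }
    return result;
  }
}
```

In this sketch a nested list is copied level by level, while the String elements are shared between the original and the copy because they cannot change.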

chaokunyang avatar Oct 20 '23 05:10 chaokunyang

I want to try it, but I may need a little time to learn this.

caicancai avatar Oct 24 '23 13:10 caicancai

I want to try it, but I may need a little time to learn this.

Looking forward to it! FYI, here are some of my thoughts; you can take them as inspiration:

  • Design a copy interface.
  • Have Serializer implement the copy interface and throw UnsupportedOperationException by default.
  • For non-JIT serializers, override the copy method to implement the copy.
  • For immutable objects such as String and the java.time types, just return the object itself.
  • For mutable objects, create a new object and set all of its attributes.
  • For POJO/bean/record objects, implement the copy in a separate class and forward the copy to that class, so the copy implementation can be reused by ObjectSerializer/CompatibleObjectSerializer.
  • For JIT serializers, don't generate copy code in the existing serializer builder. Copy is not needed in every scenario, and generating copy code would make the JIT slower and use more metaspace. Instead, we should generate a separate class that implements the copy interface and forward the copy to that class. For example, we can add a copy-forwarding implementation in io.fury.builder.Generated.GeneratedSerializer:
    public Object copy(Object o) {
      Copy copier = this.copier;
      if (copier == null) {
        this.copier = copier = classResolver.getJITCopier(o.getClass());
      }
      return copier.copy(o);
    }
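
The lazy forwarding idea above can be sketched in isolation as follows. This is an illustration under assumptions, not Fury code: Copier and LazyCopyForwarder are hypothetical names, and the Function passed to the constructor stands in for classResolver.getJITCopier, whose real signature is not shown in this thread. The factoryCalls counter exists only to make the laziness observable.

```java
import java.util.function.Function;

// Hypothetical copy interface; the snippet above calls it "Copy" without defining it.
interface Copier {
  Object copy(Object o);
}

// Hypothetical stand-in for a generated serializer that defers copier creation.
class LazyCopyForwarder {
  // Assumption: stands in for classResolver.getJITCopier(Class).
  private final Function<Class<?>, Copier> copierFactory;
  // Stays null until the first copy call, so no copy code is built when copy is never used.
  private Copier copier;
  // Counts factory invocations; for demonstration only.
  int factoryCalls = 0;

  LazyCopyForwarder(Function<Class<?>, Copier> copierFactory) {
    this.copierFactory = copierFactory;
  }

  public Object copy(Object o) {
    Copier copier = this.copier;
    if (copier == null) {
      factoryCalls++;
      // Build the copier on first use and cache it for later calls.
      this.copier = copier = copierFactory.apply(o.getClass());
    }
    return copier.copy(o);
  }
}
```

The point of the design is that the cost of producing the copier is paid only by callers that actually copy; serializers used purely for serialization never trigger it. As written, the sketch is not thread-safe.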

chaokunyang avatar Oct 24 '23 15:10 chaokunyang