
[Question] When I serialize this object, enough space is allocated, but it still throws java.lang.OutOfMemoryError: Java heap space

Open a1342772 opened this issue 1 year ago • 10 comments

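import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

import org.apache.fury.memory.MemoryBuffer;
import org.apache.fury.memory.MemoryUtils;

public class FlatStorage implements Serializable {
    private MemoryBuffer buf;
    private Map<String, int[]> featureMetadata;
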
    public FlatStorage(int bufferSize) {
        this.buf = MemoryUtils.buffer(bufferSize);
        this.featureMetadata = new HashMap<>();
    }

    public void addFeature(String name, int type, int offset, int[] shape) {
        featureMetadata.put(name, new int[]{type, offset, shape[0], shape[1]});
    }

    public MemoryBuffer getBuf() {
        return buf;
    }

    public Map<String, int[]> getFeatureMetadata() {
        return featureMetadata;
    }
}

a1342772 avatar Nov 26 '24 03:11 a1342772

@chaokunyang

a1342772 avatar Nov 26 '24 05:11 a1342772

@a1342772 Could you provide a unit test? The code you provided is just a data class.

chaokunyang avatar Nov 26 '24 06:11 chaokunyang

BTW, MemoryBuffer is used internally by Fury. It's just a wrapper for DirectBuffer/ByteBuffer/byte[]. Why do you need to serialize a Fury MemoryBuffer?

If you do need to serialize MemoryBuffer, we need to add a new Serializer for it too.

chaokunyang avatar Nov 26 '24 06:11 chaokunyang

Another question is how we would serialize MemoryBuffer. MemoryBuffer has a readerIndex: do we write only the data between readerIndex and size, or serialize the whole buffer?
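
To make the two options concrete, here is a rough sketch that uses java.nio.ByteBuffer as a stand-in, not Fury's MemoryBuffer. It assumes ByteBuffer's position and limit play roughly the role of readerIndex and size. The class and method names are hypothetical and only illustrate the trade-off.

```java
import java.nio.ByteBuffer;

// Stand-in sketch: java.nio.ByteBuffer is used here, NOT Fury's MemoryBuffer.
// position/limit are assumed to be analogous to readerIndex/size.
class BufferSliceSketch {
  // Option A: copy only the unread bytes between position and limit.
  static byte[] readableBytes(ByteBuffer buf) {
    ByteBuffer view = buf.duplicate(); // shares content, has its own indices
    byte[] out = new byte[view.remaining()];
    view.get(out);
    return out;
  }

  // Option B: copy the whole buffer, ignoring the current indices.
  static byte[] wholeBuffer(ByteBuffer buf) {
    ByteBuffer view = buf.duplicate();
    view.clear(); // position = 0, limit = capacity; the data is not erased
    byte[] out = new byte[view.remaining()];
    view.get(out);
    return out;
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(8);
    buf.put(new byte[] {1, 2, 3, 4, 5, 6});
    buf.flip(); // position = 0, limit = 6
    buf.get();
    buf.get();  // two bytes consumed, position = 2
    System.out.println(readableBytes(buf).length); // 4
    System.out.println(wholeBuffer(buf).length);   // 8
  }
}
```

Option A gives a smaller payload but drops the bytes that were already read. Option B preserves everything, but the indices would have to be written separately if the receiver needs them.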

chaokunyang avatar Nov 26 '24 06:11 chaokunyang

@chaokunyang Oh, I see. How does Fury perform with arrays? I want to replace MemoryBuffer with arrays.
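
For illustration, an array-backed variant of the class from the question might look like the sketch below. Everything new here is an assumption: the class name ArrayFlatStorage, the putFloat/getFloat accessors, and the roundTrip helper are hypothetical. The round trip uses plain JDK serialization only to keep the sketch free of dependencies. It does not show Fury's behavior or performance.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant of FlatStorage that stores data in a byte[]
// instead of a Fury MemoryBuffer.
class ArrayFlatStorage implements Serializable {
  private static final long serialVersionUID = 1L;

  private final byte[] data;
  private final Map<String, int[]> featureMetadata = new HashMap<>();

  ArrayFlatStorage(int bufferSize) {
    this.data = new byte[bufferSize];
  }

  void addFeature(String name, int type, int offset, int[] shape) {
    featureMetadata.put(name, new int[] {type, offset, shape[0], shape[1]});
  }

  // Hypothetical accessors: ByteBuffer.wrap gives typed access without copying.
  void putFloat(int offset, float value) {
    ByteBuffer.wrap(data).putFloat(offset, value);
  }

  float getFloat(int offset) {
    return ByteBuffer.wrap(data).getFloat(offset);
  }

  byte[] getData() {
    return data;
  }

  Map<String, int[]> getFeatureMetadata() {
    return featureMetadata;
  }

  // JDK serialization round trip, used only to keep the sketch dependency-free.
  static ArrayFlatStorage roundTrip(ArrayFlatStorage storage) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(storage);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (ArrayFlatStorage) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("round trip failed", e);
    }
  }

  public static void main(String[] args) {
    ArrayFlatStorage storage = new ArrayFlatStorage(16);
    storage.addFeature("embedding", 1, 4, new int[] {2, 3});
    storage.putFloat(4, 1.5f);
    System.out.println(roundTrip(storage).getFloat(4)); // 1.5
  }
}
```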

a1342772 avatar Nov 26 '24 07:11 a1342772

What do you mean by how Fury performs with arrays?

chaokunyang avatar Nov 26 '24 07:11 chaokunyang

yes @chaokunyang

a1342772 avatar Nov 26 '24 07:11 a1342772

@a1342772 I don't quite understand what you mean. Could you give more details about what you mean by how Fury performs with arrays?

chaokunyang avatar Nov 26 '24 07:11 chaokunyang

I mean compared to Protobuf: the speed of serialization and deserialization, as well as the compression ratio.

a1342772 avatar Nov 26 '24 07:11 a1342772

@a1342772 Fury supports zero-copy serialization of primitive arrays, so there is no cost for serializing such objects, and of course no compression: the serialized size of an array will be n_elements * size_of(element_type).
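
As a quick check of that size formula, the sketch below works out the raw payload size for a few primitive arrays in plain Java. The class and method names are hypothetical and nothing here calls Fury. The numbers cover only the element data and assume no header or length prefix is counted.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of Fury: raw payload arithmetic only.
class RawArraySize {
  // n_elements * size_of(element_type) for a few primitive array types.
  static long rawBytes(byte[] a) {
    return (long) a.length * Byte.BYTES;
  }

  static long rawBytes(int[] a) {
    return (long) a.length * Integer.BYTES;
  }

  static long rawBytes(double[] a) {
    return (long) a.length * Double.BYTES;
  }

  // Copies an int[] into a plain byte buffer to confirm the arithmetic.
  static byte[] toRawBytes(int[] a) {
    ByteBuffer buf =
        ByteBuffer.allocate(a.length * Integer.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    buf.asIntBuffer().put(a);
    return buf.array();
  }

  public static void main(String[] args) {
    System.out.println(rawBytes(new byte[1000]));        // 1000
    System.out.println(rawBytes(new int[100]));          // 400
    System.out.println(rawBytes(new double[100]));       // 800
    System.out.println(toRawBytes(new int[100]).length); // 400
  }
}
```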

You can use zero-copy serialization as described at https://fury.apache.org/docs/guide/java_object_graph_guide#zero-copy-serialization:

import org.apache.fury.*;
import org.apache.fury.config.*;
import org.apache.fury.serializer.BufferObject;
import org.apache.fury.memory.MemoryBuffer;

import java.util.*;
import java.util.stream.Collectors;

public class ZeroCopyExample {
  // Note that the Fury instance should be reused instead of being created every time.
  static Fury fury = Fury.builder()
    .withLanguage(Language.JAVA)
    .build();

  // mvn exec:java -Dexec.mainClass="io.ray.fury.examples.ZeroCopyExample"
  public static void main(String[] args) {
    List<Object> list = Arrays.asList("str", new byte[1000], new int[100], new double[100]);
    Collection<BufferObject> bufferObjects = new ArrayList<>();
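    // The callback collects each buffer object; returning false marks it as out-of-band.
    byte[] bytes = fury.serialize(list, e -> !bufferObjects.add(e));
    // For an in-process round trip, convert each BufferObject to a MemoryBuffer.
    // To ship the buffers elsewhere, use buf.writeTo(...) instead.
    List<MemoryBuffer> buffers = bufferObjects.stream()
      .map(BufferObject::toBuffer).collect(Collectors.toList());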

    System.out.println(fury.deserialize(bytes, buffers));
  }
}

chaokunyang avatar Nov 26 '24 16:11 chaokunyang