
Inconsistent results between list serialization and deserialization

Open musicaudience opened this issue 6 years ago • 2 comments

Below is a snapshot of the run output. It shows that the list differs before serialization and after deserialization. Could anyone tell me the root cause? Thanks a ton. [image]

Here is the code.

`import io.protostuff.LinkedBuffer;
import io.protostuff.ProtostuffIOUtil;
import io.protostuff.Schema;
import io.protostuff.runtime.RuntimeSchema;
import org.springframework.objenesis.Objenesis;
import org.springframework.objenesis.ObjenesisStd;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Test {
public static void main(String[] args) {
    testMethod_01();
    testMethod_02();
}

// This method seems OK: B is serialized and deserialized directly.
private static void testMethod_01(){
    B b = new B();

    System.out.println("testMethod_01: before serialize");
    for(String item : b.list()){
        System.out.println(item);
    }

    byte[] test=SerializationUtil.serialize(b);
    B newB=SerializationUtil.deserialize(test, B.class);

    System.out.println("testMethod_01: after deserialize");
    for(String item : newB.list()){
        System.out.println(item);
    }

    System.out.println();
}

// This method is the frustrating one: the object nested in A is not consistent after deserialization.
private static void testMethod_02(){
    A a=new A(new B());

    System.out.println("testMethod_02: before serialize");

    for(String item : a.get().list()){
        System.out.println(item);
    }

    byte[] test=SerializationUtil.serialize(a);

    A newA=SerializationUtil.deserialize(test, A.class);

    System.out.println("testMethod_02: after deserialize");
    for(String item : newA.get().list()){
        System.out.println(item);
    }

    System.out.println();
}

public static class A{
    B obj;

    public A(B obj) {
        this.obj = obj;
    }

    public B get() {
        return obj;
    }
}

public static class B {
    private List<String> list;

    public B() {
        list = new ArrayList<String>();
        list.add("item in list");
    }

    public List<String> list() {
        return list;
    }
}

/**
 * Protostuff serialization util.
 */
public static class SerializationUtil {
    private static Map<Class<?>, Schema<?>> cachedSchema = new ConcurrentHashMap<Class<?>, Schema<?>>();
    private static Objenesis objenesis = new ObjenesisStd(true);

    private static <T> Schema<T> getSchema(Class<T> cls) {
        Schema<T> schema = (Schema<T>) cachedSchema.get(cls);
        if (schema == null) {
            schema = RuntimeSchema.createFrom(cls);
            if (schema != null) {
                cachedSchema.put(cls, schema);
            }
        }
        return schema;
    }

    public static <T> byte[] serialize(T obj) {
        Class<T> cls = (Class<T>) obj.getClass();
        LinkedBuffer buffer = LinkedBuffer.allocate(LinkedBuffer.DEFAULT_BUFFER_SIZE);
        try {
            Schema<T> schema = getSchema(cls);
            return ProtostuffIOUtil.toByteArray(obj, schema, buffer);
        } catch (Exception e) {
            throw new IllegalStateException(e.getMessage(), e);
        } finally {
            buffer.clear();
        }
    }

    public static <T> T deserialize(byte[] data, Class<T> cls) {
        try {
            T message = (T) objenesis.newInstance(cls);
            Schema<T> schema = getSchema(cls);
            ProtostuffIOUtil.mergeFrom(data, message, schema);
            return message;
        } catch (Exception e) {
            throw new IllegalStateException(e.getMessage(), e);
        }
    }
}

}`

musicaudience · Jan 08 '18 08:01

@musicaudience I think there are two ways to solve the problem:

  1. Add the @Morph annotation.
public static class B {
    @Morph
    private List<String> list;

    public B() {
        list = new ArrayList<String>();
        list.add("item in list");
    }

    public List<String> list() {
        return list;
    }
}
  2. Enable morph capability globally.
static {
    System.setProperty("protostuff.runtime.collection_schema_on_repeated_fields", "true");
    System.setProperty("protostuff.runtime.morph_collection_interfaces", "true");
}

or

static {
    System.setProperty("protostuff.runtime.pojo_schema_on_collection_fields", "true");
}

@dyu I think both solutions are workarounds, because they make the byte array bigger and make it incompatible with protobuf.

With morph enabled, the collection or map is created by a factory method, so it starts out empty. By default, RuntimeRepeatedFieldFactory and RuntimeCollectionFieldFactory use reflection to read the instance field. If the field is null, a new collection is created. If the field is not null, because it was initialized either in the constructor or by a default value assignment, the deserialized values are merged into it. The default values therefore end up next to the deserialized ones, which is not expected.

The cause can be found from the below snippet:

RuntimeCollectionFieldFactory@88

@Override
protected void mergeFrom(Input input, T message) throws IOException
{
    accessor.set(message, input.mergeObject(
            accessor.<Collection<Object>>get(message), schema));
}

RuntimeRepeatedFieldFactory@67

@Override
protected void mergeFrom(Input input, T message) throws IOException
{
    final Object value = inline.readFrom(input);
    Collection<Object> existing = accessor.get(message);
    if (existing == null)
        accessor.set(message, existing = messageFactory.newMessage());

    existing.add(value);
}
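The effect of these two mergeFrom implementations can be reproduced without protostuff. The following is a minimal plain-Java sketch, not the library's code: `MergeSemanticsDemo`, `mergeValue`, `mergeIntoBlank` and `mergeIntoConstructed` are hypothetical names, and the sketch assumes (as described above) that the target object sometimes reaches the merge step with its list already populated by the constructor.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain-Java simulation of "merge into the existing field"; no protostuff classes are used.
class MergeSemanticsDemo {

    // Mirrors class B from the issue: the constructor pre-populates the list.
    static class B {
        List<String> list;

        B() {
            list = new ArrayList<>();
            list.add("item in list");
        }
    }

    // Hypothetical stand-in for the repeated-field mergeFrom quoted above:
    // reuse the existing collection if there is one, otherwise create it,
    // then append the decoded value.
    static void mergeValue(B message, String decoded) {
        if (message.list == null) {
            message.list = new ArrayList<>();
        }
        message.list.add(decoded);
    }

    // Simulates a target allocated without running its constructor
    // (the field is reset to null to imitate that state).
    static B mergeIntoBlank(List<String> decodedValues) {
        B target = new B();
        target.list = null;
        for (String value : decodedValues) {
            mergeValue(target, value);
        }
        return target;
    }

    // Simulates a target that was created through its constructor before merging.
    static B mergeIntoConstructed(List<String> decodedValues) {
        B target = new B();
        for (String value : decodedValues) {
            mergeValue(target, value);
        }
        return target;
    }

    public static void main(String[] args) {
        List<String> wire = Collections.singletonList("item in list");
        System.out.println(mergeIntoBlank(wire).list);       // [item in list]
        System.out.println(mergeIntoConstructed(wire).list); // [item in list, item in list]
    }
}
```

The second call shows the duplicated entry: the value written by the constructor stays in the list, and the decoded value is appended to it.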

Do you think this is expected? I can see the reasoning: the method is called mergeFrom, which implies that the existing state of the object is retained, so the behavior makes sense. But the behavior changes depending on which options are enabled, which is confusing.

One idea is to add a field annotation that indicates whether to create an empty collection or map on deserialization. But that would introduce state, which is hard to handle. What do you think of this solution?

neoremind · Jan 25 '19 07:01

Because merging is a common use-case of protobuf, the behavior is always to merge an existing field.

If you use this purely for Java serialization (no compatibility with protobuf), set the following to true:

protostuff.runtime.always_use_sun_reflection_factory
protostuff.runtime.morph_collection_interfaces
protostuff.runtime.morph_map_interfaces
protostuff.runtime.morph_non_final_pojos
protostuff.runtime.preserve_null_elements (activates protostuff.runtime.collection_schema_on_repeated_fields)
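
One way to apply the properties listed above is to set them programmatically at startup. This is only a sketch: `ProtostuffRuntimeFlags` and `enableJavaOnlyFlags` are hypothetical names, and it assumes that protostuff reads these system properties once, when its runtime classes are first loaded, so the call has to happen before any schema is created. Passing the same keys as `-D` JVM options should be equivalent.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that sets the properties named in the comment above.
// It only touches JVM system properties and uses no protostuff classes.
class ProtostuffRuntimeFlags {

    // Property keys copied from the list above.
    static final List<String> JAVA_ONLY_FLAGS = Arrays.asList(
            "protostuff.runtime.always_use_sun_reflection_factory",
            "protostuff.runtime.morph_collection_interfaces",
            "protostuff.runtime.morph_map_interfaces",
            "protostuff.runtime.morph_non_final_pojos",
            "protostuff.runtime.preserve_null_elements");

    // Assumption: call this before the first protostuff runtime class is used,
    // for example at the very start of main().
    static void enableJavaOnlyFlags() {
        for (String flag : JAVA_ONLY_FLAGS) {
            System.setProperty(flag, "true");
        }
    }

    public static void main(String[] args) {
        enableJavaOnlyFlags();
        for (String flag : JAVA_ONLY_FLAGS) {
            System.out.println(flag + "=" + System.getProperty(flag));
        }
    }
}
```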

dyu · May 24 '20 11:05