Fixed #334: Sort attributes before computing the hashcode
Sort attributes by attribute key
You are right, in getAttributesFiltered there's no real need for sorted attributes and TreeMap is safer than ConcurrentHashMap .
Add a new method int getHashCodeByAttributes(ConcurrentMap attributes) when get hashcode is the best way to solute this question.
But how to get hashcode? Convert ConcurrentHashMap to TreeMap and Serialize the TreeMap again ?
Without further checking I'd think that instead of final ConcurrentMap<String, Object> attributes = _session.getAttributesFiltered(); this would be sufficient: final Map<String, Object> attributes = new TreeMap(_session.getAttributesFiltered());.
I meant i don't know how to get hashcode by sorted attributes. Can you give me some advice?
Can't you just pass the sorted attributes to serializeAttributes to get back attributesData?
ConcurrentMap is be required in serializeAttributes method, if the attributes change to Map , serializeAttributes method is need to modify?
You're right, we'd have to change serializeAttributes and everything "below" (descendent methods) to accept a Map. We'd also have to make sure that this sorted map is actually serialized, therefore it might be a good idea to change serializeAttributes etc to accept a SortedMap to make the contract more explicit (alternatively the javadoc should mention that SortedMaps must be handled accordingly)
I stepped a bit through the code, and AFAICS the only msm external modification is needed for kryo-serializer's CopyForIterateMapSerializer, which the KryoTranscoder uses if msm is configured with copyCollectionsForSerialization=true. The CopyForIterateMapSerializer would create a new HashMap from the provided map and therefore would not have entries ordered deterministically.
The JavolutionTranscoder and XStreamTranscoder differ from other transcoders in that they serialize the class name of the attributes map (previously ConcurrentHashMap, then TreeMap), which means that these transcoders have to deserialize the original map and create a ConcurrentHashMap from the deserialized map.
We could also change deserializeAttributes to return a Map only, then implementations have more freedom. The TranscoderService.deserialize method could then check if a ConcurrentMap was returned from deserializeAttributes and if not create a ConcurrentHashMap from the returned map.
So the change seems to be a bit more involved... ;-)
This is my worry. Are there any other solutions?
I've not yet clearly understood the original observation: why the same session data was serialized differently. If we know what exactly caused this there might be another solution or workaround.
Another solution could be to use a ConcurrentSkipListMap for attributes instead of ConcurrentHashMap. This would require also some changes to replace previous usages of CHM, but it should be significantly fewer.
There is a test case:
public class MapSerializerTest {
private Kryo _kryo;
@BeforeTest
public void setUp() throws Exception {
_kryo = new Kryo();
MapSerializer.registerSerializers(_kryo);
}
@Test
public void testDeserialize() throws Exception {
ConcurrentMap<String ,Object> obj = new ConcurrentHashMap<String ,Object>();
obj.put("iccid","12345");
obj.put("cityCode","heibei");
final byte[] serializedByObj = serialize(_kryo, obj);
final ConcurrentMap deserialized = deserialize(_kryo, serializedByObj, ConcurrentHashMap.class);
// that is ok
assertEquals(obj,deserialized);
final byte[] serializedByDeserialized = serialize(_kryo, deserialized);
// serializedByObj not equals serializedByDeserialized
assertNotEquals(serializedByObj,serializedByDeserialized);
// copy deserialized obj to a new obj
ConcurrentMap<String ,Object> obj2 = new ConcurrentHashMap<String ,Object>();
for (Object key : deserialized.keySet()) {
obj2.put((String) key,deserialized.get(key));
}
assertEquals(obj2,deserialized);
// serialize obj2
final byte[] serializedByObj2 = serialize(_kryo, obj2);
// compare serializedByObj with serializedByObj2
assertEquals(serializedByObj,serializedByObj2);
}
}
I guess the problem caused by the deserialized obj, so i used a copy of deserialized attributes to compute hascode in my solution.
Sorry, I can't follow you. Can you please rephrase what you want to say?
The same data serialized differently. Maybe it's because of that the deserialized data is not deep equals with the original data. When I run this test:
@Test
public void testDeserialize() throws Exception {
ConcurrentMap<String ,Object> obj = new ConcurrentHashMap<String ,Object>();
obj.put("iccid","12345");
obj.put("cityCode","heibei");
final byte[] serializedByObj = de.javakaffee.kryoserializers.KryoTest.serialize(_kryo, obj);
final ConcurrentMap deserialized = de.javakaffee.kryoserializers.KryoTest.deserialize(_kryo, serializedByObj, ConcurrentHashMap.class);
de.javakaffee.kryoserializers.KryoTest.assertEquals(obj,deserialized);
de.javakaffee.kryoserializers.KryoTest.assertDeepEquals(obj,deserialized);
}
I got the error:
java.lang.AssertionError: expected [iccid] but found [cityCode]
Expected :iccid
Actual :cityCode
That's what we/you observed originally, and therfore is expected, isn't it?
This should not happen with ConcurrentSkipListMap.
Yes, this not happen with ConcurrentSkipListMap, but , we must to change more...
So, can we keep the same order to put attributes into ConcurrentHashMap?
When we put the attributes into ConcurrentHashMap with the different order of keys, we will get the different serialized data, sometimes we will get the same data also.
If we put the attributes into ConcurrentHashMap with the same order of keys. we will get the same serialized data, is it right?