fast-serialization

java.lang.NegativeArraySizeException

Open haochun opened this issue 9 years ago • 8 comments

java.lang.NegativeArraySizeException
    at java.util.Arrays.copyOf(Arrays.java:2271)
    at org.nustaq.serialization.util.FSTOutputStream.grow(FSTOutputStream.java:92)
    at org.nustaq.serialization.util.FSTOutputStream.ensureFree(FSTOutputStream.java:71)
    at org.nustaq.serialization.coders.FSTStreamEncoder.writeStringUTF(FSTStreamEncoder.java:302)
    at org.nustaq.serialization.FSTObjectOutput.writeObjectWithContext(FSTObjectOutput.java:392)
    at org.nustaq.serialization.FSTObjectOutput.writeObjectInternal(FSTObjectOutput.java:319)
    at org.nustaq.serialization.serializers.FSTMapSerializer.writeObject(FSTMapSerializer.java:48)
    at org.nustaq.serialization.FSTObjectOutput.writeObjectWithContext(FSTObjectOutput.java:455)
    at org.nustaq.serialization.FSTObjectOutput.writeObjectInternal(FSTObjectOutput.java:319)
    at org.nustaq.serialization.FSTObjectOutput.writeObject(FSTObjectOutput.java:284)
    at org.nustaq.serialization.FSTObjectOutput.writeObject(FSTObjectOutput.java:193)
    ...

I get this exception when I use FST to serialize a HashMap<String,String>. I guess it is because the map is too big.

haochun avatar Jul 14 '15 05:07 haochun

But when I test with a single entry from the map there is no exception, so I guess the root cause is that FST tries to put the whole map into one byte[], and the map is too large.

haochun avatar Jul 14 '15 05:07 haochun

The current limit on serialized object size is around 1.3GB (int range). Is your data that big? If so, you need to somehow split up the object graph and write separate, smaller objects.

However, it's possible to hit the limit through incorrect usage, so if your data is not expected to be that big, I need a sample of how you use FST (and which version).
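
(For context, a minimal sketch of the kind of whole-graph call that hits this cap — conf, hugeMap and loadHugeMap() are illustrative names, not taken from the report above:)

FSTConfiguration conf = FSTConfiguration.createDefaultConfiguration();

HashMap<String, String> hugeMap = loadHugeMap();   // hypothetical loader for the big map

// asByteArray encodes the whole map into one internal byte buffer; when that
// buffer has to grow beyond the int range, the new size overflows to a
// negative value and Arrays.copyOf fails with NegativeArraySizeException.
byte[] blob = conf.asByteArray(hugeMap);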

RuedigerMoeller avatar Jul 14 '15 07:07 RuedigerMoeller

My data is more than 1.3GB, so I think that is the cause. I am using the JDK's ObjectOutputStream to serialize it now; the file is 3GB, but ObjectOutputStream is slow. How can I raise the limit?

haochun avatar Jul 14 '15 09:07 haochun

It's caused by the int index limit; I cannot fix that. You might try to split up the graph like this:

// dataOut is a DataOutputStream wrapping the target FileOutputStream;
// fstConf is the shared FSTConfiguration instance.
for (Map.Entry<String, String> entry : myHugeMap.entrySet()) {

     byte[] key = fstConf.asByteArray( entry.getKey() );
     byte[] value = fstConf.asByteArray( entry.getValue() );

     dataOut.writeInt( key.length );
     dataOut.write( key );

     dataOut.writeInt( value.length );
     dataOut.write( value );

}
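
(For completeness, a sketch of a matching read side, assuming a DataInputStream over the same file — the names here are illustrative and not part of the suggestion above:)

// dataIn is a DataInputStream over the file written above; fstConf is the same FSTConfiguration.
HashMap<String, String> restored = new HashMap<>();
while (dataIn.available() > 0) {   // for robustness, store an entry count up front or handle EOFException instead
    byte[] keyBytes = new byte[dataIn.readInt()];
    dataIn.readFully(keyBytes);

    byte[] valueBytes = new byte[dataIn.readInt()];
    dataIn.readFully(valueBytes);

    restored.put((String) fstConf.asObject(keyBytes),
                 (String) fstConf.asObject(valueBytes));
}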

RuedigerMoeller avatar Jul 14 '15 11:07 RuedigerMoeller

You mean I should serialize my HashMap one entry at a time?

haochun avatar Jul 14 '15 13:07 haochun

It's an annoying workaround, and it's only doable if your big hashmap is at the top level. But the array size limit itself cannot be fixed. In theory I could flush the buffer once it grows too large; however, since object references are currently hashed by their index position in the buffer, that would require major changes.

I'll mark this as enhancement for future releases.

RuedigerMoeller avatar Jul 14 '15 13:07 RuedigerMoeller

Hello, thanks for your very useful package!

If you are able to make this enhancement it would be very helpful to me as well. I am experiencing the same NegativeArraySizeException while serializing a very large hashmap used as a cache. I've attempted to write the data one element at a time like this, but the error persists:

// cache is a Map<String, TripInformation>; serialPath is the output file path.
FileOutputStream fileOut = new FileOutputStream(serialPath);
GZIPOutputStream zout = new GZIPOutputStream(new BufferedOutputStream(fileOut));
FSTObjectOutput out = new FSTObjectOutput(zout);

// Write the entry count, then each key/value pair through the same FSTObjectOutput.
out.writeObject(new Integer(cache.size()), Integer.class);
for (String key : cache.keySet()) {
    out.writeObject(key, String.class);
    out.writeObject(cache.get(key), TripInformation.class);
}

out.close();
zout.close();
fileOut.close();

My next workaround idea is to break the serialization up into separate files, but this is quite tedious. If you have other thoughts on a workaround, that would be much appreciated.
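
(One hedged way to apply the per-entry idea from earlier in this thread to a cache like this — fstConf is an assumed shared FSTConfiguration, the other names reuse the snippet above — is to give every entry its own small byte[] instead of pushing everything through one long-lived FSTObjectOutput buffer:)

DataOutputStream dataOut = new DataOutputStream(
        new GZIPOutputStream(new BufferedOutputStream(new FileOutputStream(serialPath))));

dataOut.writeInt(cache.size());
for (Map.Entry<String, TripInformation> entry : cache.entrySet()) {
    // each asByteArray call buffers only one entry, so no single buffer has to hold the whole cache
    byte[] key = fstConf.asByteArray(entry.getKey());
    byte[] value = fstConf.asByteArray(entry.getValue());

    dataOut.writeInt(key.length);
    dataOut.write(key);
    dataOut.writeInt(value.length);
    dataOut.write(value);
}
dataOut.close();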

Thanks again!

colinsheppard avatar Nov 07 '16 17:11 colinsheppard

Same problem here, with 4.6GB of data to serialize. No problem with standard serialization, but your FST is so quick on smaller examples! I hope it is possible to break the 1.3GB limit. My problem is that the objects I want to serialize are not in my code but in a third-party library, and I don't want to modify their code or split everything up.

It seems that Kryo has a solution for going over 1.3GB, but I still cannot make it work easily. Thanks! Mehdi
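
(For anyone trying the Kryo route: a minimal streaming sketch, untested against a 4.6GB graph — registration defaults differ between Kryo versions, and hugeMap is an illustrative name:)

// imports: com.esotericsoftware.kryo.Kryo, com.esotericsoftware.kryo.io.Output, java.io.FileOutputStream
Kryo kryo = new Kryo();
kryo.setRegistrationRequired(false);   // depending on the Kryo version this may already be the default

// Output flushes its fixed-size buffer to the underlying stream as it fills,
// so the serialized graph does not have to fit into a single byte[].
Output output = new Output(new FileOutputStream("huge-cache.bin"));
kryo.writeClassAndObject(output, hugeMap);
output.close();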

mehdi-kaytoue avatar May 03 '18 19:05 mehdi-kaytoue