BlazingChain
Can't Serialize
Greetings! First of all, a big thank you for providing this library. I already use it very successfully in JSQLFormatter for the URL parameters: http://jsqlformatter.manticore-projects.com/jsqlformatter/demo.html?args=-c%20MoUQMiDCAqAEsEYBQ9WoDSwExIGICUB5AWVgBMBXAQwBskB1ACRHxFitgF5YAjJAbiA
Now I also want to use it for serializing Java objects into XML, and I expected the following code to work:
public static String encodeObject(Object object) throws IOException {
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    ObjectOutput objectOutput = new ObjectOutputStream(byteArrayOutputStream);
    objectOutput.writeObject(object);
    objectOutput.flush();
    objectOutput.close();
    byteArrayOutputStream.flush();
    String s = new String(byteArrayOutputStream.toByteArray());
    return LZSEncoding.compressToBase64(s);
}
As I understand it, this should give a Base64-encoded String that I can write into the XML.
Writing works well, of course, but I get a Corrupted Stream message when deserializing.
Oddly enough, the following code works around that problem:
public static String encodeObject(Object object) throws IOException {
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    ObjectOutput objectOutput = new ObjectOutputStream(byteArrayOutputStream);
    objectOutput.writeObject(object);
    objectOutput.flush();
    objectOutput.close();
    byteArrayOutputStream.flush();
    String s = Base64.getEncoder().encodeToString(byteArrayOutputStream.toByteArray());
    return LZSEncoding.compressToBase64(s);
}
My question is: where is my understanding wrong, and why does Base64 get the encoding right while LZSEncoding does not?
What am I missing here please?
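The difference between the two versions above comes down to how the serialized bytes become a String. The following is a minimal sketch of that step only, assuming the platform default charset in the first version was UTF-8 (it is platform dependent). The class and method names are made up for illustration and are not part of BlazingChain. Decoding arbitrary binary data as UTF-8 replaces malformed byte sequences with U+FFFD, so the original bytes cannot be recovered, whereas ISO-8859-1 maps every byte to exactly one char and back.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class CharsetRoundTrip {

    // Returns true if bytes -> String -> bytes gives back the identical array.
    static boolean survives(byte[] data, Charset charset) {
        String asText = new String(data, charset);
        return Arrays.equals(data, asText.getBytes(charset));
    }

    public static void main(String[] args) {
        // 0xAC 0xED 0x00 0x05 is the header that starts every Java serialization stream.
        byte[] header = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05};

        // prints "UTF-8:      false" because 0xAC and 0xED are not valid UTF-8 here
        System.out.println("UTF-8:      " + survives(header, StandardCharsets.UTF_8));
        // prints "ISO-8859-1: true" because every byte maps to one char and back
        System.out.println("ISO-8859-1: " + survives(header, StandardCharsets.ISO_8859_1));
    }
}
```

The second version works for a related reason: Base64 output contains only ASCII characters, so it is safe to treat as a String under any common charset.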
Sorry to bother you, I figured out that the serialization round trip only works with StandardCharsets.ISO_8859_1.
However, now I am even more confused: the plain Base64 encoder returns shorter Strings than LZSEncoding?!
@Test
public void testSerialization() throws IOException, ClassNotFoundException {
    Object object = new BigDecimal("2345.287272");
    ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    ObjectOutput objectOutput = new ObjectOutputStream(byteArrayOutputStream);
    objectOutput.writeObject(object);
    objectOutput.flush();
    objectOutput.close();
    byteArrayOutputStream.flush();
    String serializedObjectStr = new String(byteArrayOutputStream.toByteArray(), StandardCharsets.ISO_8859_1);
    String lzsEncodedBase64 = LZSEncoding.compressToBase64(serializedObjectStr);
    String base64Encoded = Base64.getEncoder().encodeToString(byteArrayOutputStream.toByteArray());
    // Why is Base64 Encoder more efficient?!
    System.out.println(serializedObjectStr + "\n"
            + lzsEncodedBase64 + "\n"
            + base64Encoded);
    // verify the ISO_8859_1 String round trip (the Base64 String itself is not decoded here)
    byte[] bytes = serializedObjectStr.getBytes(StandardCharsets.ISO_8859_1);
    ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(bytes);
    ObjectInputStream objectInputStream = new ObjectInputStream(byteArrayInputStream);
    Assertions.assertEquals(object, objectInputStream.readObject());
    objectInputStream.close();
    byteArrayInputStream.close();
    // verify LZSEncoder
    bytes = LZSEncoding.decompressFromBase64(lzsEncodedBase64).getBytes(StandardCharsets.ISO_8859_1);
    byteArrayInputStream = new ByteArrayInputStream(bytes);
    objectInputStream = new ObjectInputStream(byteArrayInputStream);
    Assertions.assertEquals(object, objectInputStream.readObject());
    objectInputStream.close();
    byteArrayInputStream.close();
}
The Base64 encoder is likely more efficient here because you can't keep compressing a sequence of bits and keep seeing smaller and smaller sizes; the serialized representation of a BigDecimal is probably not easily compressible (it may lack repetitive bit sequences). I didn't write the LZ-String algorithm, so I don't know exactly how much overhead it requires, but some small amount of extra data needs to be in the compressed String so it can be decompressed correctly. That overhead might push the BlazingChain output to a larger size, though I'll run the tests you provided and see how much bigger it gets. The Lempel-Ziv compression schemes, from which LZ-String descends, are dictionary compression algorithms: they compress commonly found "words" or "phrases" (in bits or bytes) to shorter representations than rarely encountered ones.
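To illustrate the general point about dictionary compression, here is a rough sketch that uses java.util.zip.Deflater from the JDK as a stand-in. Deflater is a different Lempel-Ziv descendant, not LZ-String, so the exact sizes will not match what BlazingChain produces; the class name and inputs are made up for the example. It only shows the pattern: repetitive input shrinks a lot, while a very short input comes out larger than it went in because of the fixed overhead.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical demo class; Deflater stands in for LZ-String here.
class DictionaryCompressionDemo {

    // Compresses the input with DEFLATE and returns the compressed size in bytes.
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[1024];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Builds a byte array containing the same ASCII phrase repeated 'times' times.
    static byte[] repeated(String phrase, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(phrase);
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] repetitive = repeated("new BigDecimal(\"2345.287272\"), ", 20);
        byte[] tiny = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05};

        // Repeated phrases are replaced by short back-references, so this shrinks.
        System.out.println(repetitive.length + " bytes -> " + deflatedSize(repetitive));
        // Four bytes have nothing to repeat, and the format overhead makes the output larger.
        System.out.println(tiny.length + " bytes -> " + deflatedSize(tiny));
    }
}
```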
The output of that latest test, for convenience:
Checking data: 2345.287272
¬í sr java.math.BigDecimalTÇWù(O I scaleL intValt Ljava/math/BigInteger;xr java.lang.Number¬à xp sr java.math.BigIntegerü©;û IbitCountI bitLengthI firstNonzeroByteNumI lowestSetBitI signum[ magnitudet [Bxq ~ ÿÿÿÿÿÿÿÿÿÿÿþÿÿÿþ ur [B¬óøTà xp Ê>hxx
Has length 294
DUW4AAoAzgTmAoArAhgN2QOgLbIC4AsMAhASwHMARAUwGMScAbAFQHGBUAdQE+BAgCgDyAYDAAgAJKQoNZAyoAZMADASAO1wA1WbjAA0eSnQB6HASOky49VTJUYAbgAecAAiHMDZKrIYAcgFcsACM7ADDgAFSAXABoABSAA4Bo0TAwRwAHNOVYBHdsPEILK1wbOwAYgB+AfIB8AEr7AG+okSVJAAggklwAYQB7f3VJAEhu3HkqbwJJAGQAMxIYKFxfftUALzt+ogBPUoCsSQAYBn6AdyoVgGUqXFJcSSUoclVAgG0wEZwyVR7/AAmdzE7yIjgAjmAAH5iAD/8IRiNhAD+EcjsgBAfxwUSg4AAZ4A6AAPpQdJgJVLpLJpAAgSQAUwA+fCORxAA===
Has length 400
rO0ABXNyABRqYXZhLm1hdGguQmlnRGVjaW1hbFTHFVf5gShPAwACSQAFc2NhbGVMAAZpbnRWYWx0ABZMamF2YS9tYXRoL0JpZ0ludGVnZXI7eHIAEGphdmEubGFuZy5OdW1iZXKGrJUdC5TgiwIAAHhwAAAABnNyABRqYXZhLm1hdGguQmlnSW50ZWdlcoz8nx+pO/sdAwAGSQAIYml0Q291bnRJAAliaXRMZW5ndGhJABNmaXJzdE5vbnplcm9CeXRlTnVtSQAMbG93ZXN0U2V0Qml0SQAGc2lnbnVtWwAJbWFnbml0dWRldAACW0J4cQB+AAL///////////////7////+AAAAAXVyAAJbQqzzF/gGCFTgAgAAeHAAAAAEi8o+aHh4
Has length 392
Checking data: [2345.287272, 23452.87272, 234528.7272, 2345287.272]
¬í sr java.util.Arrays$ArrayListÙ¤<¾ÍÒ [ at [Ljava/lang/Object;xpur [Ljava.math.BigDecimal;Hókr7 < xp sr java.math.BigDecimalTÇWù(O I scaleL intValt Ljava/math/BigInteger;xr java.lang.Number¬à xp sr java.math.BigIntegerü©;û IbitCountI bitLengthI firstNonzeroByteNumI lowestSetBitI signum[ magnitudet [Bxq ~ ÿÿÿÿÿÿÿÿÿÿÿþÿÿÿþ ur [B¬óøTà xp Ê>hxxsq ~ sq ~ ÿÿÿÿÿÿÿÿÿÿÿþÿÿÿþ uq ~ Ê>hxxsq ~ sq ~ ÿÿÿÿÿÿÿÿÿÿÿþÿÿÿþ uq ~ Ê>hxxsq ~ sq ~ ÿÿÿÿÿÿÿÿÿÿÿþÿÿÿþ uq ~ Ê>hxx
Has length 563
DUW4AAoAzgTmBYArAhgN2QOgK4BcCWANhgIIwzICeUAJKeRQDJ5Q4CbAJQDwB9AswBEAwAEsAgMAEAA2pOQ4wAZCkMU6APQFkAOwDmagPIAjRAFMAxjgDcADwAOWOAHRlqzAFs5ACwwAhPDoARczwPAksACQBngGsYAGiAdgBITnEwOzBMgBBYMAAUVwwPHG8/QODQgBUAcYBUAHUAT4BAgAp9AGAwUQBJSCgzZAITBjBBPC0cADUh+QA0FTRkNWLPNTKeyZMdExgbOAAEQs1dDAA5LDdDXYAw4ABUgFwAaAAUgAO4tIzMwVyCpZFLy+fybHDbXYAGIAPwB8gD4AErLABvx5dQR9AAQhjwOAAwgB7LCTPpJXE4BgmXQlPoKABmeBgLDOhK0AC9doSfBRwRc3H0ADAEQkAdxMLAAyiYcH4cH0/v4tJcZEkPDotHisAATGXdKQ+awARzAAD8wABwAD/NttdqtAD/bQ7MpIHPqfMBIo4AB+CLGVd7fWyurJxABTAD5PNZrFATeaIK7oAmwEl7RmnTaXZkJFhU4LQxHo7H42bIKGy+b0xm7VnHa68wWi1GY3HU0nMh0q2na3XnY38+XC9li22gA=
Has length 612
rO0ABXNyABpqYXZhLnV0aWwuQXJyYXlzJEFycmF5TGlzdNmkPL7NiAbSAgABWwABYXQAE1tMamF2YS9sYW5nL09iamVjdDt4cHVyABdbTGphdmEubWF0aC5CaWdEZWNpbWFsO0jza3KLNwk8AgAAeHAAAAAEc3IAFGphdmEubWF0aC5CaWdEZWNpbWFsVMcVV/mBKE8DAAJJAAVzY2FsZUwABmludFZhbHQAFkxqYXZhL21hdGgvQmlnSW50ZWdlcjt4cgAQamF2YS5sYW5nLk51bWJlcoaslR0LlOCLAgAAeHAAAAAGc3IAFGphdmEubWF0aC5CaWdJbnRlZ2VyjPyfH6k7+x0DAAZJAAhiaXRDb3VudEkACWJpdExlbmd0aEkAE2ZpcnN0Tm9uemVyb0J5dGVOdW1JAAxsb3dlc3RTZXRCaXRJAAZzaWdudW1bAAltYWduaXR1ZGV0AAJbQnhxAH4AB////////////////v////4AAAABdXIAAltCrPMX+AYIVOACAAB4cAAAAASLyj5oeHhzcQB+AAUAAAAFc3EAfgAJ///////////////+/////gAAAAF1cQB+AAwAAAAEi8o+aHh4c3EAfgAFAAAABHNxAH4ACf///////////////v////4AAAABdXEAfgAMAAAABIvKPmh4eHNxAH4ABQAAAANzcQB+AAn///////////////7////+AAAAAXVxAH4ADAAAAASLyj5oeHg=
Has length 752
Checking data: Arrays.asList(new BigDecimal("2345.287272"), new BigDecimal("23452.87272"), new BigDecimal("234528.7272"), new BigDecimal("2345287.272"))
’ t Arrays.asList(new BigDecimal("2345.287272"), new BigDecimal("23452.87272"), new BigDecimal("234528.7272"), new BigDecimal("2345287.272"))
Has length 144
DUW4AAoALmCRCCAnRBDAngZwHQowGQEsMoAKAOwFMB3AAgCECBzAEQoGMCBbFAGxICIATAGYALAFYsggBwB2QfP4BKADQ1KtBi3ZdeAkRMFY5CwcrUb6TVh258hY8TKzzFq9dSvbbeh4blSbkpAA
Has length 148
rO0ABXQAiUFycmF5cy5hc0xpc3QobmV3IEJpZ0RlY2ltYWwoIjIzNDUuMjg3MjcyIiksIG5ldyBCaWdEZWNpbWFsKCIyMzQ1Mi44NzI3MiIpLCBuZXcgQmlnRGVjaW1hbCgiMjM0NTI4LjcyNzIiKSwgbmV3IEJpZ0RlY2ltYWwoIjIzNDUyODcuMjcyIikp
Has length 192
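For reference, the plain Base64 lengths in the output above are fully predictable: padded Base64 always emits 4 characters per 3 input bytes, rounded up. The sketch below (class and method names are made up for illustration) checks that formula against java.util.Base64 for the three serialized sizes reported above, which equal the byte counts because ISO_8859_1 maps one byte to one char.

```java
import java.util.Base64;

// Hypothetical helper class, for illustration only.
class Base64LengthCheck {

    // Length of padded Base64 output for the given number of input bytes.
    static int paddedBase64Length(int byteCount) {
        return 4 * ((byteCount + 2) / 3);
    }

    // Length actually produced by the JDK encoder for that many (zero) bytes.
    static int actualBase64Length(int byteCount) {
        return Base64.getEncoder().encodeToString(new byte[byteCount]).length();
    }

    public static void main(String[] args) {
        // Serialized sizes taken from the test output above.
        int[] serializedSizes = {294, 563, 144};
        for (int size : serializedSizes) {
            // prints "294 -> 392 (actual 392)", "563 -> 752 (actual 752)", "144 -> 192 (actual 192)"
            System.out.println(size + " -> " + paddedBase64Length(size)
                    + " (actual " + actualBase64Length(size) + ")");
        }
    }
}
```

Against those fixed baselines of 392, 752 and 192, the LZS output above is 400, 612 and 148 characters, so in these runs it is slightly larger only for the single BigDecimal and smaller for the other two inputs.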