msgpack-java

How to reduce the parsing time of the MessagePack library when unpacking a large amount of data

Open KshitijGoMmt opened this issue 3 years ago • 11 comments

For an Android app that unpacks data from streaming APIs via the received InputStream, the parsing takes far too long.

A single line of code, the JSONObject line below, takes around 600-800 ms for 16.65 KB of data.

        ImmutableValue val = unpacker.unpackValue();
        JSONObject jsonObj = new JSONObject(val.toString());   // the slow line

How can I make this faster? It takes less than 50 ms on an iOS phone using the MPMessagePack library, which does the parsing in C internally.

Here are the things I have already tried.

  1. Tried the wrapper lib https://github.com/msgpack/msgpack-java/blob/develop/msgpack-jackson/README.md that parses the data directly into models. It's equally slow.
  2. Tried to write my own parsing, but that doesn't seem to speed things up either.
  3. Tried traversing the whole ImmutableMapValue and putting the key-values into a JSONObject. That takes just as long.

Any idea how we can optimise this step?

KshitijGoMmt, Oct 22 '20 06:10

@komamitsu Can you please shed some light on this? Is there something we are probably not doing correctly, or any other way to speed it up? It would be great if we could get anywhere close to the iOS benchmark.

KshitijGoMmt, Oct 26 '20 09:10

        ImmutableValue val = unpacker.unpackValue();

The performance of parsing depends on source data and I didn't get what kind of data this unpacker contains. Could you give me reproducible code?

komamitsu, Oct 26 '20 09:10

Thanks for the quick reply. The line ImmutableValue val = unpacker.unpackValue(); is relatively quick.

The issue is mainly with this line:

**JSONObject jsonObj=new JSONObject(val.toString());**

To be even more specific, it's the **val.toString()** call.

The val.toString() method takes more than 600 ms for ~16 KB of data.

Here are the times in milliseconds for both lines of code:

        2020-10-26 15:16:00.469 29582-2630/com.xxx.xdebug D/UnpackTime: Start: 524 ms, upacked to ImmutableValue
        2020-10-26 15:16:00.469 29582-2630/com.xxxx.debug D/UnpackTime: Start: 1655 ms, ImmutableValue val to String

Let me ask whether I can share the data, since it's private to the organisation I work for. If it's fine, I will share it in a while.

KshitijGoMmt, Oct 26 '20 10:10

Please use this URL to check the conversion from ImmutableValue to String: *removed URL since it was private to the organisation*

KshitijGoMmt, Oct 26 '20 10:10

I parsed the data with the following code and couldn't reproduce the performance issue, since it took only about 60 ms:

        byte[] bytes = Files.readAllBytes(Paths.get("/Users/tmp/aaa/data"));
        long start;
        ImmutableValue value;
        {
            // Time the unpacking of the raw bytes into an ImmutableValue
            start = System.currentTimeMillis();
            MessageUnpacker unpacker = MessagePack.newDefaultUnpacker(bytes);
            value = unpacker.unpackValue();
            System.out.println(System.currentTimeMillis() - start);     // >>> 60
        }
        {
            // Time the conversion of the ImmutableValue to its JSON string form
            start = System.currentTimeMillis();
            String json = value.toString();
            System.out.println(System.currentTimeMillis() - start);     // >>> 61
            System.out.println("The head of JSON data: " + json.substring(0, 12));    // >>> "The head of JSON data: {"o":[{"id":"
            System.out.println("The length of JSON data: " + json.length());    // >>> "The length of JSON data: 91807"
        }

BTW, JSONObject isn't a part of msgpack-java, so you'd probably better ask the developer of that library about the performance of JSONObject.
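
Single-shot timings like the ones in this thread also include class loading and JIT warm-up, so they can overstate the steady-state cost. A minimal sketch of a repeat-and-take-the-median timer follows. `RepeatTimer` and `medianNanos` are hypothetical names used only for illustration (they are not part of msgpack-java), and the workload in `main` is a stand-in for a call such as `value.toString()`.

```java
import java.util.Arrays;
import java.util.function.Supplier;

final class RepeatTimer {
    // Hypothetical helper: runs the task `warmup` times untimed, then `runs` times timed,
    // and returns the median elapsed time in nanoseconds.
    static long medianNanos(Supplier<?> task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.get();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.get();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) {
        // Stand-in workload; in the thread this would be () -> value.toString().
        StringBuilder sink = new StringBuilder();
        long median = medianNanos(() -> {
            sink.setLength(0);
            for (int i = 0; i < 10_000; i++) {
                sink.append(i);
            }
            return sink.toString();
        }, 5, 21);
        System.out.println("median ms: " + median / 1_000_000.0);
    }
}
```

Comparing the first sample with the median on both the desktop and the device would show how much of the gap is start-up cost and how much is steady-state work.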

komamitsu, Oct 26 '20 11:10

First of all what machine are you using to parse it?

  1. Is it a desktop
  2. Or is it a mobile?

In my case I have been using this lib on an Android mobile device, and I suspect that could be one reason for the mismatch between your timings and mine.

For the logs I shared above, my code looked something like this:

                TimingLogger logger = new TimingLogger("UnpackTime", "Start");

                String responseString = val.toString();
                logger.addSplit("ImmutableValue val to String");

                JSONObject object = new JSONObject(responseString);
                logger.addSplit(" String parsed to Platform JSON");

                logger.dumpToLog();

which results in this:

        2020-10-26 15:08:39.934 29582-32399/com.xx.debug D/UnpackTime: Start: 762 ms, ImmutableValue val to String
        2020-10-26 15:08:39.934 29582-32399/com.xx.debug D/UnpackTime: Start: 94 ms, String parsed to platform JSON

  1. So do you think running on an Android device could be the reason for this degradation in speed?
  2. Also, can you tell me the specs of the machine you used to parse this response?
  3. Reading from byte[] was indeed faster. But since my data does not come from a file, reading from an InputStream into a byte[] gave me corrupt data. Do I need to apply any specific encoding/decoding while converting from InputStream to byte[]?

KshitijGoMmt, Oct 26 '20 11:10

Ah, msgpack-java uses sun.misc.Unsafe for performance optimization, and that class isn't supported on Android. So msgpack-java on Android uses a non-performance-optimized implementation. https://github.com/msgpack/msgpack-java/blob/bef0ccdb9b1be70c533f81e0c459fdef6f578cd2/msgpack-core/src/main/java/org/msgpack/core/buffer/MessageBuffer.java#L87-L92 As you suggested in question 1, I guess it may be related.

For 3, there are some options like https://stackoverflow.com/questions/1264709/convert-inputstream-to-byte-array-in-java
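
A common cause of "corrupt" bytes in this situation is routing the data through a String or Reader (which applies a charset), or assuming that a single read() call fills the buffer. A minimal sketch of a raw copy loop that avoids both, assuming a hypothetical helper named `StreamBytes.readAll` (not part of msgpack-java):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class StreamBytes {
    // Hypothetical helper: drains the stream into a byte[] with no charset conversion.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // read() may return fewer bytes than requested, so loop until it reports end of stream (-1).
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // MessagePack encoding of {"a": 1, "b": true}: fixmap(2), fixstr "a", 1, fixstr "b", true.
        byte[] payload = {(byte) 0x82, (byte) 0xA1, 'a', 0x01, (byte) 0xA1, 'b', (byte) 0xC3};
        byte[] copy = readAll(new ByteArrayInputStream(payload));
        System.out.println(Arrays.equals(payload, copy)); // prints "true"
        // The resulting array can be handed to MessagePack.newDefaultUnpacker(bytes),
        // as in the snippet earlier in this thread.
    }
}
```

No encoding or decoding step is needed: MessagePack is a binary format, so the bytes should reach the unpacker unchanged.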

komamitsu, Oct 26 '20 12:10

@komamitsu I tried the suggested link and it worked fine. The issue is still in the same place: converting the value to a String, i.e. val.toString().

I was doing some research on this class and android sdk classes. I found some interesting things. 
Here are my observations and questions based on them:

  1. I tried copying the Unsafe.java class file from the Java platform, but it doesn't seem to recognise some imports.

  2. I found that the Android SDK also has an Unsafe.java class under the sun.misc package. Could it be used as a replacement for the native Java platform Unsafe class? https://android.googlesource.com/platform/libcore/+/49965c1/ojluni/src/main/java/sun/misc/Unsafe.java If yes, would I need to modify the lib code for this, or can you provide some configuration to set which class is used for String parsing?

  3. Android has Kotlin as its official language now. Does the Kotlin MessagePack library provide this performance optimisation under the hood, as Unsafe.java does in Java?

  4. Is there any other workaround that you think would be the best way to achieve the same level of performance optimisation as you get in Java?

  5. In the iOS library MPMessagePack, I saw that the parsing is done in native Objective-C code, and even Unsafe.java seems to use native code (probably C) to parse. We can use native code on Android too. Is there any way to hand over the "parsing logic" block of code to native C code and get the results back after the parsing?

Would you like to connect over a call to discuss it further or debug it together and find a working solution that is as good as native java? Just a thought. Let me know.

KshitijGoMmt, Oct 26 '20 14:10

@komamitsu Please let me know your thoughts on my suggestions. Looking forward to it.

KshitijGoMmt, Oct 27 '20 12:10

1

msgpack-java detects the platform by checking some other things, not only the existence of Unsafe https://github.com/msgpack/msgpack-java/blob/bef0ccdb9b1be70c533f81e0c459fdef6f578cd2/msgpack-core/src/main/java/org/msgpack/core/buffer/MessageBuffer.java#L73-L92. I think you would at least need to make System.getProperty("java.runtime.name") not return Android. I'm not sure copying Unsafe works, though.

2

Java platforms can have different Unsafe implementations including different signatures. So it might work, but it might not work.

3

I'm not familiar with Kotlin implementation of MessagePack. But it might be worth trying.

4, 5

I have no idea for now...
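
To see what the runtime reports for the checks mentioned in answer 1, a small diagnostic can help. This is only a sketch under assumptions: `RuntimeProbe` is a hypothetical class written for illustration, and msgpack-java's own detection in MessageBuffer may differ in detail (it checks more than these two things).

```java
import java.lang.reflect.Field;
import java.util.Locale;

final class RuntimeProbe {
    // Assumption: the runtime is treated as Android when java.runtime.name mentions "android".
    static boolean looksLikeAndroid(String runtimeName) {
        return runtimeName != null
                && runtimeName.toLowerCase(Locale.ROOT).contains("android");
    }

    // Returns true if sun.misc.Unsafe can be loaded and declares the conventional "theUnsafe" field.
    static boolean hasUnsafe() {
        try {
            Class<?> cls = Class.forName("sun.misc.Unsafe");
            Field field = cls.getDeclaredField("theUnsafe");
            return field != null;
        } catch (ReflectiveOperationException | SecurityException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = System.getProperty("java.runtime.name", "");
        System.out.println("java.runtime.name = " + name);
        System.out.println("looks like Android = " + looksLikeAndroid(name));
        System.out.println("sun.misc.Unsafe present = " + hasUnsafe());
    }
}
```

Running this on both the desktop JVM and the device shows which of the two conditions differs. As answer 2 notes, the presence of a class named Unsafe does not guarantee that its method signatures match what the library expects.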

komamitsu, Oct 27 '20 13:10

@komamitsu Sure I will give it a try and share the results if I find some workaround or any working solution. Thanks for all of your help and quick responses. I totally appreciate it. Thanks a lot.

KshitijGoMmt, Oct 27 '20 14:10