
Deserialization performance is poor for large products

Open henryem opened this issue 6 years ago • 0 comments

Currently, as noted on the project page, we use an O(N*M) algorithm for deserializing products with N fields repeated a total of M times. While profiling the deserialization of a complex, deeply nested object, I found that we spend a huge amount of time in the resulting calls to CodedInputStream.skipField.
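To make the cost concrete, here is a minimal sketch of the per-field rescan pattern, written in Python against a hand-rolled reader for the protobuf wire format. It is a simplified model of the behaviour described above, not pbdirect's code: `read_varint`, `skip_field`, and `naive_decode` are hypothetical helpers, and only varint-valued fields are decoded.

```python
from typing import Dict, List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def skip_field(buf: bytes, pos: int, wire_type: int) -> int:
    """Return the position just past one field value of the given wire type."""
    if wire_type == 0:  # varint
        _, pos = read_varint(buf, pos)
        return pos
    if wire_type == 1:  # fixed 64-bit
        return pos + 8
    if wire_type == 2:  # length-delimited
        length, pos = read_varint(buf, pos)
        return pos + length
    if wire_type == 5:  # fixed 32-bit
        return pos + 4
    raise ValueError(f"unsupported wire type {wire_type}")


def naive_decode(buf: bytes, field_numbers: List[int]) -> Tuple[Dict[int, List[int]], int]:
    """Rescan the whole message once per wanted field.

    Returns the decoded varint values and the number of skip_field calls made.
    """
    values: Dict[int, List[int]] = {}
    skips = 0
    for wanted in field_numbers:  # N passes ...
        pos = 0
        while pos < len(buf):  # ... over all M encoded entries
            tag, pos = read_varint(buf, pos)
            number, wire_type = tag >> 3, tag & 7
            if number == wanted and wire_type == 0:
                value, pos = read_varint(buf, pos)
                values.setdefault(number, []).append(value)
            else:
                pos = skip_field(buf, pos, wire_type)
                skips += 1
    return values, skips


# Three varint fields (numbers 1, 2, 3) holding the values 1, 2, 3.
message = bytes([0x08, 0x01, 0x10, 0x02, 0x18, 0x03])
decoded, skip_count = naive_decode(message, [1, 2, 3])
print(decoded, skip_count)
```

In this model each of the three passes skips the two entries it does not want, so the skip count grows with the product of the number of fields and the number of encoded entries.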

One way to eliminate this is to (1) create a map from field index to parser, (2) loop through the fields in the input, passing each to its appropriate parser, and (3) finally build the result of each parser. I have prepared a PR that does this, which I'll submit soon (once my previous PR, which is a dependency, is merged). My PR also reduces memory pressure when deserializing nested objects by eliminating the allocation of extra byte buffers in all cases except coproduct parsing.
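The three steps can be sketched as follows. This is an illustration in Python under stated assumptions, not the code in the PR: `VarintField`, `single_pass_decode`, and the small wire-format helpers are hypothetical, only varint-valued fields are parsed, and fields without a registered parser are skipped.

```python
from typing import Dict, List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def skip_field(buf: bytes, pos: int, wire_type: int) -> int:
    """Return the position just past one field value of the given wire type."""
    if wire_type == 0:  # varint
        _, pos = read_varint(buf, pos)
        return pos
    if wire_type == 1:  # fixed 64-bit
        return pos + 8
    if wire_type == 2:  # length-delimited
        length, pos = read_varint(buf, pos)
        return pos + length
    if wire_type == 5:  # fixed 32-bit
        return pos + 4
    raise ValueError(f"unsupported wire type {wire_type}")


class VarintField:
    """Hypothetical per-field parser that accumulates varint values."""

    def __init__(self) -> None:
        self.values: List[int] = []

    def parse(self, buf: bytes, pos: int) -> int:
        value, pos = read_varint(buf, pos)
        self.values.append(value)
        return pos

    def build(self) -> List[int]:
        return list(self.values)


def single_pass_decode(buf: bytes, parsers: Dict[int, VarintField]) -> Dict[int, List[int]]:
    """Walk the message once, dispatching each entry by field number."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 7
        parser = parsers.get(number)  # step (1): map lookup by field index
        if parser is not None and wire_type == 0:
            pos = parser.parse(buf, pos)  # step (2): hand the entry to its parser
        else:
            pos = skip_field(buf, pos, wire_type)  # unknown field: skip once
    # Step (3): build each parser's result after the single pass.
    return {number: parser.build() for number, parser in parsers.items()}


# Field 1 appears twice, field 2 once, and field 3 (length-delimited) is unknown.
message = bytes([0x08, 0x01, 0x10, 0x02, 0x1A, 0x02, 0x68, 0x69, 0x08, 0x03])
print(single_pass_decode(message, {1: VarintField(), 2: VarintField()}))
```

Each encoded entry is read exactly once here, so the work is proportional to the number of entries plus the number of parsers rather than to their product.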

henryem, Jun 15 '19 02:06