powsybl-core
JSON deserializers performance improvement
- Do you want to request a feature or report a bug?
Performance improvement
- What is the current behavior?
Following the merge of https://github.com/powsybl/powsybl-core/pull/2445, I ran some performance measurements with JMH to assess the impact on serialization and parsing performance.
The test case is a contingency list with around 10000 branch contingencies, defined as a DefaultContingencyList.
Before:
Benchmark (contingencyListPath) Mode Cnt Score Error Units
ParsingBenchmark.parsing /home/leclercsyl/tmp/branch-contingencies.json ss 500 11,748 ± 0,469 ms/op
ParsingBenchmark.writing /home/leclercsyl/tmp/branch-contingencies.json ss 500 44,966 ± 0,528 ms/op
After:
Benchmark (contingencyListPath) Mode Cnt Score Error Units
ParsingBenchmark.parsing /home/leclercsyl/tmp/branch-contingencies.json ss 500 20,757 ± 1,777 ms/op
ParsingBenchmark.writing /home/leclercsyl/tmp/branch-contingencies.json ss 500 13,827 ± 0,484 ms/op
We can see that serialization performance has improved greatly (roughly 3.3x faster, from 44,966 to 13,827 ms/op), while parsing performance has, unexpectedly, degraded (roughly 1.8x slower, from 11,748 to 20,757 ms/op).
After some digging, it seems that jsonParser.readValueAs and ctxt.readValue have slightly different initialization paths, which can explain the difference.
However, the main finding is that the costly part, in this use case, is resolving the deserializers every time a new Contingency object is parsed.
Jackson's implementation actually offers a mechanism to perform this resolution only once, through the ResolvableDeserializer and ContextualDeserializer interfaces. See for example CollectionDeserializer, which holds a reference to the underlying deserializer for the values inside the collection.
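To make the cost pattern concrete, here is a toy model in plain Java. It is not Jackson code: Registry, find, and the two parse methods are hypothetical names invented for this illustration. The registry only counts lookups, to contrast resolving the element reader for every item with resolving it once and reusing it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Toy model only: every name here is hypothetical and none of it is Jackson API.
final class ResolveOnceSketch {

    /** Stand-in for a deserializer registry; counts how often a reader is looked up. */
    static final class Registry {
        private final Map<Class<?>, Function<String, ?>> readers = new HashMap<>();
        int lookups = 0;

        <T> void register(Class<T> type, Function<String, T> reader) {
            readers.put(type, reader);
        }

        @SuppressWarnings("unchecked")
        <T> Function<String, T> find(Class<T> type) {
            lookups++; // each call models one (potentially costly) resolution
            return (Function<String, T>) readers.get(type);
        }
    }

    static Registry newRegistry() {
        Registry registry = new Registry();
        registry.register(Integer.class, s -> Integer.valueOf(s));
        return registry;
    }

    /** Looks the element reader up again for every item. */
    static List<Integer> parseResolvingEachTime(Registry registry, List<String> raw) {
        List<Integer> out = new ArrayList<>(raw.size());
        for (String s : raw) {
            out.add(registry.find(Integer.class).apply(s));
        }
        return out;
    }

    /** Looks the element reader up once, then reuses the reference. */
    static List<Integer> parseResolvingOnce(Registry registry, List<String> raw) {
        Function<String, Integer> reader = registry.find(Integer.class);
        List<Integer> out = new ArrayList<>(raw.size());
        for (String s : raw) {
            out.add(reader.apply(s));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> raw = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            raw.add(Integer.toString(i));
        }
        Registry eachTime = newRegistry();
        parseResolvingEachTime(eachTime, raw);
        Registry once = newRegistry();
        parseResolvingOnce(once, raw);
        System.out.println("lookups when resolving each time: " + eachTime.lookups);
        System.out.println("lookups when resolving once: " + once.lookups);
    }
}
```

Both methods produce the same list; only the number of lookups differs, growing with the input size in the first case and staying at one in the second. Holding a resolved reference in this way is the idea behind the two Jackson interfaces mentioned above.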
I tried implementing ResolvableDeserializer in ContingencyDeserializer so that the deserializer for contingency elements is resolved only once:
@Override
public void resolve(DeserializationContext ctxt) throws JsonMappingException {
    // Resolve the deserializer for the elements list once and keep it in a field
    JavaType elementsType = ctxt.getConfig().constructType(new TypeReference<ArrayList<ContingencyElement>>() {
    });
    elementsDeser = super.findDeserializer(ctxt, elementsType, null);
}
Indeed, parsing time improves greatly, dropping below the level measured before the merge:
Benchmark (contingencyListPath) Mode Cnt Score Error Units
ParsingBenchmark.parsing /home/leclercsyl/tmp/branch-contingencies.json ss 500 9,276 ± 0,343 ms/op
Similar performance is achieved by implementing ContextualDeserializer:
@Override
public JsonDeserializer<?> createContextual(DeserializationContext ctxt, BeanProperty property) throws JsonMappingException {
    // Resolve the deserializer for the elements list once and hand it to a new instance
    JavaType elementsType = ctxt.getConfig().constructType(new TypeReference<ArrayList<ContingencyElement>>() {
    });
    return new ContingencyDeserializer(super.findDeserializer(ctxt, elementsType, null));
}
gives:
Benchmark (contingencyListPath) Mode Cnt Score Error Units
ParsingBenchmark.parsing /home/leclercsyl/tmp/branch-contingencies.json ss 500 9,169 ± 0,361 ms/op
Note:
Implementing ContextualDeserializer may be the better option, since it creates a new instance instead of modifying the existing one.
Conclusion
Our deserializers could be greatly improved performance-wise by following one of the two schemes above. However, it will take some work to apply this to all fields of all classes!
- What is the motivation / use case for changing the behavior?
Performance
- Please tell us about your environment:
- PowSyBl Version: 5.2.0-SNAPSHOT
- OS Version: Ubuntu 20.04
Another, simpler approach to be tested:
the main bottleneck in the current implementation seems to be that CollectionDeserializer is not cacheable.
We could replace its use with a raw parser-based implementation of list deserialization, in JsonUtil.readList, which would have the advantage of benefiting all our custom deserializers.
See current benchmark code on branch parsing-benchmark, for further testing.