
Performance - Cache after unserialization

Open GlennRicaud opened this issue 5 years ago • 13 comments

Caching of blobs should be done after unserialization instead of before.

GlennRicaud avatar Feb 27 '20 13:02 GlennRicaud
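In rough form, the proposal is to key a cache on the blob key and store the already-unserialized node data (something like a NodeVersion) instead of the raw bytes, so unserialization happens at most once per key. A minimal sketch using only JDK types; the generic key/value types, the loader function and the class itself are illustrative stand-ins, not XP API, and a real cache would also need bounding and invalidation:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch only: caches the deserialized object rather than the raw blob bytes.
class DeserializedBlobCache<K, V>
{
    private final Map<K, V> cache = new ConcurrentHashMap<>();

    private final Function<K, V> loader; // e.g. "read blob + unserialize"

    DeserializedBlobCache( final Function<K, V> loader )
    {
        this.loader = loader;
    }

    V get( final K key )
    {
        // Unserialization runs at most once per key; later gets reuse the cached object.
        return cache.computeIfAbsent( key, loader );
    }
}

Compared to caching the raw bytes, the trade-off is that cached values must be treated as immutable (or defensively copied) so that callers cannot mutate shared state.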

NB! Remember we still need to cache the regular blobs (not node data) as we currently do.

sigdestad avatar Feb 27 '20 16:02 sigdestad

Yes, not for binary blobs, but for all the other blobs we would gain from caching them after their unserialization.

GlennRicaud avatar Feb 28 '20 08:02 GlennRicaud

How much work is this?

sigdestad avatar Feb 28 '20 09:02 sigdestad

Not much. I think this is a quick gain.

GlennRicaud avatar Feb 28 '20 09:02 GlennRicaud

Could be your final commit then :-)

sigdestad avatar Feb 28 '20 10:02 sigdestad

What is the reason for this? Do we have performance issues now?

rymsha avatar Mar 02 '20 11:03 rymsha

This could potentially give significant performance improvements with minimal effort. A typical XP request may handle between 50 and 1000 such unserializations, so if they can be largely avoided it should provide a quite measurable improvement.

sigdestad avatar Mar 02 '20 16:03 sigdestad

Sorry for being picky.

"could potentially"

This sounds premature. Time to re-read Computer Programming as an Art (1974), page 671: http://www.cs.bilkent.edu.tr/~canf/knuth1974.pdf

"measurable improvement"

Not without a measurement tool.

rymsha avatar Mar 02 '20 18:03 rymsha

Just run and time the unserialization 1000 times and see how long it takes. If it takes less than 1ms we can drop it for now.

sigdestad avatar Mar 03 '20 06:03 sigdestad
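A rough sketch of that kind of ad-hoc timing loop, with hypothetical loadSampleBlob()/deserialize() stand-ins for the real unserialization code; the JMH results below are the more reliable way to measure this, since a naive loop ignores warm-up and JIT effects:

public class UnserializationTiming
{
    public static void main( final String[] args )
    {
        final byte[] blob = loadSampleBlob();
        final long start = System.nanoTime();
        for ( int i = 0; i < 1000; i++ )
        {
            deserialize( blob );
        }
        final long elapsedMs = ( System.nanoTime() - start ) / 1_000_000;
        System.out.println( "1000 unserializations: " + elapsedMs + " ms" );
    }

    // Hypothetical stand-ins for reading one node blob and running the real unserialization.
    private static byte[] loadSampleBlob() { return new byte[0]; }

    private static void deserialize( final byte[] blob ) { }
}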

Short version: it takes 0.2ms

Long version:

# JMH version: 1.23
# VM version: JDK 11.0.7, OpenJDK 64-Bit Server VM, 11.0.7+10
# Warmup: 10 iterations, 1 s each
# Measurement: 10 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.enonic.xp.repo.NodeVersionDeserializationTest.nodeVersionService_get
# Warmup Iteration   1: 0.618 ms/op
# Warmup Iteration   2: 0.249 ms/op
# Warmup Iteration   3: 0.223 ms/op
# Warmup Iteration   4: 0.226 ms/op
# Warmup Iteration   5: 0.216 ms/op
# Warmup Iteration   6: 0.212 ms/op
# Warmup Iteration   7: 0.210 ms/op
# Warmup Iteration   8: 0.205 ms/op
# Warmup Iteration   9: 0.204 ms/op
# Warmup Iteration  10: 0.205 ms/op
Iteration   1: 0.207 ms/op
Iteration   2: 0.208 ms/op
Iteration   3: 0.202 ms/op
Iteration   4: 0.202 ms/op
Iteration   5: 0.202 ms/op
Iteration   6: 0.202 ms/op
Iteration   7: 0.204 ms/op
Iteration   8: 0.203 ms/op
Iteration   9: 0.203 ms/op
Iteration  10: 0.204 ms/op
Benchmark                                              Mode  Cnt  Score   Error  Units
NodeVersionDeserializationTest.nodeVersionService_get  avgt   10  0.204 ± 0.003  ms/op

rymsha avatar Apr 22 '20 11:04 rymsha
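For reference, a JMH benchmark with the configuration shown in the output above would look roughly like the sketch below. The class and method names are taken from the output; the setup and the readAndUnserialize supplier are hypothetical stand-ins for the real repository wiring:

import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

// Sketch of a JMH benchmark matching the configuration in the output above.
@State( Scope.Benchmark )
@BenchmarkMode( Mode.AverageTime )
@OutputTimeUnit( TimeUnit.MILLISECONDS )
@Warmup( iterations = 10, time = 1 )
@Measurement( iterations = 10, time = 1 )
public class NodeVersionDeserializationTest
{
    // Hypothetical stand-in for "read one node blob and unserialize it".
    private Supplier<Object> readAndUnserialize;

    @Setup
    public void setup()
    {
        // Hypothetical: set up a repository with one stored node version
        // and bind readAndUnserialize to the real service call.
        readAndUnserialize = Object::new;
    }

    @Benchmark
    public Object nodeVersionService_get()
    {
        // The result is returned so JMH keeps the deserialized object alive.
        return readAndUnserialize.get();
    }
}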

So, is this per node? If I fetch 100 nodes it's 20ms, and if I fetch 1000 it's 200ms?

sigdestad avatar Apr 22 '20 11:04 sigdestad

It is quite linear:

Benchmark                                              Mode  Cnt    Score   Error  Units
NodeVersionDeserializationTest.nodeVersionService_get1000  avgt   10  210.356 ± 1.914  ms/op

The test is for a different method, which operates on NodeVersionKeys (a Set of NodeVersionKey), but apparently that doesn't affect the benchmark much.

rymsha avatar Apr 22 '20 11:04 rymsha

Sorry, I was testing pure filesystem performance. In production there is already a blob key cache.

Benchmark                                                 Mode  Cnt   Score   Error  Units
NodeVersionDeserializationTest.nodeVersionService_get1000  avgt   10  17.980 ± 0.360  ms/op

Result for a single node:

Benchmark                                                  Mode  Cnt  Score   Error  Units
NodeVersionDeserializationTest.nodeVersionService_get  avgt   10  0.016 ± 0.001  ms/op

So, if there is a request that gets 1000 nodes, it will spend 18 ms just parsing them.

rymsha avatar Apr 22 '20 11:04 rymsha