Performance - Cache after unserialization
Caching of blobs should be done after unserialization instead.
NB! Remember we still need to cache the regular blobs (not node data) as we currently do.
Yes, not for binary blobs, but for all the other blobs we would gain from caching them after their unserialization.
How much work is this?
Not much. I think this is a quick gain.
Could be your final commit then :-)
What is the reason for this? Do we have performance issues now?
This could potentially give a significant performance improvement with minimal effort. A typical XP request may handle between 50 and 1000 such unserializations, so if most of them can be avoided it should give a quite measurable improvement.
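The idea can be sketched with a plain `ConcurrentHashMap` keyed by blob key, so that each blob is parsed at most once and repeated requests skip the deserialization entirely. All names here (the cache class, the `deserialize` stand-in) are hypothetical illustrations, not the actual XP classes, and a real implementation would use a bounded cache (LRU/Caffeine) rather than an unbounded map:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: cache the *deserialized* node data keyed by blob key,
// so repeated lookups of the same blob skip the parse step entirely.
public class DeserializedNodeCache {
    // In production this would be bounded; a plain map keeps the sketch short.
    private final Map<String, Object> cache = new ConcurrentHashMap<>();
    private final AtomicInteger parses = new AtomicInteger();

    // Stand-in for the real blob -> node-data parsing step.
    private Object deserialize(String blobKey) {
        parses.incrementAndGet();
        return "node-data-for-" + blobKey; // placeholder for the parsed node version
    }

    public Object get(String blobKey) {
        // computeIfAbsent: parse at most once per blob key.
        return cache.computeIfAbsent(blobKey, this::deserialize);
    }

    public int parseCount() {
        return parses.get();
    }

    public static void main(String[] args) {
        DeserializedNodeCache c = new DeserializedNodeCache();
        for (int i = 0; i < 1000; i++) {
            c.get("blob-" + (i % 50)); // 1000 lookups touching only 50 distinct blobs
        }
        // Only 50 parses happened instead of 1000.
        System.out.println(c.parseCount());
    }
}
```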
Sorry for being picky.
could potentially
This sounds premature. Time to re-read Computer Programming as an Art (1974), page 671: http://www.cs.bilkent.edu.tr/~canf/knuth1974.pdf
measurable improvement
Not without a measurement tool.
Just run/time the unserialization 1000 times, and see how long it takes. If it takes less than 1ms we can drop it for now.
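The suggested quick check can be sketched as a plain timing loop. The payload and the `parse` method here are placeholder stand-ins, not XP's real node deserializer:

```java
import java.util.Locale;

// Quick-and-dirty timing loop, as suggested: run the "parse" 1000 times and
// report the average time per operation.
public class ParseTiming {
    static String payload = "name=node;type=content;version=7;displayName=Example";

    // Placeholder work standing in for blob -> node-data deserialization.
    static int parse(String s) {
        return s.split("[=;]").length;
    }

    public static void main(String[] args) {
        // Warm up so the JIT compiles parse() before we measure.
        for (int i = 0; i < 10_000; i++) parse(payload);

        int runs = 1000;
        long start = System.nanoTime();
        int sink = 0; // accumulate results so the loop is not dead code
        for (int i = 0; i < runs; i++) sink += parse(payload);
        long elapsed = System.nanoTime() - start;

        System.out.printf(Locale.ROOT, "avg %.4f ms/op (sink=%d)%n",
                elapsed / 1e6 / runs, sink);
    }
}
```

A hand-rolled loop like this is noisy (JIT warm-up, dead-code elimination, GC pauses), which is why the actual measurement further down in the thread uses JMH instead.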
Short version: it takes 0.2ms
Long version:
# JMH version: 1.23
# VM version: JDK 11.0.7, OpenJDK 64-Bit Server VM, 11.0.7+10
# Warmup: 10 iterations, 1 s each
# Measurement: 10 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: com.enonic.xp.repo.NodeVersionDeserializationTest.nodeVersionService_get
# Warmup Iteration 1: 0.618 ms/op
# Warmup Iteration 2: 0.249 ms/op
# Warmup Iteration 3: 0.223 ms/op
# Warmup Iteration 4: 0.226 ms/op
# Warmup Iteration 5: 0.216 ms/op
# Warmup Iteration 6: 0.212 ms/op
# Warmup Iteration 7: 0.210 ms/op
# Warmup Iteration 8: 0.205 ms/op
# Warmup Iteration 9: 0.204 ms/op
# Warmup Iteration 10: 0.205 ms/op
Iteration 1: 0.207 ms/op
Iteration 2: 0.208 ms/op
Iteration 3: 0.202 ms/op
Iteration 4: 0.202 ms/op
Iteration 5: 0.202 ms/op
Iteration 6: 0.202 ms/op
Iteration 7: 0.204 ms/op
Iteration 8: 0.203 ms/op
Iteration 9: 0.203 ms/op
Iteration 10: 0.204 ms/op
Benchmark Mode Cnt Score Error Units
NodeVersionDeserializationTest.nodeVersionService_get avgt 10 0.204 ± 0.003 ms/op
So, is this per node? So if I fetch 100 nodes, it is 20 ms, and if I fetch 1000 it's 200 ms?
It is quite linear:
Benchmark Mode Cnt Score Error Units
NodeVersionDeserializationTest.nodeVersionService_get1000 avgt 10 210.356 ± 1.914 ms/op
The test is for a different method, which operates on NodeVersionKeys (a Set of NodeVersionKey), but apparently that doesn't affect the benchmark much.
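The two measurements are consistent with the linearity claim: 1000 × 0.204 ms ≈ 204 ms, close to the measured 210.4 ms. A trivial sanity check of that arithmetic:

```java
import java.util.Locale;

// Sanity check: does the 1000-node result match 1000x the single-node result?
public class LinearityCheck {
    public static void main(String[] args) {
        double perOpMs = 0.204;        // measured single-node average (ms/op)
        double measured1000 = 210.356; // measured 1000-node average (ms/op)
        double predicted = perOpMs * 1000;
        System.out.printf(Locale.ROOT, "predicted %.1f ms vs measured %.1f ms (%.1f%% off)%n",
                predicted, measured1000, 100 * (measured1000 - predicted) / predicted);
        // -> predicted 204.0 ms vs measured 210.4 ms (3.1% off)
    }
}
```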
Sorry, I was testing pure filesystem performance. In production there is already a blobkey cache.
Benchmark Mode Cnt Score Error Units
NodeVersionDeserializationTest.nodeVersionService_get1000 avgt 10 17.980 ± 0.360 ms/op
Result for a single node:
Benchmark Mode Cnt Score Error Units
NodeVersionDeserializationTest.nodeVersionService_get avgt 10 0.016 ± 0.001 ms/op
So, if there is a request that fetches 1000 nodes, it will spend 18 ms just parsing them.