jetty.project icon indicating copy to clipboard operation
jetty.project copied to clipboard

Fixes #11371 - Review ArrayByteBufferPool eviction.

Open sbordet opened this issue 1 year ago • 8 comments

  • Eviction is now performed on release(), rather than acquire().
  • Memory accounting is done on release(), rather than acquire(). This is because we were always exceeding the maxMemory on acquire(), by returning a non-pooled buffer. We only need to account for what is idle in the pool, and that is done more efficiently on release(), and it is leak-resistant (i.e. if the buffer is not returned, the memory is already non accounted for, keeping the pool consistent).
  • Released entries now give precedence to Concurrent.Entry, rather than Queued.Entry, so the queued pool is always kept at minimum size.
  • Eviction is performed by running through the buckets, starting at a random bucket, and removing, if present, an idle buffer until the excess memory is freed.

sbordet avatar Feb 13 '24 13:02 sbordet

JMH benchmark for ABBP.

Here, testing the impact of statistics for unconstrained ABBP:

With AtomicLong: 5.287 vs 7.017 Mops/s

ArrayByteBufferPoolBenchmark.inputFixedCapacityOutputRandomCapacity            4096               -1          65536            0              0                 true  thrpt   10  5287439.484 ±  53877.187  ops/s
ArrayByteBufferPoolBenchmark.inputFixedCapacityOutputRandomCapacity            4096               -1          65536            0              0                false  thrpt   10  7016694.751 ± 226393.784  ops/s

With LongAdder: 6.132 vs 7.044 Mops/s

ArrayByteBufferPoolBenchmark.inputFixedCapacityOutputRandomCapacity            4096               -1          65536            0              0                 true  thrpt   10  6132287.148 ±  65148.418  ops/s
ArrayByteBufferPoolBenchmark.inputFixedCapacityOutputRandomCapacity            4096               -1          65536            0              0                false  thrpt   10  7043554.688 ± 194914.916  ops/s

sbordet avatar Feb 15 '24 14:02 sbordet

JMH benchmark for ABBP.

Here, testing @sbordet evict algorithm (1 loop over all buckets starting at random index, skip same capacity bucket) vs @gregw algorithm (pick 1 random bucket, skip same capacity bucket, evict from bucket):

@sbordet's:

ArrayByteBufferPoolBenchmark.inputFixedCapacityOutputRandomCapacity            4096               -1          65536            0              0                 true  thrpt   10  9002854.971 ± 101884.524  ops/s
ArrayByteBufferPoolBenchmark.inputFixedCapacityOutputRandomCapacity            4096               -1          65536      1048576              0                 true  thrpt   10  4380148.459 ± 325418.623  ops/s

org.eclipse.jetty.io.ArrayByteBufferPool@26edac07{min=0,max=65536,buckets=16,heap=0/1048576,direct=1048576/1048576}
+> direct size=16
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@30853dbf{capacity=4096,in-use=0/0,pooled/acquires=2457205/2659905(92.379%),non-pooled/evicts/removes/releases=5169/51188/197531/2659905}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@22ecfe8a{capacity=8192,in-use=0/11,pooled/acquires=45111533/45216172(99.769%),non-pooled/evicts/removes/releases=4556/2274284/100072/45216172}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@3ab1e25f{capacity=12288,in-use=0/0,pooled/acquires=2216231/2659083(83.346%),non-pooled/evicts/removes/releases=28963/49987/413889/2659083}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@23404873{capacity=16384,in-use=0/0,pooled/acquires=2229354/2661871(83.751%),non-pooled/evicts/removes/releases=38418/51400/394099/2661871}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@63b908c2{capacity=20480,in-use=0/0,pooled/acquires=2245768/2659630(84.439%),non-pooled/evicts/removes/releases=52350/52598/361512/2659630}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@2664a291{capacity=24576,in-use=0/3,pooled/acquires=2271665/2657337(85.487%),non-pooled/evicts/removes/releases=62376/53429/323293/2657337}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@13721f18{capacity=28672,in-use=0/3,pooled/acquires=2299657/2661455(86.406%),non-pooled/evicts/removes/releases=72219/54659/289576/2661455}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@32d0ded5{capacity=32768,in-use=0/1,pooled/acquires=2313063/2658056(87.021%),non-pooled/evicts/removes/releases=81802/54701/263190/2658056}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@19043f55{capacity=36864,in-use=0/1,pooled/acquires=2326751/2661217(87.432%),non-pooled/evicts/removes/releases=90917/55126/243548/2661217}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@1a7766ce{capacity=40960,in-use=0/1,pooled/acquires=2325787/2656036(87.566%),non-pooled/evicts/removes/releases=100901/55184/229347/2656036}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@19fe3047{capacity=45056,in-use=0/2,pooled/acquires=2331023/2661338(87.588%),non-pooled/evicts/removes/releases=111051/56061/219262/2661338}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@2d6c63db{capacity=49152,in-use=0/4,pooled/acquires=2325386/2659990(87.421%),non-pooled/evicts/removes/releases=123582/56956/211018/2659990}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@15aa1fb4{capacity=53248,in-use=0/2,pooled/acquires=2315380/2660379(87.032%),non-pooled/evicts/removes/releases=138286/56069/206711/2660379}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@344a5ad9{capacity=57344,in-use=0/3,pooled/acquires=2304223/2662086(86.557%),non-pooled/evicts/removes/releases=155233/56461/202627/2662086}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@5cde281f{capacity=61440,in-use=0/2,pooled/acquires=2283099/2658423(85.882%),non-pooled/evicts/removes/releases=175578/55870/199744/2658423}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@1262fae4{capacity=65536,in-use=0/0,pooled/acquires=2264663/2661456(85.091%),non-pooled/evicts/removes/releases=201161/55321/195632/2661456}

@gregw's:

ArrayByteBufferPoolBenchmark.inputFixedCapacityOutputRandomCapacity            4096               -1          65536            0              0                 true  thrpt   10  9097094.184 ± 146472.471  ops/s
ArrayByteBufferPoolBenchmark.inputFixedCapacityOutputRandomCapacity            4096               -1          65536      1048576              0                 true  thrpt   10  4559670.608 ± 376450.879  ops/s

org.eclipse.jetty.io.ArrayByteBufferPool@6f3b9201{min=0,max=65536,buckets=16,heap=0/1048576,direct=1011712/1048576}
+> direct size=16
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@28583117{capacity=4096,in-use=0/2,pooled/acquires=2575760/2786198(92.447%),non-pooled/evicts/removes/releases=6450/57604/203986/2786198}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@2362f293{capacity=8192,in-use=0/10,pooled/acquires=46391093/47336444(98.003%),non-pooled/evicts/removes/releases=57604/2460704/887737/47336444}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@69551d3e{capacity=12288,in-use=0/1,pooled/acquires=2562090/2784290(92.020%),non-pooled/evicts/removes/releases=15129/60100/207070/2784290}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@ec598a{capacity=16384,in-use=0/2,pooled/acquires=2556711/2785834(91.775%),non-pooled/evicts/removes/releases=21557/60207/207564/2785834}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@48e32739{capacity=20480,in-use=0/2,pooled/acquires=2543488/2783059(91.392%),non-pooled/evicts/removes/releases=31322/60665/208247/2783059}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@7c21a2dc{capacity=24576,in-use=0/0,pooled/acquires=2533658/2782756(91.049%),non-pooled/evicts/removes/releases=41266/61963/207832/2782756}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@3ddcec8b{capacity=28672,in-use=0/3,pooled/acquires=2521220/2784393(90.548%),non-pooled/evicts/removes/releases=54347/61797/208823/2784393}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@5f1ed2b3{capacity=32768,in-use=0/2,pooled/acquires=2506601/2782996(90.068%),non-pooled/evicts/removes/releases=67510/61721/208883/2782996}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@e15ec86{capacity=36864,in-use=0/1,pooled/acquires=2494442/2785667(89.546%),non-pooled/evicts/removes/releases=82498/62708/208726/2785667}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@7fd388da{capacity=40960,in-use=0/2,pooled/acquires=2480469/2785648(89.045%),non-pooled/evicts/removes/releases=97254/61994/207923/2785648}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@25318b1f{capacity=45056,in-use=0/1,pooled/acquires=2461254/2784967(88.376%),non-pooled/evicts/removes/releases=114958/61722/208754/2784967}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@c134e40{capacity=49152,in-use=0/1,pooled/acquires=2443864/2784654(87.762%),non-pooled/evicts/removes/releases=132877/62006/207912/2784654}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@2d43f298{capacity=53248,in-use=0/2,pooled/acquires=2423021/2781587(87.109%),non-pooled/evicts/removes/releases=151559/61289/207005/2781587}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@4e352ab2{capacity=57344,in-use=0/3,pooled/acquires=2407192/2788218(86.334%),non-pooled/evicts/removes/releases=174637/60987/206386/2788218}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@67f53918{capacity=61440,in-use=0/1,pooled/acquires=2378839/2784073(85.445%),non-pooled/evicts/removes/releases=199095/61405/206138/2784073}
|  +> org.eclipse.jetty.io.ArrayByteBufferPool$RetainedBucket@71820ae8{capacity=65536,in-use=0/2,pooled/acquires=2352840/2784936(84.485%),non-pooled/evicts/removes/releases=227259/61391/204835/2784936}

@sbordet's is slightly worse in throughput, but better at hitRatio on the buffers actually in use.

For larger input buffers (the 8 KiB above), @sbordet's keeps a high hitRatio, but @gregw's become way worse. For example, with 60 KiB input buffer, @sbordet's hitRatio on that buffer is 99.94% at 3045807 ops/s, while @gregw's hitRatio is 84.9% at 1642158 ops/s. The buffer currently released is almost always pooled in @sbordet's, but almost always discarded in @gregw's, causing re-allocation (whose main cost is zeroing the buffer).

sbordet avatar Feb 15 '24 15:02 sbordet

@sbordet interesting! So it is worthwhile spending some time looping to find a bucket that can be evicted so that the just released buffer can be pooled. I wonder how my most idle bucket algorithm would have gone?

I've got a couple of ideas for improving the loop to make it less biased... stand by....

gregw avatar Feb 16 '24 07:02 gregw

@sbordet I think we are really close now... and probably over-analysing beyond the fidelity of JMH.... but we are here now, so let's finish...

Your JMH test is OK, but by spreading the random writes over the full range, it is not like what we have seen with many voids in the buckets, that will affect the bias of the current algorithm. So I've made an attempt to alter the JMH so that it uses a fixed set of random write sizes, thus there should be some empty buckets.

I've also done two other eviction algorithms:

  • evictUnbiasedRandom, that needs to iterate twice so it can count the non empty buckets (of course that is racy but it is an attempt).
  • evictLargest, my original algorithm to find the largest bucket, which of course does nested iterations, but only on non-empty buckets

For some reason, my setup is not able to run the JMHs at the moment (plus I've only got my pathetic laptop). So can you run these to compare.

Once you have the numbers, let's hangout with @lorban and pick one.

gregw avatar Feb 16 '24 09:02 gregw

@sbordet my changes are in https://github.com/jetty/jetty.project/tree/fix/jetty-12/11371/review-abbp-eviction3 I can push to your branch if you like?

gregw avatar Feb 16 '24 09:02 gregw

Your JMH test is OK, but by spreading the random writes over the full range, it is not like what we have seen with many voids in the buckets, that will affect the bias of the current algorithm.

What we have seen so far were CometD benchmarks over HTTP/2, using ServletOutputStream.

The moment we use a Writer for HTTP, or we use WebSocket, the range of buffers used will depend entirely on the application writes, which may be clustered around certain sizes, but we do not know.

sbordet avatar Feb 16 '24 09:02 sbordet

The moment we use a Writer for HTTP, or we use WebSocket, the range of buffers used will depend entirely on the application writes, which may be clustered around certain sizes, but we do not know.

We do know that many frameworks like cometd, servlets, etc. will impose some level of buffering/aggregation before the writes, so quantization of sizes is definitely a common use case, but perhaps not the only one. Hence my JMH modifications make the writeDiversity a parameter that we can adjust to see the different effects of voids versus no voids.

As I said, we are probably close to beyond the fidelity of JMH to simulate reality, but modelling the main use case that we have seen is certainly something we should do.

gregw avatar Feb 16 '24 09:02 gregw

Agreed, I'll prepare some number for this afternoon.

sbordet avatar Feb 16 '24 09:02 sbordet