goleveldb
util: improve BufferPool, eliminate unexpected allocations
Managing bare slices with sync.Pool is a poor fit: every Put converts the slice to an interface value, which copies the slice header to the heap and causes an unexpected allocation. This PR avoids that with a few small tricks and makes BufferPool a bit faster.
benchmarks:
$ benchstat before.txt after.txt
name          old time/op    new time/op    delta
BufferPool-8  55.9ns ± 0%    34.8ns ± 0%    -37.71%  (p=0.008 n=5+5)

name          old alloc/op   new alloc/op   delta
BufferPool-8  24.0B ± 0%     0.0B           -100.00%  (p=0.008 n=5+5)

name          old allocs/op  new allocs/op  delta
BufferPool-8  1.00 ± 0%      0.00           -100.00%  (p=0.008 n=5+5)
The buffer pool is now used mainly for reusing block read/write buffers. After #367 was merged, the baseline no longer works as well as intended. The last two commits simplify BufferPool and substantially increase the reuse rate of pooled buffers.
The raw benchmark output files: master.txt, pr.txt
Benchstat:
$ benchstat master.txt pr.txt
name old time/op new time/op delta
DefaultBatchWrite-8 19.8ms ± 0% 19.8ms ± 0% ~ (p=0.690 n=5+5)
FastAllocationBatchWrite-8 6.35ms ± 3% 6.35ms ± 2% ~ (p=1.000 n=5+5)
DBWrite-8 2.29µs ± 0% 2.29µs ± 0% ~ (p=0.302 n=5+5)
DBWriteBatch-8 1.77µs ±10% 1.67µs ± 3% ~ (p=0.548 n=5+5)
DBWriteUncompressed-8 2.57µs ± 8% 2.56µs ±13% ~ (p=0.841 n=5+5)
DBWriteBatchUncompressed-8 2.21µs ±24% 1.98µs ±15% ~ (p=0.548 n=5+5)
DBWriteRandom-8 6.14µs ±10% 5.72µs ±13% ~ (p=0.310 n=5+5)
DBWriteRandomSync-8 8.42ms ±10% 7.58ms ±10% -10.05% (p=0.032 n=5+5)
DBOverwrite-8 4.96µs ±15% 4.55µs ±10% ~ (p=0.310 n=5+5)
DBOverwriteRandom-8 9.90µs ± 8% 9.36µs ± 6% ~ (p=0.151 n=5+5)
DBPut-8 2.45µs ± 9% 2.36µs ± 1% ~ (p=1.000 n=5+5)
DBRead-8 259ns ± 8% 246ns ± 5% ~ (p=0.056 n=5+5)
DBReadGC-8 272ns ± 0% 269ns ± 0% -1.07% (p=0.029 n=4+4)
DBReadUncompressed-8 242ns ± 0% 225ns ± 8% -6.67% (p=0.008 n=5+5)
DBReadTable-8 226ns ± 0% 218ns ± 1% -3.25% (p=0.008 n=5+5)
DBReadReverse-8 345ns ± 0% 344ns ± 0% -0.50% (p=0.016 n=5+5)
DBReadReverseTable-8 345ns ± 0% 343ns ± 0% -0.44% (p=0.016 n=5+5)
DBSeek-8 5.66µs ± 1% 5.67µs ± 1% ~ (p=0.841 n=5+5)
DBSeekRandom-8 8.37µs ± 1% 8.28µs ± 1% -1.01% (p=0.032 n=5+5)
DBGet-8 2.12µs ± 1% 2.16µs ± 0% +1.64% (p=0.029 n=4+4)
DBGetRandom-8 5.48µs ± 2% 5.36µs ± 0% -2.34% (p=0.016 n=5+4)
DBReadConcurrent-8 77.2ns ± 2% 81.6ns ± 4% +5.72% (p=0.008 n=5+5)
DBReadConcurrent2-8 89.4ns ± 2% 90.6ns ± 2% ~ (p=0.222 n=5+5)
GetOverlapLevel0-8 51.4ms ± 6% 52.4ms ± 3% ~ (p=0.548 n=5+5)
GetOverlapNonLevel0-8 653µs ± 2% 669µs ± 1% ~ (p=0.056 n=5+5)
VersionStagingNonTrivial-8 31.5ms ± 1% 31.4ms ± 0% ~ (p=0.841 n=5+5)
VersionStagingTrivial-8 3.96ms ± 3% 3.90ms ± 2% ~ (p=0.151 n=5+5)
name old alloc/op new alloc/op delta
DefaultBatchWrite-8 74.2MB ± 0% 74.2MB ± 0% ~ (all equal)
FastAllocationBatchWrite-8 28.7MB ± 0% 28.7MB ± 0% ~ (all equal)
DBWrite-8 73.0B ± 0% 73.0B ± 0% ~ (all equal)
DBWriteBatch-8 13.6B ±10% 12.6B ±11% ~ (p=0.111 n=5+5)
DBWriteUncompressed-8 73.2B ± 2% 72.0B ± 0% ~ (p=0.167 n=5+5)
DBWriteBatchUncompressed-8 14.0B ±29% 11.4B ± 5% -18.57% (p=0.048 n=5+5)
DBWriteRandom-8 160B ± 3% 138B ± 7% -13.87% (p=0.008 n=5+5)
DBWriteRandomSync-8 231B ± 6% 222B ±12% ~ (p=0.341 n=5+5)
DBOverwrite-8 125B ± 2% 106B ± 1% -14.77% (p=0.008 n=5+5)
DBOverwriteRandom-8 192B ± 5% 164B ± 6% -14.88% (p=0.008 n=5+5)
DBPut-8 74.0B ± 0% 73.0B ± 0% -1.35% (p=0.008 n=5+5)
DBRead-8 16.0B ± 0% 15.6B ± 4% ~ (p=0.444 n=5+5)
DBReadGC-8 16.0B ± 0% 16.0B ± 0% ~ (all equal)
DBReadUncompressed-8 15.0B ± 0% 14.0B ± 0% -6.67% (p=0.000 n=5+4)
DBReadTable-8 16.0B ± 0% 15.0B ± 0% -6.25% (p=0.000 n=5+4)
DBReadReverse-8 64.0B ± 0% 64.0B ± 0% ~ (all equal)
DBReadReverseTable-8 64.0B ± 0% 64.0B ± 0% ~ (all equal)
DBSeek-8 1.76kB ± 0% 1.76kB ± 0% -0.14% (p=0.008 n=5+5)
DBSeekRandom-8 1.78kB ± 1% 1.75kB ± 0% -1.49% (p=0.008 n=5+5)
DBGet-8 801B ± 1% 800B ± 1% ~ (p=0.952 n=5+5)
DBGetRandom-8 1.06kB ± 1% 1.02kB ± 0% -3.21% (p=0.000 n=5+4)
DBReadConcurrent-8 11.0B ± 0% 13.4B ± 4% +21.82% (p=0.016 n=4+5)
DBReadConcurrent2-8 35.0B ± 3% 35.0B ± 0% ~ (p=1.000 n=5+5)
GetOverlapLevel0-8 5.09MB ±14% 5.15MB ±12% ~ (p=1.000 n=5+5)
GetOverlapNonLevel0-8 1.01MB ± 2% 1.01MB ± 1% ~ (p=1.000 n=5+5)
VersionStagingNonTrivial-8 805kB ± 0% 805kB ± 0% ~ (p=0.841 n=5+5)
VersionStagingTrivial-8 1.21MB ± 1% 1.21MB ± 1% ~ (p=0.690 n=5+5)
name old allocs/op new allocs/op delta
DefaultBatchWrite-8 53.0 ± 0% 53.0 ± 0% ~ (all equal)
FastAllocationBatchWrite-8 42.0 ± 0% 42.0 ± 0% ~ (all equal)
DBWrite-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
DBWriteBatch-8 0.00 0.00 ~ (all equal)
DBWriteUncompressed-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
DBWriteBatchUncompressed-8 0.00 0.00 ~ (all equal)
DBWriteRandom-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
DBWriteRandomSync-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
DBOverwrite-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
DBOverwriteRandom-8 3.40 ±18% 3.00 ± 0% ~ (p=0.444 n=5+5)
DBPut-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
DBRead-8 0.00 0.00 ~ (all equal)
DBReadGC-8 0.00 0.00 ~ (all equal)
DBReadUncompressed-8 0.00 0.00 ~ (all equal)
DBReadTable-8 0.00 0.00 ~ (all equal)
DBReadReverse-8 0.00 0.00 ~ (all equal)
DBReadReverseTable-8 0.00 0.00 ~ (all equal)
DBSeek-8 21.0 ± 0% 21.0 ± 0% ~ (all equal)
DBSeekRandom-8 24.6 ± 2% 23.0 ± 0% -6.50% (p=0.008 n=5+5)
DBGet-8 13.0 ± 0% 13.0 ± 0% ~ (all equal)
DBGetRandom-8 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=5+4)
DBReadConcurrent-8 0.00 0.00 ~ (all equal)
DBReadConcurrent2-8 0.00 0.00 ~ (all equal)
GetOverlapLevel0-8 27.6 ± 2% 28.0 ± 0% ~ (p=0.556 n=5+4)
GetOverlapNonLevel0-8 3.00 ± 0% 3.00 ± 0% ~ (all equal)
VersionStagingNonTrivial-8 34.0 ± 0% 34.0 ± 0% ~ (all equal)
VersionStagingTrivial-8 36.0 ± 0% 36.0 ± 0% ~ (all equal)
name old speed new speed delta
DBWrite-8 50.7MB/s ± 0% 50.6MB/s ± 0% ~ (p=0.333 n=5+5)
DBWriteBatch-8 65.9MB/s ± 9% 69.5MB/s ± 3% ~ (p=0.548 n=5+5)
DBWriteUncompressed-8 45.3MB/s ± 8% 45.8MB/s ±12% ~ (p=0.841 n=5+5)
DBWriteBatchUncompressed-8 54.2MB/s ±24% 58.9MB/s ±14% ~ (p=0.548 n=5+5)
DBWriteRandom-8 19.0MB/s ±10% 20.4MB/s ±12% ~ (p=0.310 n=5+5)
DBWriteRandomSync-8 10.0kB/s ± 0% 16.0kB/s ±38% ~ (p=0.238 n=4+5)
DBOverwrite-8 23.6MB/s ±15% 25.6MB/s ±10% ~ (p=0.310 n=5+5)
DBOverwriteRandom-8 11.8MB/s ± 8% 12.4MB/s ± 6% ~ (p=0.151 n=5+5)
DBPut-8 47.6MB/s ± 8% 49.1MB/s ± 1% ~ (p=1.000 n=5+5)
DBRead-8 448MB/s ± 8% 472MB/s ± 5% ~ (p=0.056 n=5+5)
DBReadGC-8 427MB/s ± 0% 432MB/s ± 0% +1.08% (p=0.029 n=4+4)
DBReadUncompressed-8 480MB/s ± 0% 517MB/s ± 8% +7.57% (p=0.008 n=5+5)
DBReadTable-8 514MB/s ± 0% 532MB/s ± 1% +3.36% (p=0.008 n=5+5)
DBReadReverse-8 336MB/s ± 0% 338MB/s ± 0% +0.50% (p=0.016 n=5+5)
DBReadReverseTable-8 337MB/s ± 0% 338MB/s ± 0% +0.44% (p=0.016 n=5+5)
DBReadConcurrent-8 1.50GB/s ± 2% 1.42GB/s ± 4% -5.39% (p=0.008 n=5+5)
DBReadConcurrent2-8 1.30GB/s ± 2% 1.28GB/s ± 1% ~ (p=0.222 n=5+5)
Could you check this allocator? https://github.com/xtaci/smux/blob/master/alloc.go
It's limited to 64KB?