titan
titan copied to clipboard
Reduce locking for blob cache hit queries
Description
Currently we query blob cache in BlobFileReader. It is done after getting file metadata (BlobStorage::FindFile) and getting the file reader from BlobFileCache, both of which require mutex lock. If we move blob cache query to before those two calls, we saves the two mutex lock and other overheads for a cache hit query. Demo can be find here: https://github.com/yiwu-arbug/titan/tree/cache titandb_bench shows 6% improvement in throughput for in-memory workload.
Benchmark:
# Fill 10G DB
./titandb_bench --db="/dev/shm/titan_bench" --use_existing_db=false --titan_min_blob_size=0 --value_size=1024 --num=10000000 --compression_type=none --benchmarks="fillseq"
# Readrandom with 20G blob cache
./titandb_bench --db="/dev/shm/titan_bench" --use_existing_db=true --titan_min_blob_size=0 --value_size=1024 --num=10000000 --compression_type=none --benchmarks="readrandom" --threads=40 --duration=300 --titan_blob_cache_size=20000000000
Result:
master: readrandom : 65.210 micros/op 613374 ops/sec; 608.4 MB/s (4599999 of 4599999 found)
demo: readrandom : 61.398 micros/op 651462 ops/sec; 646.1 MB/s (4823999 of 4823999 found)
what's left to be done on top of the demo:
- [ ] Do the same for Iterator path. The demo only change point-get path.
- [ ] Allocate blob cache memory using memory allocator: https://github.com/facebook/rocksdb/blob/master/include/rocksdb/cache.h#L78
- [ ] Remove
OwnedSlicewrapper. Cache rawchar*and use the deleter fromCacheAllocationPtras cache item deleter
Score
3000
SIG slack channel(must):
#sig-engine
Mentor
@Connor1996
Recommended Skills:
c++
Learning Materials(optional)
I use cache in Get or Iterator path, but found it will fail the BlobFileCorruptionErrorHandling test which change result pass through the cache by inject callback.
I wonder if any real use case which change data directly in storage level or can I just work around this test case .
Actually I prefer the former implementation of Version of blob files which can exploit the advantage of MVCC notion and it will downgrade to current implementation only if read operation encounter blob index of newer Version of blob files which is separated from LSM during Compaction. But large values is often separated from LSM during Flush, so that the downgrading situation always not likely to happen.
Actually I prefer the former implementation of
Versionof blob files which can exploit the advantage ofMVCCnotion and it will downgrade to current implementation only if read operation encounter blob index of newerVersionof blob files which is separated fromLSMduringCompaction. But large values is often separated fromLSMduringFlush, so that the downgrading situation always not likely to happen.
The former Version has an issue where depend on timing of it's creation and it's underlying rocksdb SupverVersion's creation, the wrong SuperVersion is held and lead to wrong read result. I don't remember the exact detail, but it is very hard, if ever possible, to get the ordering right.
/ping
pong! I am challenge bot.
I built the titan on the version in the demo, with rocksdb version 6.4.tikv.
./titandb_bench --db="/dev/shm/titan_bench" --use_existing_db=false --titan_min_blob_size=0 --value_size=1024 --num=10000000 --compression_type=none --benchmarks="fillseq"
Then I ran the benchmark command above, and got error as below:
titandb_bench: /home/ubuntu/workspace/titan/src/db_impl.cc:1335: void rocksdb::titandb::TitanDBImpl::GenerateCachePrefix(): Assertion `size > 0' failed.
Aborted (core dumped)
Is it related to the rocksdb version I built with ? @yiwu-arbug
I built the titan on the version in the demo, with rocksdb version 6.4.tikv.
./titandb_bench --db="/dev/shm/titan_bench" --use_existing_db=false --titan_min_blob_size=0 --value_size=1024 --num=10000000 --compression_type=none --benchmarks="fillseq"
Then I ran the benchmark command above, and got error as below:
titandb_bench: /home/ubuntu/workspace/titan/src/db_impl.cc:1335: void rocksdb::titandb::TitanDBImpl::GenerateCachePrefix(): Assertion `size > 0' failed. Aborted (core dumped)Is it related to the rocksdb version I built with ? @yiwu-arbug
@Connor1996 is this related to the cache key fix?
I built the titan on the version in the demo, with rocksdb version 6.4.tikv.
./titandb_bench --db="/dev/shm/titan_bench" --use_existing_db=false --titan_min_blob_size=0 --value_size=1024 --num=10000000 --compression_type=none --benchmarks="fillseq"
Then I ran the benchmark command above, and got error as below:
titandb_bench: /home/ubuntu/workspace/titan/src/db_impl.cc:1335: void rocksdb::titandb::TitanDBImpl::GenerateCachePrefix(): Assertion `size > 0' failed. Aborted (core dumped)Is it related to the rocksdb version I built with ? @yiwu-arbug
Yes, it's related to the rocksdb version you built with. It is based on a hack of rocksdb which allocates unique ids for directory, but 6.4.tikv branch will always return 0.
I built the titan on the version in the demo, with rocksdb version 6.4.tikv.
./titandb_bench --db="/dev/shm/titan_bench" --use_existing_db=false --titan_min_blob_size=0 --value_size=1024 --num=10000000 --compression_type=none --benchmarks="fillseq"
Then I ran the benchmark command above, and got error as below:
titandb_bench: /home/ubuntu/workspace/titan/src/db_impl.cc:1335: void rocksdb::titandb::TitanDBImpl::GenerateCachePrefix(): Assertion `size > 0' failed. Aborted (core dumped)Is it related to the rocksdb version I built with ? @yiwu-arbug
Yes, it's related to the rocksdb version you built with. It is based on a hack of rocksdb which allocates unique ids for directory, but 6.4.tikv branch will always return 0.
I see. Is this issue still open? I want to have a try on this. It's so kind of you if you could give me the rocksdb branch you mentioned. @Connor1996
I built the titan on the version in the demo, with rocksdb version 6.4.tikv.
./titandb_bench --db="/dev/shm/titan_bench" --use_existing_db=false --titan_min_blob_size=0 --value_size=1024 --num=10000000 --compression_type=none --benchmarks="fillseq"
Then I ran the benchmark command above, and got error as below:
titandb_bench: /home/ubuntu/workspace/titan/src/db_impl.cc:1335: void rocksdb::titandb::TitanDBImpl::GenerateCachePrefix(): Assertion `size > 0' failed. Aborted (core dumped)Is it related to the rocksdb version I built with ? @yiwu-arbug
Yes, it's related to the rocksdb version you built with. It is based on a hack of rocksdb which allocates unique ids for directory, but 6.4.tikv branch will always return 0.
I see. Is this issue still open? I want to have a try on this. It's so kind of you if you could give me the rocksdb branch you mentioned. @Connor1996
@yiwu-arbug
@rickif yes, the issue is still open. The benchmark used is posted in the issue summary above, and you can build it in Titan: mkdir -p build && cd build && cmake .. -DCMAKE_BUILD_TYPE=Release && make titandb_bench -j
OK. I wonder whether the the version a hack of rocksdb which allocates unique ids for directory that @Connor1996 mentioned is available now, or I need to implement it by myself.