python-rocksdb

Disable Caching / Memory Leak?

Open quantology opened this issue 3 years ago • 1 comments

I've been using python-rocksdb for a large kv store (~100 million records). I'd like to disable any caching by rocksdb, since access to those records is pretty random and I'd like to minimize the memory footprint. I've found that, as the db is accessed more over time, memory usage keeps growing (either because of a memory leak or due to caching). My current solution is to restart the server whenever memory reaches some threshold, but obviously that is not ideal.

What is the recommended configuration (e.g. rocksdb.Options) for running python-rocksdb with the minimal memory footprint possible? Is there a way to entirely disable caching, so I can determine if there is a deeper memory leak at the root of this issue?

I'm currently using:

import rocksdb

opts = rocksdb.Options()
# Disable the block cache for reads from SST files.
opts.table_factory = rocksdb.BlockBasedTableFactory(
    block_cache=None,
    no_block_cache=True,
)

Thanks for any advice!

quantology avatar Jul 07 '21 11:07 quantology

Hey. My biggest DB takes 900 GB of storage, with more than 100 billion keys and many CFs. I have also disabled block_cache for most of the CFs where I don't need any cached data. The process has now been running for 7 days and uses 5 GB RES with a lot of iteration/puts/gets. I recommend also setting the following database options:

  • db_write_buffer_size: maximum total memory consumption for all write buffers. Without this, the process RES usage increases greatly, to 16+ GB. (For me this is 4 GB.)
  • max_total_wal_size: maximum total size of the write-ahead log (.log) files. This is useful when you want to avoid a long recovery delay. (For me this is 1 GB.)
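A rough sketch of how those two limits could be applied, using the 4 GB and 1 GB values from above. The helper name apply_memory_limits is made up for illustration, and it is an assumption that your python-rocksdb build exposes both settings as attributes on rocksdb.Options, so the helper raises an error instead of silently ignoring a missing one:

```python
GIB = 1024 ** 3


def apply_memory_limits(opts, write_buffer_bytes=4 * GIB, wal_bytes=1 * GIB):
    """Set the two DB-wide limits on an options object and return it.

    `opts` is expected to be a rocksdb.Options instance. Whether the
    attributes below exist depends on the python-rocksdb build in use
    (assumption), so a missing attribute is reported explicitly.
    """
    limits = {
        "db_write_buffer_size": write_buffer_bytes,  # cap for all write buffers
        "max_total_wal_size": wal_bytes,             # cap for total WAL size
    }
    for name, value in limits.items():
        if not hasattr(opts, name):
            raise AttributeError(f"this build does not expose Options.{name}")
        setattr(opts, name, value)
    return opts
```

With python-rocksdb installed, this would be called as apply_memory_limits(opts) on the rocksdb.Options() object before the database is opened. The defaults are just the values that work for my workload, not general recommendations.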

iFA88 avatar Jul 07 '21 12:07 iFA88