agatedb

Feature Request: cache system with precise memory usage limits

Open zhangjinpeng87 opened this issue 2 years ago • 3 comments

Currently every SST is opened in mmap mode. In this mode we can't control the memory usage of agatedb; it will use as much memory as it can. This may cause OOM when agatedb is embedded in another system (as in MongoDB's early days, when its mmap engine caused many issues).

Implementing a cache system that can control its total memory usage is very important for a storage engine, especially in cloud environments where resources are strictly limited. RocksDB's block cache is a good reference: it can limit total memory usage while still providing impressive access performance.
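To make the request concrete, here is a minimal sketch of a block cache with a hard byte budget, evicting in LRU order. It is not agatedb's or RocksDB's implementation: the `BlockCache` type, the `(SST file id, block offset)` key, and the O(n) recency list are all assumptions made to keep the example short. A real cache would likely be sharded and use an O(1) LRU list.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical block identifier: (SST file id, block offset in the file).
type BlockKey = (u64, u64);

/// A byte-budgeted LRU cache: the sum of cached block sizes never exceeds `capacity`.
pub struct BlockCache {
    capacity: usize,
    used: usize,
    blocks: HashMap<BlockKey, Vec<u8>>,
    /// Recency order, front = least recently used. O(n) updates, for brevity only.
    order: VecDeque<BlockKey>,
}

impl BlockCache {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, used: 0, blocks: HashMap::new(), order: VecDeque::new() }
    }

    /// Bytes currently held by cached blocks.
    pub fn used(&self) -> usize {
        self.used
    }

    /// Drop `key` from the recency list if it is present.
    fn forget(&mut self, key: &BlockKey) {
        if let Some(pos) = self.order.iter().position(|k| k == key) {
            self.order.remove(pos);
        }
    }

    /// Look up a block and mark it as most recently used.
    pub fn get(&mut self, key: &BlockKey) -> Option<&[u8]> {
        if self.blocks.contains_key(key) {
            self.forget(key);
            self.order.push_back(*key);
        }
        self.blocks.get(key).map(|b| b.as_slice())
    }

    /// Insert a block, evicting least recently used blocks until it fits.
    pub fn insert(&mut self, key: BlockKey, block: Vec<u8>) {
        // Replace any previous entry for the same key.
        if let Some(old) = self.blocks.remove(&key) {
            self.used -= old.len();
            self.forget(&key);
        }
        // A block larger than the whole budget is never cached.
        if block.len() > self.capacity {
            return;
        }
        while self.used + block.len() > self.capacity {
            match self.order.pop_front() {
                Some(victim) => {
                    if let Some(b) = self.blocks.remove(&victim) {
                        self.used -= b.len();
                    }
                }
                None => break,
            }
        }
        self.used += block.len();
        self.blocks.insert(key, block);
        self.order.push_back(key);
    }
}
```

The point of the sketch is that the memory ceiling is enforced by the engine itself at insert time, rather than left to the OS page cache as with mmap.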

zhangjinpeng87 avatar Jun 15 '22 06:06 zhangjinpeng87

We need to consider how to cache the vlog. Currently the vlog is not organized as "blocks".

skyzh avatar Jun 15 '22 07:06 skyzh

> Need to consider how to cache vlog. Currently vlog is not organized as "blocks".

Yes, currently Titan (TiKV's key-value separation engine) also doesn't cache blob files. But it is necessary to consider how to cache the vlog. Maybe just relying on the system's page cache is feasible?

zhangjinpeng87 avatar Jun 27 '22 01:06 zhangjinpeng87

Given that values in the vlog are large (1KiB by default), would it be okay to store each value as one item in the cache? The ValuePtr could then be the cache key.

Maybe use one cache for blocks and another for vlog values, or just use a single cache (using the cache_id concept from RocksDB).
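A rough sketch of the single-cache variant, assuming a key type that tags each entry with its namespace so SST blocks and vlog values cannot collide. The `ValuePtr` fields, `CacheKey`, and `SharedCache` below are hypothetical and only loosely modeled on the idea; they are not agatedb's actual types, and the enum tag is a stand-in for RocksDB's cache_id rather than a copy of it. Eviction is omitted here.

```rust
use std::collections::HashMap;

/// Hypothetical value pointer: where a value lives in the vlog.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub struct ValuePtr {
    pub file_id: u32,
    pub offset: u32,
    pub len: u32,
}

/// One key space for both kinds of entries; the variant acts as the namespace.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub enum CacheKey {
    Block { sst_id: u64, offset: u64 },
    Value(ValuePtr),
}

/// A single cache shared by SST blocks and vlog values (no eviction in this sketch).
pub struct SharedCache {
    entries: HashMap<CacheKey, Vec<u8>>,
}

impl SharedCache {
    pub fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    pub fn insert(&mut self, key: CacheKey, data: Vec<u8>) {
        self.entries.insert(key, data);
    }

    pub fn get(&self, key: &CacheKey) -> Option<&[u8]> {
        self.entries.get(key).map(|d| d.as_slice())
    }

    /// Return the value for `ptr`, calling `read_vlog` only on a cache miss.
    pub fn get_value<F>(&mut self, ptr: ValuePtr, read_vlog: F) -> &[u8]
    where
        F: FnOnce(ValuePtr) -> Vec<u8>,
    {
        let value = self
            .entries
            .entry(CacheKey::Value(ptr))
            .or_insert_with(|| read_vlog(ptr));
        value.as_slice()
    }
}
```

With this shape, a user read that has already resolved a `ValuePtr` from the LSM tree would call `get_value` and touch the disk only on a miss.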

But this way, the value cache only works for user queries (which get the vptr first), while the background GC that iterates over the vlog cannot use the cache and still issues disk I/O. (Well... it seems there is no iterator implementation for the vlog; I need to take a look at its format.) (BTW, it seems that agatedb hasn't implemented vlog GC yet, and I'm not familiar with badger...)

wangnengjie avatar Jul 07 '22 07:07 wangnengjie