
What is the actual storage consumption for KV pairs in pmemkv-java?

Open ch2994 opened this issue 5 years ago • 5 comments

Hello,

I have a slightly modified MixedExample.java that tries to put four 4MB Java ByteBuffers into a 32MB vcmap engine. The code compiles, but to my surprise it throws an out-of-memory exception when it tries to put the 4th ByteBuffer.

It looks like a key-value pair consumes more space than its size suggests once it is put into pmemkv-java. Can you confirm whether this is the case? If so, is there a way to calculate how much storage is needed before we store the data? And is there an API to get the available storage of an existing pmemkv-java database?

Thanks

MixedExample.txt

ch2994 avatar Dec 03 '20 05:12 ch2994

For very small pools (like 32MB) you may see a very large memory overhead relative to the pool size, due to data structure metadata (which may vary between engines), pool metadata, and data fragmentation. In general, every data structure (in DRAM too) consumes some additional space for metadata.

karczex avatar Dec 04 '20 14:12 karczex

We should add a test for storage overhead.

karczex avatar Dec 09 '20 13:12 karczex

Thank you, karczex. I understand that there will be overhead with any KV store, but what we have been observing is that a 32MB pmemkv-java database can only store 8MB worth of data (sometimes 12MB; I don't know if there is another issue here).
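
As a rough check of these figures, the arithmetic can be sketched in a few lines of Java. This only restates the observed numbers (8MB or 12MB of payload in a 32MB pool); the class and method names are made up for illustration and nothing here calls pmemkv-java.

```java
class PoolOverhead {
    // Fraction of the pool that ends up holding payload bytes.
    static double usableFraction(long payloadBytes, long poolBytes) {
        return (double) payloadBytes / poolBytes;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long pool = 32 * mib;
        // The two payload totals observed in this thread.
        for (long stored : new long[] {8 * mib, 12 * mib}) {
            double f = usableFraction(stored, pool);
            System.out.printf(
                "%d MiB stored in a %d MiB pool: %.1f%% payload, %.1f%% metadata, fragmentation or unused%n",
                stored / mib, pool / mib, 100 * f, 100 * (1 - f));
        }
    }
}
```

So in these runs only about a quarter to three eighths of the pool was available for payload.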

So is there a way to find out, for a given size of data, how much storage pmemkv-java actually uses?

Thanks

ch2994 avatar Dec 13 '20 19:12 ch2994

Unfortunately, there is no easy way to do this for now. We have some ideas (see https://github.com/pmem/pmemkv/issues/671), but they are not implemented yet.

igchor avatar Dec 15 '20 13:12 igchor

We've implemented a very simplistic way to check the approx. storage consumption of pmemkv. If you're still interested, pls see this example: https://github.com/pmem/pmemkv/blob/master/examples/pmemkv_fill_cpp/pmemkv_fill.cpp

Once you run it, you can estimate the number of records that will fit into a database for a specific engine and key/value sizes. It's all approximate, but it may give you some answers.
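
For Java users, the same fill-until-full idea could look roughly like the sketch below. It is not a port of the linked C++ example: `FillEstimate`, `countUntilFull` and `fakeStore` are hypothetical names, and the store is a stand-in with an assumed fixed per-record overhead so that the code runs without persistent memory. With a real database you would pass its put method instead of `fakeStore(...)` and narrow the `catch` to the library's out-of-memory exception.

```java
import java.nio.ByteBuffer;
import java.util.function.BiConsumer;

class FillEstimate {
    /** Inserts fixed-size records until put fails; returns how many fit. */
    static long countUntilFull(BiConsumer<ByteBuffer, ByteBuffer> put,
                               int keySize, int valueSize, long maxRecords) {
        if (keySize < Long.BYTES) {
            throw new IllegalArgumentException("keySize must be at least 8 bytes");
        }
        long stored = 0;
        while (stored < maxRecords) {
            ByteBuffer key = ByteBuffer.allocate(keySize);
            key.putLong(0, stored); // unique key so every put adds a new record
            ByteBuffer value = ByteBuffer.allocate(valueSize);
            try {
                put.accept(key, value);
            } catch (RuntimeException e) {
                break; // assumption: any failure here means the store is full
            }
            stored++;
        }
        return stored;
    }

    /** Stand-in store: hypothetical capacity and fixed per-record overhead. */
    static BiConsumer<ByteBuffer, ByteBuffer> fakeStore(long capacity, long perRecordOverhead) {
        long[] used = {0};
        return (k, v) -> {
            long need = k.remaining() + v.remaining() + perRecordOverhead;
            if (used[0] + need > capacity) {
                throw new IllegalStateException("out of space");
            }
            used[0] += need;
        };
    }

    public static void main(String[] args) {
        // 1 MiB fake store, 16-byte keys, 1 KiB values, 64 bytes assumed overhead.
        long n = countUntilFull(fakeStore(1 << 20, 64), 16, 1024, Long.MAX_VALUE);
        System.out.println("records that fit: " + n);
    }
}
```

Run it against a scratch database of the same size and engine as the one you plan to use, since the result depends on both and on the key/value sizes.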

lukaszstolarczuk avatar Apr 22 '21 14:04 lukaszstolarczuk