pogreb

4 billion records max?

Open Kleissner opened this issue 4 years ago • 9 comments

I just realized that index.numKeys is a 32-bit uint, and there's MaxKeys = math.MaxUint32 😲

I think it would make sense to change it to 64-bit. Is there any reason not to support a 64-bit maximum number of records? I assume it would break existing databases, but it still seems necessary.

At the very least, I would suggest clearly stating it as a limitation in the readme.

Our use case is to store billions of records. We've already reached 2 billion records with Pogreb, which means we'll hit the current upper limit in a matter of weeks 😢

Kleissner avatar Jan 17 '21 04:01 Kleissner

Unfortunately, just changing the constant to math.MaxUint64 won't work. Pogreb uses a 32-bit hash function. Storing more than math.MaxUint32 keys without changing the hash function to a 64-bit version would result in a high rate of hash collisions and poor performance. Changing the hash function to a 64-bit version would require changing the internal bucket structure. It would add a 4-byte disk space overhead for each key in the database. I'll consider changing it in the future.

Even storing a billion keys with a 32-bit hash function is not great. The closer to 4 billion you get, the more hash collisions you'll see.
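To put rough numbers on this, here is a minimal sketch that estimates collisions for a 32-bit hash. It assumes the hash is uniformly distributed, which is an idealization, and the function names are made up for illustration; none of this is Pogreb code.

```go
package main

import (
	"fmt"
	"math"
)

// hashSpace is the number of distinct values a 32-bit hash can produce.
const hashSpace = float64(1 << 32)

// expectedCollidingPairs estimates how many pairs of keys share the same
// 32-bit hash, assuming a uniformly distributed hash (birthday estimate).
func expectedCollidingPairs(n float64) float64 {
	return n * (n - 1) / 2 / hashSpace
}

// avgKeysPerHash is the average number of keys that map to one hash value.
func avgKeysPerHash(n float64) float64 {
	return n / hashSpace
}

func main() {
	for _, n := range []float64{1e6, 1e9, 2e9, math.MaxUint32} {
		fmt.Printf("keys=%.0f avg keys per hash=%.3f colliding pairs=%.3g\n",
			n, avgKeysPerHash(n), expectedCollidingPairs(n))
	}
}
```

Under that assumption, the average number of keys per hash value grows linearly with the key count and approaches 1 near math.MaxUint32, while the number of colliding pairs grows quadratically.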

For now, I would recommend sharding the database - running multiple databases.
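A minimal sketch of what sharding could look like on the application side. The `Store` interface, `Sharded` type, and in-memory `memStore` are hypothetical names for illustration; the assumption is that each shard would be a separate Pogreb database opened in its own directory, whose `Put` and `Get` methods have the same shape as the interface below.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Store is the subset of a key-value database this sketch needs.
// Assumption: a real shard would be one opened database per directory.
type Store interface {
	Put(key, value []byte) error
	Get(key []byte) ([]byte, error)
}

// memStore is a hypothetical in-memory stand-in for one database shard.
type memStore map[string][]byte

func (m memStore) Put(key, value []byte) error    { m[string(key)] = value; return nil }
func (m memStore) Get(key []byte) ([]byte, error) { return m[string(key)], nil }

// Sharded routes every key to exactly one of its shards.
type Sharded struct {
	shards []Store
}

// shardFor picks a shard from a 64-bit FNV-1a hash of the key, so the same
// key always lands in the same shard.
func (s *Sharded) shardFor(key []byte) Store {
	h := fnv.New64a()
	h.Write(key)
	return s.shards[h.Sum64()%uint64(len(s.shards))]
}

func (s *Sharded) Put(key, value []byte) error    { return s.shardFor(key).Put(key, value) }
func (s *Sharded) Get(key []byte) ([]byte, error) { return s.shardFor(key).Get(key) }

// newMemSharded builds a Sharded store backed by n in-memory shards.
func newMemSharded(n int) *Sharded {
	s := &Sharded{}
	for i := 0; i < n; i++ {
		s.shards = append(s.shards, memStore{})
	}
	return s
}

func main() {
	db := newMemSharded(4)
	if err := db.Put([]byte("record-hash"), []byte("cached")); err != nil {
		panic(err)
	}
	v, _ := db.Get([]byte("record-hash"))
	fmt.Println(string(v))
}
```

With N shards, each database holds roughly 1/N of the keys, so each stays further below the 32-bit limit. Choosing the shard with a hash that differs from the one the database uses internally avoids concentrating each shard's keys in a slice of its hash space. The shard count must stay fixed, or the keys must be redistributed when it changes.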

Can you tell me more about how you use Pogreb? What is your typical access pattern? Is it write-heavy? What is your average key and value size?

akrylysov avatar Feb 10 '21 02:02 akrylysov

Apologies for the delay. The use case is for https://intelx.io: we store hashes of all of our records in a key-value database, which helps with some internal caching operations. The plan is to update the key-value store every 24 hours, so it would be "write-heavy-once", then read-heavy.

We are still running into the other troubles (the weird disk errors coming from NTFS), but those I can handle and fix myself. If you could upgrade the code to support a 64-bit number of records, that would be great. I believe many other people involved in these kinds of operations would hit the 4 billion record limit fairly quickly as well.

For now, I have shut down the key-value store, as we are dangerously close to 4 billion records and I'm afraid of hash collisions and false-positive lookups.

Kleissner avatar Feb 25 '21 11:02 Kleissner

Thanks for the details! While the database will get slower as it gets close to 4 billion keys, that won't impact correctness, so you don't need to worry about false positives. After doing a hash lookup, Pogreb compares the key to the data in the WAL, so false positives are impossible.
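The idea can be illustrated with a toy index. This is a simplified sketch, not Pogreb's actual bucket or WAL layout; the types and the deliberately bad `constantHash` are invented for the example. The hash only narrows the search to a set of candidates, and a record is returned only if its stored key matches byte for byte.

```go
package main

import (
	"bytes"
	"fmt"
)

// record is a stored key-value pair; a real store would keep it in a data file.
type record struct {
	key, value []byte
}

// index maps a 32-bit hash to the positions of all records with that hash.
type index struct {
	hash    func([]byte) uint32
	slots   map[uint32][]int
	records []record
}

func newIndex(hash func([]byte) uint32) *index {
	return &index{hash: hash, slots: map[uint32][]int{}}
}

// put overwrites the record with an identical key, or appends a new one.
func (ix *index) put(key, value []byte) {
	h := ix.hash(key)
	for _, pos := range ix.slots[h] {
		if bytes.Equal(ix.records[pos].key, key) {
			ix.records[pos].value = value
			return
		}
	}
	ix.records = append(ix.records, record{key, value})
	ix.slots[h] = append(ix.slots[h], len(ix.records)-1)
}

// get scans the candidates for the key's hash and returns a value only when
// the full stored key matches, so a hash collision alone never yields a hit.
func (ix *index) get(key []byte) ([]byte, bool) {
	for _, pos := range ix.slots[ix.hash(key)] {
		if bytes.Equal(ix.records[pos].key, key) {
			return ix.records[pos].value, true
		}
	}
	return nil, false
}

// constantHash makes every key collide: the worst case for performance.
func constantHash([]byte) uint32 { return 42 }

func main() {
	ix := newIndex(constantHash)
	ix.put([]byte("a"), []byte("1"))
	ix.put([]byte("b"), []byte("2"))
	v, ok := ix.get([]byte("b"))
	fmt.Println(string(v), ok)
	_, ok = ix.get([]byte("c"))
	fmt.Println(ok)
}
```

Even with every key colliding, lookups stay correct. The cost of collisions is the extra key comparisons per lookup, which is why performance degrades as the key count grows.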

akrylysov avatar Feb 25 '21 14:02 akrylysov

You can close all the issues that I opened. We stopped using Pogreb earlier this year when all those issues appeared. Unfortunately, the 4 billion limit is an absolute deal breaker for us (we now get 4+ billion new records per month).

The plan was to keep Pogreb running in parallel and switch over once the issues had been solved, but since this hasn't been resolved, I have decided to switch to a different key-value database.

Kleissner avatar Mar 11 '21 14:03 Kleissner

@Kleissner just curious, what are you using now?

derkan avatar Mar 11 '21 16:03 derkan

Yes, the 4B record limit is a deal breaker for me as well. I was hoping to use this instead of bolt, but now I cannot. Any chance of changing this? It is a real limit for people with a large number of items to manage. BTW, this is very impressive work.

gnewton avatar May 13 '21 18:05 gnewton

@derkan we have tried:

  • Postgres: Obviously overkill for just storing key-value pairs.
  • Badger: Buggy, crashes sometimes, corrupts the database. Updates break compatibility. Uses C code.
  • Bolt: No longer actively maintained, suffers from out-of-memory crashes and a corrupted database.
  • Bitcask: High memory usage (more than disk).
  • Pogreb: Pure Go, but no more than 4 billion records supported.

We fell back to Bitcask, but have half abandoned our internal project altogether, since we found no suitable key-value database. Each new run takes a few weeks to recompile the key-value database (since we have billions of records) and is therefore resource- and time-intensive.

Kleissner avatar May 29 '21 15:05 Kleissner

@Kleissner have you checked etcd-io/bbolt? It is a fork of bolt db and is still maintained by the etcd team.

fahmifan avatar Dec 10 '21 14:12 fahmifan

@Kleissner just curious, what are you using now?

Look at PebbleDB. Ethereum Geth uses it as blockchain storage.

artjoma avatar Feb 28 '24 17:02 artjoma