Problem: there's no compact historical state storage

Open yihuang opened this issue 3 years ago • 3 comments

Closes: #704

Solution:

  • Integrate the version store and the streaming service.

PR Checklist:

  • [ ] Have you read the CONTRIBUTING.md?
  • [ ] Does your PR follow the C4 patch requirements?
  • [ ] Have you rebased your work on top of the latest master?
  • [ ] Have you checked your code compiles? (make)
  • [ ] Have you included tests for any non-trivial functionality?
  • [ ] Have you checked your code passes the unit tests? (make test)
  • [ ] Have you checked your code formatting is correct? (go fmt)
  • [ ] Have you checked your basic code style is fine? (golangci-lint run)
  • [ ] If you added any dependencies, have you checked they do not contain any known vulnerabilities? (go list -json -m all | nancy sleuth)
  • [ ] If your changes affect the client infrastructure, have you run the integration test?
  • [ ] If your changes affect public APIs, does your PR follow the C4 evolution of public contracts?
  • [ ] If your code changes public APIs, have you incremented the crate version numbers and documented your changes in the CHANGELOG.md?
  • [ ] If you are contributing for the first time, please read the agreement in CONTRIBUTING.md now and add a comment to this pull request stating that your PR is in accordance with the Developer's Certificate of Origin.

Thank you for your code, it's appreciated! :)

yihuang avatar Sep 27 '22 06:09 yihuang

Codecov Report

Merging #722 (f46daef) into main (be8d53d) will increase coverage by 0.59%. The diff coverage is 35.81%.

@@            Coverage Diff             @@
##             main     #722      +/-   ##
==========================================
+ Coverage   33.99%   34.59%   +0.59%     
==========================================
  Files          28       37       +9     
  Lines        1506     2246     +740     
==========================================
+ Hits          512      777     +265     
- Misses        941     1388     +447     
- Partials       53       81      +28     

| Impacted Files | Coverage Δ |
|---|---|
| versiondb/backend_test_utils.go | 0.00% <0.00%> (ø) |
| versiondb/multistore.go | 0.00% <0.00%> (ø) |
| versiondb/store.go | 0.00% <0.00%> (ø) |
| versiondb/streaming_service.go | 0.00% <0.00%> (ø) |
| versiondb/utils.go | 0.00% <0.00%> (ø) |
| versiondb/tmdb/history.go | 58.13% <58.13%> (ø) |
| versiondb/tmdb/store.go | 68.00% <68.00%> (ø) |
| versiondb/tmdb/iterator.go | 88.00% <88.00%> (ø) |
| versiondb/sync.go | 91.66% <91.66%> (ø) |

codecov[bot] avatar Sep 27 '22 11:09 codecov[bot]

Some numbers from local testing with rocksdb:

| | Space Amplification (file size / data size) | Data Size[^1] | File Size[^2] | KV Pairs[^3] | Space Amplification after manual compaction (file size) |
|---|---|---|---|---|---|
| Iavl | 0.77 | 3368379952 | 2578108675 | 27994442 | 0.73 (2465786706) |
| VersionDB | 0.84 | 566304124 | 473096919 | 5792002 | 0.79 (449149073) |
| - change set | 0.82 | 536711835 | 438836315 | 5591300 | 0.8 (430133115) |
| - history index | 0.54 | 20596897 | 11088590 | 100353 | 0.54 (11135326) |
| - plain db | 2.58 | 8995392 | 23172014 | 100349 | 0.88 (7880632) |

The versiondb/iavl file size ratio is 0.18 (473096919 / 2578108675), i.e. a reduction of roughly 82%.

[^1]: Sum of sizes of all key value pairs.
[^2]: Sum of sizes of db files.
[^3]: Number of key value pairs.
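
For reference, the space amplification and file size ratio figures above can be reproduced from the raw byte counts. This is only a small sketch of the arithmetic; the `dbStats` type and the helper names are made up for illustration and are not part of the PR.

```go
package main

import "fmt"

// dbStats holds the raw numbers reported for one store (hypothetical type).
type dbStats struct {
	name     string
	dataSize uint64 // sum of sizes of all key value pairs, in bytes
	fileSize uint64 // sum of sizes of db files, in bytes
}

// spaceAmplification returns file size divided by data size.
func spaceAmplification(s dbStats) float64 {
	return float64(s.fileSize) / float64(s.dataSize)
}

// fileSizeRatio returns a's file size as a fraction of b's file size.
func fileSizeRatio(a, b dbStats) float64 {
	return float64(a.fileSize) / float64(b.fileSize)
}

func main() {
	// Numbers copied from the table above (before manual compaction).
	iavl := dbStats{"Iavl", 3368379952, 2578108675}
	versiondb := dbStats{"VersionDB", 566304124, 473096919}

	for _, s := range []dbStats{iavl, versiondb} {
		fmt.Printf("%s: space amplification %.2f\n", s.name, spaceAmplification(s))
	}
	fmt.Printf("versiondb/iavl file size: %.2f\n", fileSizeRatio(versiondb, iavl))
}
```

A value below 1 means the files on disk are smaller than the raw key value data, which is what compression gives for most rows here.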

yihuang avatar Sep 28 '22 06:09 yihuang

| Application.db | File Size | KV Pairs | Data Size | Space Amplification |
|---|---|---|---|---|
| Rocksdb, archive | 4.9G | 31938977 | 3.7G | 1.3 |
| Rocksdb, prune=everything | 438M | 1481079 | 182.8M | 2.4 |
| MDBX, archive | 6.5G | | | 1.8 |
| MDBX, prune=everything | 735M | | | 4 |

| VersionDB | File Size | KV Pairs | Data Size | Space Amplification |
|---|---|---|---|---|
| Rocksdb | | | | |
| - Change set | 153.5M | 2082966 | 190.6M | 0.8 |
| - History index | 39M | 300456 | 30.9M | 1.26 |
| - Plain state | 27.4M | 300452 | 25.6M | 1.07 |
| MDBX | | | | |
| - Change set | 218M | | | 1.14 |
| - History index | 72M | | | 2.4 |
| - Plain state | 62M | | | 2.4 |

Just ran some more tests to compare mdbx and rocksdb in terms of db size. Here we don't use the fancy features of mdbx like dupsort; we just use the normal get/set/cursor APIs to fit into the tm-db interface. In terms of db size, rocksdb is not bad. That's kind of expected, since mdbx doesn't do compression at all and we don't use dupsort to compress the key prefixes. It's possible to optimize for mdbx in the versiondb implementation, though.
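
To make the three stores in the tables easier to picture, here is a rough in-memory sketch of how a historical read could work over a change set, a history index, and a plain state. It assumes the change set records the value a key had before each version's write, and that the history index lists the versions at which each key changed; the actual layout in this PR may differ, and all type and method names below are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// prev is the value a key had before a change (hypothetical type).
type prev struct {
	value   string
	existed bool
}

// versionStore is an in-memory stand-in for the three databases.
type versionStore struct {
	plain   map[string]string          // plain state: latest value per key
	history map[string][]uint64        // history index: key -> ascending change versions
	changes map[uint64]map[string]prev // change set: version -> key -> previous value
}

func newVersionStore() *versionStore {
	return &versionStore{
		plain:   map[string]string{},
		history: map[string][]uint64{},
		changes: map[uint64]map[string]prev{},
	}
}

// Set writes key=value at the given version. Versions must be non-decreasing.
func (s *versionStore) Set(version uint64, key, value string) {
	cs, ok := s.changes[version]
	if !ok {
		cs = map[string]prev{}
		s.changes[version] = cs
	}
	// Only the first write of a key within a version records the old value.
	if _, seen := cs[key]; !seen {
		old, existed := s.plain[key]
		cs[key] = prev{old, existed}
		s.history[key] = append(s.history[key], version)
	}
	s.plain[key] = value
}

// GetAt returns the value of key as of the end of the given version.
func (s *versionStore) GetAt(version uint64, key string) (string, bool) {
	vs := s.history[key]
	// Find the first change that happened after the target version.
	i := sort.Search(len(vs), func(i int) bool { return vs[i] > version })
	if i < len(vs) {
		p := s.changes[vs[i]][key]
		return p.value, p.existed
	}
	// No later change: the latest value is still the right one.
	v, ok := s.plain[key]
	return v, ok
}

func main() {
	s := newVersionStore()
	s.Set(1, "a", "x")
	s.Set(3, "a", "y")
	for _, ver := range []uint64{0, 1, 2, 3} {
		v, ok := s.GetAt(ver, "a")
		fmt.Println(ver, v, ok)
	}
}
```

In this arrangement each historical value is stored once, in the change set of the version that overwrote it, which is one plausible reason the change set dominates the file size in the tables.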

yihuang avatar Sep 29 '22 07:09 yihuang

There's a better approach that uses the user-timestamp feature of rocksdb v7: https://github.com/crypto-org-chain/cronos/pull/791

yihuang avatar Dec 27 '22 04:12 yihuang