polkadot DB corrupted: Corruption: force_consistency

The DB gets corrupted and my node gets stuck in a restart loop. The only working solution was to resync, which incurs downtime.

Role: validator
Running on Docker images v0.9.26
Flags: "--validator", "--name=legos-x", "--chain=kusama", "--prometheus-external", "--prometheus-port=9615", "--pruning=1000", "--telemetry-url", "wss://telemetry-backend.w3f.community/submit 1"
Logs:

2022-07-24 05:08:13 DB corrupted: Corruption: force_consistency_checks: VersionBuilder: L6 files are not sorted properly: files #25645794, #25645968. Repair will be triggered on next restart
2022-07-24 05:08:13 GRANDPA voter error: could not complete a round on disk: Database
2022-07-24 05:08:13 Essential task `grandpa-voter` failed. Shutting down service.
Error:
   0: Other: Essential task failed.

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   1: __libc_start_main<unknown>
      at <unknown source file>:<unknown line>

Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
Run with RUST_BACKTRACE=full to include source snippets.
...
...
...
2022-07-24 05:08:16 ⛓  Native runtime: kusama-9260 (parity-kusama-0.tx12.au2)
2022-07-24 05:08:18 DB has been previously marked as corrupted, attempting repair
Error:
   0: Backend error: Corruption: force_consistency_checks: VersionBuilder: L0 file #45913943 with seqno 3226520822 3226530550 vs. file #45914397 with seqno 3226529071 3226530539

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   1: __libc_start_main<unknown>
      at <unknown source file>:<unknown line>

Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
Run with RUST_BACKTRACE=full to include source snippets.

Jul 25 '22 12:07 polkalegos

We also experience this issue; in our case the underlying cause tends to be out-of-memory errors. It would be ideal if those errors were caught and the database closed cleanly when they occur. We run archive nodes, so a re-sync is a huge problem: the DB will soon reach 500 GB.

Jul 25 '22 14:07 rvalle

CC @arkpar

Jul 26 '22 06:07 sandreim

Same error on v0.9.29

Oct 02 '22 12:10 polkalegos

This looks like a RocksDB issue. I suggest switching to --database=paritydb.
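
For illustration, this is roughly how the flag list from the original report would look with the suggested option added. It is only a sketch: every other flag is copied unchanged from the report, and the position of the new flag is arbitrary. One assumption to check before trying it: as far as I know, existing RocksDB data is not converted to the new backend, so the node would most likely need to sync again after the switch.

```
"--validator", "--name=legos-x", "--chain=kusama", "--database=paritydb", "--prometheus-external", "--prometheus-port=9615", "--pruning=1000", "--telemetry-url", "wss://telemetry-backend.w3f.community/submit 1"
```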

Oct 03 '22 09:10 arkpar

Isn't that the alternative database that has not been recommended so far? @arkpar

Oct 03 '22 13:10 polkalegos

ParityDB is not experimental anymore, so you can use it.

Oct 04 '22 08:10 bkchr