
Data corruption testing for Kura

Open SamHSmith opened this issue 2 years ago • 3 comments

What do you think we should be fault tolerant against? I think it's safe to assume that the storage will be reliable. If we want to run on a platform where it is uncertain whether we can read data back, that should be handled inside the platform-specific block store.

I think worrying about sudden power loss is not our problem. That's why you buy a UPS for your servers, or have your own power plant for your data center. That leaves us with worrying about sudden shutdown due to an EMP. If that happens, you have way bigger problems and the data is probably ruined anyway.

What we do need is testing for the case of corrupted data on disk. Maybe we screwed it up due to sudden shutdown or it was just bitrot. Doesn't matter. We should not load a blockchain unless it is in perfect order, or we can recover the data fully.

Originally posted by @SamHSmith in https://github.com/hyperledger/iroha/pull/2397#discussion_r909504048

SamHSmith avatar Jun 29 '22 11:06 SamHSmith

Do we have any requirements for fault tolerance? What should we take as a foundation before testing and fixing? Perhaps there is some other reference blockchain or something like that? @SamHSmith @appetrosyan

AlexStroke avatar Nov 14 '22 15:11 AlexStroke

Basically, use an Iroha node to write multiple blocks to a block store. Take it offline and corrupt different data in the block store. When you restart the node, it should get the longest possible uncorrupted chain of blocks from the block store and throw away the rest.
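
To make the expected behaviour concrete, here is a rough sketch of the "keep the longest uncorrupted prefix" rule. It is not Kura's actual code: the `Block` layout, `block_hash`, `valid_prefix_len` and `recover` are all hypothetical names made up for illustration, and FNV-1a stands in for a real cryptographic hash only to keep the example dependency-free.

```rust
/// Hypothetical, simplified block. The real Kura on-disk format is not
/// described in this thread, so this layout is an assumption.
#[derive(Clone, Debug, PartialEq)]
struct Block {
    height: u64,
    prev_hash: u64,
    payload: Vec<u8>,
    hash: u64,
}

/// FNV-1a: a stand-in for a cryptographic hash (assumption, sketch only).
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Hash over height, previous hash and payload.
fn block_hash(height: u64, prev_hash: u64, payload: &[u8]) -> u64 {
    let mut buf = Vec::with_capacity(16 + payload.len());
    buf.extend_from_slice(&height.to_le_bytes());
    buf.extend_from_slice(&prev_hash.to_le_bytes());
    buf.extend_from_slice(payload);
    fnv1a(&buf)
}

/// Build a well-formed chain, playing the role of the node writing blocks.
fn make_chain(payloads: &[&str]) -> Vec<Block> {
    let mut chain = Vec::new();
    let mut prev_hash = 0u64;
    for (i, p) in payloads.iter().enumerate() {
        let height = i as u64 + 1;
        let payload = p.as_bytes().to_vec();
        let hash = block_hash(height, prev_hash, &payload);
        chain.push(Block { height, prev_hash, payload, hash });
        prev_hash = hash;
    }
    chain
}

/// Number of leading blocks that form a valid chain: heights are consecutive
/// from 1, each block links to its predecessor, and each stored hash matches.
fn valid_prefix_len(blocks: &[Block]) -> usize {
    let mut prev_hash = 0u64;
    for (i, b) in blocks.iter().enumerate() {
        let ok = b.height == i as u64 + 1
            && b.prev_hash == prev_hash
            && b.hash == block_hash(b.height, b.prev_hash, &b.payload);
        if !ok {
            return i;
        }
        prev_hash = b.hash;
    }
    blocks.len()
}

/// Keep the longest uncorrupted prefix and throw away the rest.
fn recover(mut blocks: Vec<Block>) -> Vec<Block> {
    let n = valid_prefix_len(&blocks);
    blocks.truncate(n);
    blocks
}

fn main() {
    let mut stored = make_chain(&["a", "b", "c", "d"]);
    stored[2].payload[0] ^= 0x01; // simulate bitrot in the third block
    let recovered = recover(stored);
    println!("kept {} of 4 blocks", recovered.len()); // prints "kept 2 of 4 blocks"
}
```

A test along these lines would corrupt different fields (payload, stored hash, link to the previous block, height) at different positions and check that the node comes back up with exactly the blocks before the first damaged one.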

SamHSmith avatar Nov 30 '22 08:11 SamHSmith

Created 3 test cases in Allure TestOps: https://soramitsu.testops.cloud/project/100/test-cases/2245?treeId=199

AlexStroke avatar Apr 16 '24 10:04 AlexStroke