filecoin-sealer-recover

Sector (No.) , running PreCommit2 error: sealed cid mismatching!!!

Open Shekelme opened this issue 3 years ago • 12 comments

The following error occurs on some recovered sectors:

2021-12-25T12:02:35.621 INFO storage_proofs_core::data > dropping data /media/sn750/recover-57853956948449/sealed/s-t01222595-5785
2021-12-25T12:02:38.745 INFO filecoin_proofs::api::seal > seal_pre_commit_phase2:finish
2021-12-25T12:02:38.745 INFO filcrypto::proofs::api > seal_pre_commit_phase2: finish
INFO[2021-12-25T12:02:38+03:00] Complete PreCommit2, sector ({1222595 5785})
ERRO[2021-12-25T12:02:38+03:00] Sector (5785) , running PreCommit2 error: sealed cid mismatching!!! (sealedCID: bagboea4b5abcb42beroeboobypwj2xvsg2ryc5xttathfol3tre3cioidvpjyhb5, newSealedCID: bagboea4b5abcbtgdq542zczhlmvif7ca5brg374gg5oo5hxfbc6xyjgiuazrpuao)
INFO[2021-12-25T12:05:20+03:00] Complete sector (5785)
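
When many sectors are recovered in one run, it can help to pull the failing ones out of the log automatically. The following is a minimal sketch, assuming the error line always has the shape quoted in this report; the names `MISMATCH_RE` and `find_mismatches` are made up for illustration and are not part of filecoin-sealer-recover, and other versions of the tool may log differently.

```python
import re

# Pattern modelled on the ERRO line quoted in this issue (an assumption:
# other versions of the tool may word the message differently).
MISMATCH_RE = re.compile(
    r"Sector \((?P<sector>\d+)\)\s*, running PreCommit2 error: "
    r"sealed cid mismatching!!! "
    r"\(sealedCID: (?P<expected>\w+), newSealedCID: (?P<actual>\w+)\)"
)


def find_mismatches(log_text):
    """Return one dict per 'sealed cid mismatching' error found in log_text."""
    return [
        {
            "sector": int(m["sector"]),
            "expected": m["expected"],  # sealedCID, as printed by the tool
            "actual": m["actual"],      # newSealedCID produced by the recovery run
        }
        for m in MISMATCH_RE.finditer(log_text)
    ]


# Error line copied from the log in this issue.
SAMPLE = (
    "ERRO[2021-12-25T12:02:38+03:00] Sector (5785) , running PreCommit2 error: "
    "sealed cid mismatching!!! "
    "(sealedCID: bagboea4b5abcb42beroeboobypwj2xvsg2ryc5xttathfol3tre3cioidvpjyhb5, "
    "newSealedCID: bagboea4b5abcbtgdq542zczhlmvif7ca5brg374gg5oo5hxfbc6xyjgiuazrpuao)"
)

if __name__ == "__main__":
    for hit in find_mismatches(SAMPLE):
        print(hit["sector"], hit["expected"], hit["actual"])
```

The resulting list of sector numbers can then be fed into a second recovery attempt or reviewed by hand.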

What causes it and how to fix it?

Shekelme avatar Dec 25 '21 09:12 Shekelme

Interesting, I have never seen this before. @FroghubMan, can you help here?

dayou5168 avatar Dec 27 '21 09:12 dayou5168

We have found the same problem: about 1% of sectors cannot be recovered correctly. The root cause has not been identified, but we suspect it occurs when the original seal produced a wrong result (a low-probability event). The feedback in #5 and #8 describes the same thing.

FroghubMan avatar Dec 27 '21 09:12 FroghubMan

I am trying to do a re-recovery for such sectors, but it does not help in all cases.

Shekelme avatar Dec 27 '21 09:12 Shekelme

@Shekelme Can you provide your miner ID and sector number? Maybe we can try a recovery test.

dayou5168 avatar Dec 27 '21 09:12 dayou5168

The numbers are all in the OP ) Miner 1222595, sector 5785. But also 7370 and 13197, for example.

Shekelme avatar Dec 27 '21 10:12 Shekelme

Looks great.

dayou5168 avatar Dec 27 '21 10:12 dayou5168

If a small number of sectors cannot be recovered after repeated attempts, it is recommended to terminate as soon as possible.
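
As a rough illustration of that policy, here is a minimal sketch of a bounded retry loop. It assumes a hypothetical `recover(sector)` callable that returns `True` when the recovered sealed CID matches the expected one; the function name and the `max_attempts` default are illustrative and do not come from filecoin-sealer-recover.

```python
def recover_with_retries(sectors, recover, max_attempts=3):
    """Try each sector up to max_attempts times.

    recover(sector) is a hypothetical callable that returns True when the
    recovered sealed CID matches the expected one.
    Returns (recovered, to_terminate).
    """
    recovered, to_terminate = [], []
    for sector in sectors:
        # any() stops at the first successful attempt.
        if any(recover(sector) for _ in range(max_attempts)):
            recovered.append(sector)
        else:
            # Still mismatching after every attempt: candidate for termination.
            to_terminate.append(sector)
    return recovered, to_terminate
```

Sectors that end up in `to_terminate` are the ones the advice above applies to.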

FroghubMan avatar Dec 27 '21 10:12 FroghubMan

I am very curious: what kind of ZFS failure caused the sector data loss?

FroghubMan avatar Dec 27 '21 10:12 FroghubMan

The numbers are all in the OP ) Miner 1222595, sector 5785. But also 7370 and 13197, for example.

My worker machines have been very busy in recent days, so I may not be able to help you.

FroghubMan avatar Dec 27 '21 10:12 FroghubMan

And a fresh one: 6028.
For the ZFS failure: link

Shekelme avatar Dec 27 '21 10:12 Shekelme

I found that the failure rate is greater than 1%. I have tested many sectors, and almost none of them can be recovered. I still can't find the reason. I even modified the code myself, removing the nodeapi call and passing in the ticket manually. In either case, the recovered CID is incorrect.

s1mple1122 avatar Jan 20 '22 02:01 s1mple1122

@s1mple1122 If a large share of your sectors can't be recovered, you should check your chain data source. Maybe try using a full node.
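
One way to check the chain data source is to ask both the current source and a full node for the same sector and see whether they report the same sealed CID. The sketch below assumes the nodes expose the Lotus JSON-RPC method `Filecoin.StateSectorGetInfo` and return the sealed CID as `result["SealedCID"]["/"]`; that is my understanding of the Lotus API, so verify it against your node version. The URLs and helper names are hypothetical, and some nodes may also require an auth token, which is not handled here.

```python
import json
import urllib.request


def build_request(miner, sector):
    """JSON-RPC payload for Filecoin.StateSectorGetInfo.

    The third parameter is the tipset key; None (JSON null) is assumed
    to mean the node's current head.
    """
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "Filecoin.StateSectorGetInfo",
        "params": [miner, sector, None],
    }


def sealed_cid(response):
    """Extract the sealed CID string, or None if the node returned no sector info."""
    result = response.get("result")
    if not result:
        return None
    return result["SealedCID"]["/"]  # assumed response shape


def post_json(url, payload, timeout=30):
    """POST a JSON payload and decode the JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def compare_nodes(urls, miner, sector, post=post_json):
    """Map each node URL to the sealed CID it reports for the sector."""
    payload = build_request(miner, sector)
    return {url: sealed_cid(post(url, payload)) for url in urls}
```

For example, `compare_nodes(["http://127.0.0.1:1234/rpc/v0", "http://full-node.example:1234/rpc/v0"], "f01222595", 5785)` would return a dict keyed by URL (both URLs are placeholders). A `None` value, or values that differ between nodes, would point at the data source rather than at the sealing step.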

dayou5168 avatar Jan 20 '22 08:01 dayou5168