Forest mpool should handle chain reorganisation

Open elmattic opened this issue 2 years ago • 2 comments

Issue summary

This issue is related to https://github.com/ChainSafe/forest/issues/2726. The root cause may be the same, although this is not yet proven. In both cases, sending a message results in a failure.

For this issue, we would like to run the node and keep sending transactions until one of them coincides with a fork.

Looking at past CI jobs for send failures, I could only find two:

  • https://github.com/ChainSafe/forest/actions/runs/6573803867/job/17858117090?pr=3609
  • https://github.com/ChainSafe/forest/actions/runs/6533513531/job/17739353068

The forest daemon errors are slightly different but close enough (a fork was also happening in the second case).

Task summary

  • [ ] Write a script that sends messages in an infinite loop to reproduce the failure. Beware that this can take some time; I managed to reproduce the issue only after around 1500 runs. See https://github.com/ChainSafe/forest/issues/2726#issuecomment-1740597704
  • [ ] We would like our wallet testing script to be more verbose. It would help to print the whole JSON payload of the message, plus the date and time at which the message is sent. This could help us better understand the failure, as sending the message occasionally fails here.
  • [ ] Find out what needs to be done regarding our implementation of the message pool. We should be able to resend pending messages in case of a reorg.
  • [ ] Write one or more unit tests covering a chain reorg scenario.
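
To make the reorg task more concrete, here is a minimal sketch of what "resend pending messages in case of a reorg" could mean. It is not Forest's or Lotus's actual code: `Msg`, `Pool`, and `head_change` are hypothetical names, messages are reduced to a sender and a nonce, and the assumption is that the pool is told which tipsets were reverted and which were applied. Messages that were on the abandoned fork and are not included on the new chain go back into the pending set.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical, simplified message; real messages carry many more fields.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Msg {
    from: String,
    nonce: u64,
}

/// Hypothetical pool of pending messages, keyed by sender, then by nonce.
#[derive(Default)]
struct Pool {
    pending: HashMap<String, BTreeMap<u64, Msg>>,
}

impl Pool {
    fn add(&mut self, msg: Msg) {
        self.pending
            .entry(msg.from.clone())
            .or_default()
            .insert(msg.nonce, msg);
    }

    fn remove(&mut self, from: &str, nonce: u64) {
        let now_empty = match self.pending.get_mut(from) {
            Some(by_nonce) => {
                by_nonce.remove(&nonce);
                by_nonce.is_empty()
            }
            None => false,
        };
        // Drop the sender entry once it has no pending messages left.
        if now_empty {
            self.pending.remove(from);
        }
    }

    fn contains(&self, from: &str, nonce: u64) -> bool {
        self.pending
            .get(from)
            .map_or(false, |by_nonce| by_nonce.contains_key(&nonce))
    }

    /// `reverted`: messages of each tipset dropped by the reorg.
    /// `applied`: messages of each tipset on the new chain.
    fn head_change(&mut self, reverted: &[Vec<Msg>], applied: &[Vec<Msg>]) {
        // Collect every message that was included on the abandoned fork.
        let mut orphaned: HashMap<(String, u64), Msg> = HashMap::new();
        for msg in reverted.iter().flatten() {
            orphaned.insert((msg.from.clone(), msg.nonce), msg.clone());
        }
        // Messages included on the new chain are no longer pending.
        for msg in applied.iter().flatten() {
            self.remove(&msg.from, msg.nonce);
            orphaned.remove(&(msg.from.clone(), msg.nonce));
        }
        // Whatever remains was un-mined by the reorg: make it pending again.
        for (_, msg) in orphaned {
            self.add(msg);
        }
    }
}
```

A unit test for the reorg scenario could then revert a tipset containing two messages, apply a tipset containing only the first, and check that only the second is pending afterwards.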

Acceptance Criteria

  • [ ] All unit tests are passing.
  • [ ] The stress script is unable to reproduce the error after running for several days.
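
As a rough illustration of the stress script's control flow, the sketch below repeats a send until the first failure and logs the attempt number and a Unix timestamp for each try. Everything here is an assumption for illustration: `run_until_failure` and `Failure` are hypothetical names, and the `send` closure stands in for whatever actually submits the message (for example, shelling out to the wallet CLI and printing the JSON payload).

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Details of the first failed send (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct Failure {
    attempt: u64,
    unix_secs: u64,
    error: String,
}

/// Calls `send` up to `max_attempts` times and stops at the first error.
/// Pass `u64::MAX` to approximate an infinite loop.
fn run_until_failure<F>(max_attempts: u64, mut send: F) -> Option<Failure>
where
    F: FnMut(u64) -> Result<(), String>,
{
    for attempt in 1..=max_attempts {
        // Seconds since the Unix epoch; falls back to 0 if the clock is
        // set before the epoch.
        let unix_secs = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);
        // Log before sending so the timestamp survives a hang or crash.
        eprintln!("[{unix_secs}] attempt {attempt}: sending");
        if let Err(error) = send(attempt) {
            eprintln!("[{unix_secs}] attempt {attempt}: FAILED: {error}");
            return Some(Failure { attempt, unix_secs, error });
        }
    }
    None
}
```

Returning the failure instead of exiting lets the caller dump extra diagnostics, such as the daemon log around the recorded timestamp, before stopping.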

Other information and links

Lotus mpool implementation. Lotus devs also confirmed that their message pool can handle chain reorgs.

Once we have fixed this issue, we can try to enable the balance check again in the calibnet wallet check script.

elmattic, Oct 24 '23 13:10