raiden Raiden event handling needs reviewing

Consider the following blockchain transactions:

Channel is closed
Channels is settled

If both of the blockchain events for the above transactions are fetched in the same batch, then:

Because of channel close the update transfer event may be emitted.
The events are processed in the order they are generated
The handling of the update transfer event assumes that the channel exists when it is processed
However, because of the settle the channel could have been cleared, invalidating the assumption.

The above would result raise an unrecoverable error.

The fix for the above, validation for the ContractSend events using the same rules as the preconditions checks is necessary. (So, for the case above, since the channel state was after close, then it should at most be a recoverable error)

Related: #6444

Aug 18 '20 15:08 hackaugusto

After some thought, there are multiple approaches that can be taken to fix this:

Give the correct copy of the chain_state to the event handler. The bug describe above happens because the handler is called with the state after the whole batch is processed, while it should be given the state emitted with the event itself.
- Upsides:
  - This should be easy to understand, and was the original implementation. However this was unintentionally changed to improve the performance of the system.
- Downsides:
  - This would remove the optimization which allows us to make a single deep copy per batch. This may or may not be a performance problem since deepcopy was "optimized".
  - All events would be processed, which can result in unnecessary blockchain transactions.
As described in the initial text of the issue, the same rules used for preconditions checks should be applied to the event handler. This would mean the code has to be reviewed and the handler has to be aware that the state may have changed before the handler being called.
- Upsides:
  - The single deepcopy per batch would still be allowed, so there would be no performance hit.
- Downsides:
  - This is generally harder to reason about, and I believe it will be a fragile solution.
Stop handling the ContractSend events returned by the dispatch, instead, the code should react to changes to the state itself. This is roughly the behavior that ready_to_process_events tries to achieve, but there it only works during startup. The idea would be to stop handling the events emitted by the dispatch, but instead to look at the node's queues to check if there are new transactions to be sent, only processing the ContractSend and Send events that are in the queues.
- Upsides:
  - The flag ready_to_process_events would be removed, since processing events would not happen automatically anymore. This has the benefit of having only one way of doing event processing.
  - Transactions that are added and removed from the queue in the same batch would not be sent anymore, since the state machine already has code to cleanup the transaction queue when necessary. This would not only be more optimal, but it would also fix the bug mentioned above, since the transaction would be sent.

I think the third option is the best, and it should be done for both protocol messages and transactions.

Aug 20 '20 14:08 hackaugusto

Further discussion:

Don't emit ContractSend and SendMessage events from the state transition, only store them in the State
Make the queues (pending_transactions and queueids_to_queues) ordered and versioned, so that changes can be queried easily
Trigger effects based on those queues
Remaining events are informational only

Sep 08 '20 13:09 palango

raiden raiden copied to clipboard

Raiden event handling needs reviewing

raiden
raiden copied to clipboard