elixir-omg
Show all byzantine events in `/status.get`
As a good actor in Plasma, I can see all byzantine exit events at once, so that I can challenge them easily (and make that bond money! 🤑)
From the Samrong incident, it seems that the `byzantine_events` array is returning a limited number of events. Once those events are challenged, the watcher requires a restart to pick up additional events.
During the incident, there were 5 unchallenged_exits. The cycle of [status.get unchallenged exits, challenge, restart watcher] happened 3 times to clear all the events.
It would be helpful to get a full list of `invalid_exits` and `unchallenged_exits` from a single call to `status.get`, without having to restart.
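To make the request concrete, here is a minimal sketch of how a client might split the events from one `status.get` response into per-event lists. The payload shape (a `data.byzantine_events` list of objects with `event` and `details` keys), the sample values, and the helper name `group_byzantine_events` are assumptions made for illustration, not a confirmed description of the watcher's API.

```python
from collections import defaultdict


def group_byzantine_events(status_response):
    """Group events from a status.get-style response by event name.

    Assumed (hypothetical) shape:
    {"data": {"byzantine_events": [{"event": "...", "details": {...}}]}}
    """
    events = status_response.get("data", {}).get("byzantine_events", [])
    grouped = defaultdict(list)
    for item in events:
        grouped[item.get("event", "unknown")].append(item.get("details", {}))
    return dict(grouped)


# Made-up sample payload, only to exercise the helper.
sample = {
    "success": True,
    "data": {
        "byzantine_events": [
            {"event": "invalid_exit", "details": {"utxo_pos": 1000000000}},
            {"event": "unchallenged_exit", "details": {"utxo_pos": 2000000000}},
            {"event": "unchallenged_exit", "details": {"utxo_pos": 3000000000}},
        ]
    },
}

grouped = group_byzantine_events(sample)
print({name: len(details) for name, details in grouped.items()})
```

A challenger script could then iterate over `grouped.get("unchallenged_exit", [])` first, since those are the events that stall the watcher.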
Expanding on the comment left in https://github.com/omisego/devops/issues/100#issuecomment-504457243.
> it seems that the `byzantine_events` array is returning a limited number of events
For the sake of clarity - it is not limiting the number of events in any way. It always returns all the events it knows about. The problem is in the cycle described:
> During the incident, there were 5 unchallenged_exits. The cycle of [status.get unchallenged exits, challenge, restart watcher] happened 3 times to clear all the events.
The cycle arose because block syncing halted due to the `unchallenged_exit` condition. At every "moment" of the cycle, the full list of events was returned, according to the watcher's current "state of knowledge".
It is impossible to discern the invalidity of the exits without pulling the invalidating blocks, and that pulling is stalled because the watcher is in the `unchallenged_exit` state and not pulling new blocks. The "not pulling blocks" part is the basic measure implemented to protect against corrupt ledger state and to prevent users from sending/receiving money via plasma.
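The stall described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: `ToyWatcher`, the exit ids, and the block numbers are all invented, an invalid exit is treated as unchallenged as soon as its invalidating block is pulled, and calling `sync` again stands in for a watcher restart. It is not the watcher's actual logic.

```python
class ToyWatcher:
    """Toy model: syncing stops while any known invalid exit is unchallenged."""

    def __init__(self, invalidating_blocks):
        # exit id -> number of the child chain block that proves the exit invalid
        self.invalidating_blocks = dict(invalidating_blocks)
        self.synced_height = 0
        self.challenged = set()

    def status_get(self):
        # Everything the watcher can know given the blocks pulled so far.
        return sorted(
            exit_id
            for exit_id, blknum in self.invalidating_blocks.items()
            if blknum <= self.synced_height and exit_id not in self.challenged
        )

    def sync(self, blocks):
        for blknum in blocks:
            if self.status_get():
                break  # unchallenged exit known: stop pulling new blocks
            if blknum > self.synced_height:
                self.synced_height = blknum

    def challenge(self, exit_ids):
        self.challenged.update(exit_ids)


def run_cycle(watcher, blocks):
    """Repeat [sync (restart), status_get, challenge] until nothing is reported."""
    rounds = []
    while True:
        watcher.sync(blocks)
        reported = watcher.status_get()
        if not reported:
            return rounds
        rounds.append(reported)
        watcher.challenge(reported)


# Invented data: five invalid exits whose invalidating blocks are spread out.
exits = {"exit_a": 1000, "exit_b": 1000, "exit_c": 2000, "exit_d": 3000, "exit_e": 3000}
watcher = ToyWatcher(exits)
rounds = run_cycle(watcher, [1000, 2000, 3000, 4000])
print(rounds)
```

In this model each round reports every exit the watcher can know about, yet clearing all five still takes three rounds, because the later invalidating blocks are only pulled once the earlier exits have been challenged.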
The cycle of `status.get`, `challenge`, `restart` is formally due to not following the protocol - the chain should be exited from at the first instance of `unchallenged`.
The next thing to keep in mind is the conscious decision not to let the Watcher "get out" of a "call to mass exit" condition automatically (e.g. when the late invalid exit gets challenged after all), in order to avoid false positives (discussed ~2 months ago).
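The "no automatic way out" decision behaves like a latch. A minimal sketch, assuming a hypothetical `MassExitLatch` class and an explicit operator action (modelled here as `operator_restart`) as the only way to clear it; the real Watcher's internals may differ:

```python
class MassExitLatch:
    """Hypothetical latch: once tripped, it stays tripped until an operator acts."""

    def __init__(self):
        self.tripped = False

    def observe(self, unchallenged_exits):
        # Trip on any unchallenged exit; never clear automatically,
        # even if a later observation shows the exit was challenged after all.
        if unchallenged_exits:
            self.tripped = True
        return self.tripped

    def operator_restart(self):
        # Assumption: only a deliberate manual step resets the condition.
        self.tripped = False


latch = MassExitLatch()
first = latch.observe(["late_exit"])   # condition raised
second = latch.observe([])             # exit challenged late: latch still set
latch.operator_restart()
third = latch.observe([])              # clear only after the manual step
print(first, second, third)
```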
So, having said that we have the following options:
1/ live with it - "this should never happen" - let's rather focus on being vigilant about invalid exits and implementing the auto-challenger quickly to minimize impact. Also note that this is testnet specific: outside of testnet we will follow the protocol (exit instead of trying to rescue the chain), and the exit periods are much longer
2/ loosen up the logic behind `unchallenged_exits` - potentially dangerous - e.g. allow the watcher to "pop back" to validity and continue
3/ try to implement pro-modes to have (2/) but only opt in, just for our convenience.
I'm inclined towards (1/) as the clean solution. I don't like (2/) too much. (3/) would be some form of compromise, but one I fear would end up being abused and would also decrease the pressure to get the challenges right and on time, which is a needed pressure.
Oh and I'd also love to keep the logic behind the `unchallenged_exit` condition as simple as possible: have it only as the last-resort safety switch that never gets pulled, but will work if needed.
Shall we stick with (1/) on this one and close it with the `wontfix` label?