ra
ra copied to clipboard
Avoid writing unsupported commands to log
Currently there is no mechanism to protect the ra machine from the scenario where a client writes an unsupported command to the log. When it tries to apply this command the server will crash, irrecoverably (unless support is implemented in the the current machine version). this is particularly tricky for commands that have changed in an unsupported manner, e.g. an additional field in a record. Such changes could well happen during machine upgrades.
There are two possible approaches that I see:
-
We implement an
is_supported/1optional callback inra_machine- before any command is written to the log we call this on the current machine version module returning a boolean. If we returnfalsethe command will either be dropped (if pipelined) or an error will be returned to the caller. -
We add versioning to all commands such that we can filter any commands greater than the current effective machine version. This breaks the external commands api so for me this is less favoured.
Another alternative would be to simply catch any errors thrown by apply/3 - at least this would be acceptable for pure state machines as there are no environmental factors to take into account and each member will run the same code and experience the same error. We could optionally add a machine callback to handle the error. And we can return the error to the caller although we need to be careful as state machines can get very large and the errors that go with them may include the full state.
@dumbbell @acogoluegnes @dcorbacho - Any thoughts on the above options?
About command versioning (idea 2), do you mean the code calling Ra would pass a version with each command, as a new argument? Yeah, I feel this is the responsibility the user of Ra and author of the Ra machine to take care of that, not Ra.
I prefer your idea of reporting a crash somehow. The Ra API should only return the class, reason and stacktrace of the exception, not the state which could be large and secret. Perhaps the exception and the state could be optionaly logged.
I also vote for the crash, I think the is_supported is going to be tedious and prone to omissions and errors.
Interesting as I personally would prefer is_supported as it would allow us to return a specific error message to caller which may then choose to send a different command, say from a previous version. I just thought of yet another option!
apply/3 implementations should have a "catch-all" clause and return their own errors to callers if they chose to. This doesn't require any changes. :)
Then I think is_supported should be mandatory, easier to mantain and carry over between versions.