sismic icon indicating copy to clipboard operation
sismic copied to clipboard

Persisting / restoring interpreter state

Open sbidoul opened this issue 5 years ago • 20 comments

Hi,

I'm looking for a mechanism to persist / restore interpreter state.

My current approach is to persist Interpreter._configuration to database. To restore it I instantiate Interpreter wit a restored initial context, then set _configuration obtained from database. Of course, persisting _memory would be needed too if using history states.

Obviously, I'm using an non public API by accessing/restoring Interpreter._configuration directly.

This worked well until 1.3. It breaks with sismic 1.4 because of the TimedContextEvaluator which has it's own copy of the states _configuration. Since I don't restore it, it starts with an empty _configuration and it breaks when exiting state at https://github.com/AlexandreDecan/sismic/blob/b2f07086424f7f3dc8f92791242f9522f0f49fcf/sismic/code/context.py#L72.

Pickle is not an option for us for many reasons, mostly because it does not lend itself to data migration (if we evolve the state machine, it is possible to update _configuration during data migration, while updating a pickle is not practical).

So I'd like to discuss if pausing a state machine by persisting it to database makes sense as a use case for Sismic? If yes, what could be a stable API for persisting/restoring interpreter state?

sbidoul avatar Jan 14 '19 14:01 sbidoul

Hello,

Saving and loading statecharts at runtime is not part of what Sismic offers at the moment. I do understand how interesting this can be, but providing official support for this feature will require a lot of effort, especially if you want to be complete in this support (property statecharts, listeners, code evaluator, context providers, etc.).

Recent changes made to "isolate" some code evaluator features have indeed made workarounds more complex to implement (on the contrary, they greatly facilitate support for other languages or other code evaluator features).

A possible solution is to implement a "load/save protocol" for each object involved in the execution of the statechart. This protocol would consist in collecting, for each "component" of the execution, the variables defining its state and returning them (probably based on lists and dictionaries, in order to facilitate its export to formats such as json or yaml for example). If this is possible for objects such as the interpreter or code evaluator, it is more delicate for listeners who can be attached to the interpreter at runtime (and this includes, unfortunately, the context providers).

In your case, a possible workaround would be to override the value of _configuration in the TimeContextProvider. This provider is stored in the _time_provider variable of a PythonEvaluator.

Another possibility if the statechart is not doing any kind of interaction with "external objects" is to "replay" the sequence of events that lead to its current state. This can be done by recording all the events that are sent (and the time at which they are sent), either manually (e.g. using the trace provided by helpers.log_trace) or by creating a new listener that will be attached (see Interpreter.attach) and will store the "event consumed" meta-events (and the current time) sent to the statechart being executed.

It should also be possible to subclass a PythonEvaluator and to remove/mock its context providers if their features are not required by your use cases, but I wouldn't recommend this approach ;)

AlexandreDecan avatar Jan 14 '19 15:01 AlexandreDecan

Thanks a lot for the quick and detailed answer.

Replaying won't work because actions are typically not replayable (due to interactions with other statecharts or external systems or other database objects).

From my part I feel the main part of the load/save protocol is obviously states and memory. The context load/save protocol can be a noop and left to be provided by the user (a documented way to provide it is needed though). I need to learn more about the other context providers and listeners.

sbidoul avatar Jan 14 '19 15:01 sbidoul

The context providers are used by the code evaluator to help providing e.g. time-related predicates in the context that is exposed by the code evaluator. Previously, all these predicates were directly computed and executed by the code evaluator (making them not "portable" to other code evaluators).

The context providers are created by the code evaluator upon instanciation (usually, by the interpreter at creation time). These context providers are then attached to the interpreter so they are notified about what happens during execution (e.g. which are the states that are entered, exited, what is the current time, and so on). Due to this "attachment" process, not only their "internal memory" will have to be restored, but they also need to be attached to the interpreter as well (it's a kind of dependency injection). I don't see how we can circumvent this in the current implementation, unless we provide some additional parameters to the code evaluator to "disable" some context providers (so they won't trigger errors like the ones you got, but their predicates won't be available during execution :-/)

AlexandreDecan avatar Jan 14 '19 16:01 AlexandreDecan

I was thinking about disabling the TimeContextProvider, but I notice it provides an important active predicate which I need (and that is somewhat unrelated to time at first glance).

The EventContextProvider seems to be less of an issue, because its state seems to be reset when the state machine stabilizes (which is when we persist it).

sbidoul avatar Jan 14 '19 17:01 sbidoul

I would say that if you're not afraid of digging into Sismic internals, the "easiest" way (the more straightforward one) is probably to manually override the content of _configuration in the TimeContextProvider that is associated to the code evaluator.

I don't see any easy way to quickly provide a load/save option for an interpreter right now, and I won't really have time to consider this during the next few weeks (probably not before mid February to be honest). Because of the modular nature of Sismic, many informations related to the execution are spread in different objects at runtime, making it difficult to provide such support (and even if we manage to have a load/save feature, it won't be easy to allow statechart transformations in between).

To support both use cases (load/save and changing the active configuration), I think I'll need to remove the context providers and find another way to "isolate" them so they can be reused in other code evaluators as well. Or maybe directly support them in an Interpreter, so that the whole current state of the execution is in a single place.

AlexandreDecan avatar Jan 14 '19 19:01 AlexandreDecan

Short term the easiest is to pin my dependency at 1.3 :) No urgency at all, thanks for all the feedback. Happy to discuss further later.

sbidoul avatar Jan 14 '19 20:01 sbidoul

I quickly tried to move the internals required by time and event-related predicates to the Interpreter instead of context providers. It was quite easy to make the current PythonEvaluator work with this approach, and tests are not failing (at least on my computer, I'm still waiting for Travis for an in-depth testing).

Could you please have a look at the corresponding branch? https://github.com/AlexandreDecan/sismic/compare/no-context-provider?expand=1

I think it preserves the behaviour you had with 1.3. Basically, I added three new private attributes to an Interpreter, namely _entry_time, _idle_time and _sent_events. The two first ones store the time when a state is entered and when a state has one of its transition that is triggered. The last one captures all the events that are sent during the current macrostep.

To "freeze" an interpreter, assuming you do not rely on contracts (or at least, not using the __old__ variable that is still in PythonEvaluator), it should be sufficient to store the value of _memory (if history states are used), _statechart (or to provide a similar one upon instanciation), _configuration (it isn't duplicated anymore), clock, _entry_time and _idle_time (to support after and idle during code evaluation), both event queues (_internal_queue and _external_queue) and the _context of _evaluator (that can be provided as an initial context of the Interpreter). I hope I haven't forgotten anything ;-)

In case you want to change the statechart (adding or removing states, changing their names, etc.), you need to update the configuration AND the memory (if history states are used) AND entry/after_time (if you rely on after or idle at some point).

AlexandreDecan avatar Jan 14 '19 20:01 AlexandreDecan

@AlexandreDecan that was quick. My tests pass with the no-context-provider branch.

sbidoul avatar Jan 14 '19 21:01 sbidoul

The changes are included in Sismic 1.4.1. It will be available on PyPI in a few minutes.

AlexandreDecan avatar Jan 15 '19 08:01 AlexandreDecan

@AlexandreDecan thanks a bunch!

Regarding a future save/load API for the interpreter configuration, I have the same inventory as yours.

Here are my thoughts on a potential future load/save API:

  • save_configuration/load_configuration methods on Interpreter
  • save_configuration would export a dictionary with documented keys configuration, memory, clock, entry_time, idle_time, that's easy to serialize as json or yaml
  • IMO providing statechart and context at Interpreter instantiation is fine
  • regarding the queues, I'm not too sure: my intuition is that save_configuration happen in practice when queues have been consumed completely

sbidoul avatar Jan 15 '19 10:01 sbidoul

I think queues are to be saved as well, because Sismic supports "future events", so we can have events in the queue that are not yet processed because the current time is too low.

AlexandreDecan avatar Jan 15 '19 10:01 AlexandreDecan

A potential issue is that events can contain any python object, so they may not be serializable reliably.

save_configuration() could have parameters to select what to (not) save.

sbidoul avatar Jan 15 '19 13:01 sbidoul

The same holds for the execution context: it can contain any python object, even the ones that are not serializable. Pickling should be applied for them in that case, but pickling is not a perfect solution (especially if external objects are linked to the context). I think we shouldn't care about "how to save things", but rather about "what to save". We could have a save method that return a ExecutionState object, containing all the informations that are required to re-create an interpreter having the exact same state (assuming that no side effect occurred inbetween). Then, it should be up to the user to decide how to save this ExecutionState object (pickling, serialization, etc.).

AlexandreDecan avatar Jan 15 '19 14:01 AlexandreDecan

@abidoul Should we keep this issue opened? I don't have any short-term solution to propose...

AlexandreDecan avatar Jul 10 '19 10:07 AlexandreDecan

@AlexandreDecan Since you kindly restored the previous behavior of _configuration back in January, it works for me. I'm just uncomfortable using an undocumented data structure for persistence.

Should we keep this issue opened

If you think persisting/restoring state is a valid use case the issue could stay open as a feature request. But it's up to you :)

sbidoul avatar Jul 11 '19 10:07 sbidoul

Let's keep this issue open then ;) It's not on my priority list (notably because I don't know exactly how I could handle most use cases) but as we would like to support dynamic refactoring of statecharts, we will need (at some point) to persist/restore execution states (so that the refactoring could be applied on a statechart and all execution states related to its interpreters).

AlexandreDecan avatar Jul 11 '19 10:07 AlexandreDecan

hi @AlexandreDecan - sorry to ping an existing issue, but wanted to check something. Is it correct to say that if we have a very basic statechart (e.g. no timing, no contracts, no history states, etc.), it is sufficient to restore Interpreter._configuration? This seems to be backed up by my testing, but wanted to get some confirmation before marching ahead!

viggyfresh avatar Oct 03 '19 18:10 viggyfresh

Hello @viggyfresh,

It depends on what you mean by "very basic statechart" :-) At least, I would suggest to save/restore _configuration and _initialized (the latter is required to distinguish between an empty configuration because the statechart is not yet initialized, and an empty configuration because the statechart is in a final configuration). Also, _context could be required if you have internal variables in your statechart.

For other scenarios :

  • If you make use of time-related predicates, then clock, _time, _entry_time, _idle_time (if idle() is used).
  • If there are pending events in the queue, both _internal_queue and _external_queue are required.
  • If there are history states, _memory has to be saved.
  • If there are contracts, and if they were disabled, then _ignore_contracts need to be set as well.
  • If a different code evaluator is used, then _evaluator.
  • Finally, if there are property statecharts or other bound objects, _listeners should be persisted.

Notice that pickle should do the job (and possibly shelve/marshal as well) in most cases.

Can I ask you in which context(s) you make use of SIsmic and why do you require persistence? That's definitely something I would like to include in Sismic, but currently, there are too many possible use cases to cover all of them :-/

AlexandreDecan avatar Oct 03 '19 19:10 AlexandreDecan

Thanks for the speedy response! Our use case is pretty simple - we've got a statechart that follows a user's progress through various flows in our system. Our state persistence mechanism (so far) is just a Postgres table with an Enum column (where the Enum corresponds 1-to-1 with the states in the statechart). We're reluctant to pickle because if our flows change, then migrating becomes much more difficult.

I confirmed that my theory above works - I rehydrate _context via initial_context and force the state and initialization via _configuration and _initialized, as you suggested!

Thanks for the wonderful library - I look forward to tracking the answer to the persistence question 😄

viggyfresh avatar Oct 03 '19 22:10 viggyfresh

Thanks for your answer! This use case looks pretty similar to sbidoul's one ;)

AlexandreDecan avatar Oct 04 '19 07:10 AlexandreDecan