Caster icon indicating copy to clipboard operation
Caster copied to clipboard

Discussion on managing Global State in Caster.

Open LexiconCode opened this issue 5 years ago • 8 comments

Before we talk about managing global state it is best to understand that grammar versus module reload ability. Currently grammar reload ability is being implemented by synkarius. While this does handle reloading grammars it does not handle reloading modules. With grammar reloading, only the module that the grammar belongs to will be reloaded. There is no consistent way to check for dependent modules and reload those too.

For instance, the Java grammar depends on standard.py, changes to java.py will be detected and it will reload fine, but changes to standard.py will not be detected and will not cause java.py to reload. Summing it up, if standard.py was edited, then anything that relies standard.py as a module will not utilize the updated standard grammars.

This has a profound impact on Rule Design.

As it stands as a good rule design means:

  1. Either a rule has no "must-reload-with" dependencies outside of its own module or
  2. Reloading a rule's dependencies still has to happen via rebooting Dragon.

To overcome this limitation directly leads into global state in Caster. If we want to implement in the future module reload ability we will need to manage global state.

So understanding how and when to use Nexus and techniques for best practice managing global state to achieve that goal is a question we need to address. @synkarius mentioned he could lay out present/post-ccrmerger-rewrite of Nexus and how it relates to this issue.

LexiconCode avatar Jul 13 '19 20:07 LexiconCode

Yup. Will do. I'll see if I can get that all laid out here by tomorrow.

synkarius avatar Jul 13 '19 22:07 synkarius

So to frame this response, let's clarify the question.

"What is the nexus and how is it supposed to be used?"

The nexus is the glue code for all the other bits of Caster. It is itself a singleton and holds a bunch of other things which are also effectively singletons. It is also a single point of access to those things. It was originally created to deal with the circular dependency issues which had been cropping up at the time.

It is also both presently (and will be post-rewrite) a good place to hold application state. In my opinion, you want to have state in one place in your application, not scattered throughout random modules. (It's also worth saying, I'm not pointing any fingers here at anyone but myself.) State is a really annoying thing to manage, so you don't want to be hunting it down everywhere.

For you non-programmers, "state" is the data which gets held in memory or written to disk after your application boots and while you're using it. In Caster's case, that's things like which rules are active at a given time, what's on the clipboard, which format is the default text format, and shortly, what the latest loaded version of a rule is. When we refer to something being "stateless", that means that it is composed of only functions which are guaranteed to give you the same outputs every time you give them the same inputs: there is no tracking of what has happened prior.

There are very few types of programs which let you do totally without state. (For you programmers again, sure, your application may be stateless, but what about those config files and the database you're connecting to?) Caster has and needs state to operate correctly, but the nexus is where that state should be. (At least unless someone wants to hook it up to Redux or SQLite -- any takers? 😆)

Relating this all back to grammar reloading: during the development of this rule reloading feature, it came up that the dependencies of a reloaded module will not be reloaded with it or watched for changes like the rule modules will. Having state scattered around Caster makes approaching any more comprehensive solutions than this simple "reload the rule module only" solution difficult. So for instance, it would be possible to reload either all of Caster except the nexus or selective parts of Caster excluding the nexus when a file change was detected -- if we could know that some important bit of application state wasn't getting thrown away in so doing.

Concluding, the nexus is (A) Caster's glue code and the place to be doing dependency injection and (B) where global state should be managed.

synkarius avatar Jul 15 '19 04:07 synkarius

I agree that it is necessary to maintain state, but I do not think a single big object is the way to do it. I do not think necessarily that "singletons are evil", like some people apparently do, but I think the nexus object does not offer enough benefits (easier avoiding circular dependencies? single point of access) over drawbacks, like increased inter-coupledness, almost impossibility to unit test and worse maintainability. See for example https://softwareengineering.stackexchange.com/questions/148108/why-is-global-state-so-evil#comment774111_148110. A thing that makes this even more problematic is that Nexus initializes itself as a side-effect importing, like many modules in Caster do, which is another thing not to be liked. That it is explicitly global, I don't see the point, but it is a detail.

comodoro avatar Jul 15 '19 13:07 comodoro

The rule modules will not need to import the nexus after this rewrite. They will be loaded dynamically via importlib, including Caster "starter" rules, the same way that the user directory rules are loaded now. So, after this rewrite, the rule modules will not be coupled with the nexus.

The nexus initializing itself was a mistake. It does not do that anymore after the rewrite.

As for where to hold state, what I expressed above is just my opinion. I am no longer as active in this community as I once was. At one point, I was kind of a BDFL. Now, I consider myself just a contributor, subject to the direction provided by the community. So if it is decided that one big stateful singleton is not the desired design, I am all for whatever alternative is reached, even including the status quo.

synkarius avatar Jul 15 '19 14:07 synkarius

I guess we have to wait and see then.

Well, based on what you have created, you at least "first among equals":) I wish I could be more definite, but I only have a strong feeling backed by what I read in various discussions.

comodoro avatar Jul 15 '19 15:07 comodoro

FWIW, I don't know the details of Caster, but I think there's something to be said for having a single, central place for keeping state. In addition to making it easier to find the state in the code, it can allow you to define an interface for other modules (internal or external) to store/access their state. This should make it possible to write complex modules while still supporting a reloading framework.

daanzu avatar Jul 17 '19 13:07 daanzu

This issue clarified https://github.com/dictation-toolbox/Caster/issues/385 some for me.

For modules dependency reloads would we use something like https://pythonhosted.org/watchdog/ to watch for changes on disk?

@synkarius can you describe more how you're thinking we would use the state to determine which modules to reload? Would the state hold information on which files depend on each other?

As far as global state, could Nexus focus on state related to rule management and other state could be stored separate? For example Nexus.comm and Nexus.clip probably won't ever relate to how we manage rules loading/reloading. I'm not sure Nexus.clip is actually used, but Nexus.comm could be managed by the Communicator class itself using this = sys.modules[__name__] and storing the state within this. Also, the state related to recognition such as Nexus.history could be split out separately from the state that manages rule loading/reloading.

lexxish avatar Jul 31 '19 17:07 lexxish

@lexxish Watchdog looks pretty nice. What I've put in is much more basic. I'm just keeping track of a list of files, hashing each one the first time it's read, and then on each scan (be that via a timer or a voice command), I'm just hashing the file again, and if the hash is different, I signal to the other new module management stuff that it changed and needs to be attempted to be reloaded.

As for how the future Nexus might determine which modules would need reloading, TBH, I haven't given much thought to it. Off the top of my head, maybe there is (A) an existing library which does this, (B) some built in functionality in Python, or (C) we just do something like reading imports out of ALL Caster files and then reload the ones which are from Caster. Option C gets pretty messy pretty quickly, especially if you're trying to do it in a way which prevents crashes, so preferably that wouldn't be it.

Your comments on narrowing the focus of the Nexus are insightful. In the rewrite, I have introduced another Nexus-like object, the "GrammarManager", which does exactly what you suggest: manage state related to rule management.

As for the other stuff you mentioned:

  • I agree that Nexus.comm could be moved out of the Nexus pretty easily.
  • Nexus.clip is used for the multi-clipboard functionality. It doesn't need to have its home in the Nexus either.
  • I have already moved Nexus.history to a more suitably localized location, in the rewrite branch. (If you're curious, you can look at the "ccrmerger2" branch of my fork.)

synkarius avatar Aug 01 '19 01:08 synkarius