
self.createActor() with globalName from inside an ActorTypeDispatcher increases size of self

Open gerowam opened this issue 2 years ago • 1 comments

Environment: Linux ikeh2 5.13.0-27-generic #29~20.04.1-Ubuntu SMP Fri Jan 14 00:32:30 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux; Python 3.8.10; thespian==3.10.6

Calling self.createActor(..., globalName='some_named_actor') from within an ActorTypeDispatcher correctly returns the address of some_named_actor if it exists, but every call increases the size of self. Specifically, it looks like _myRef._addrManager._managed gets a new thespian.actors.ActorAddress on every call, even when some_named_actor already exists.

Running the attached example:

~/thespian_leaktest/$ python3 example.py
Wakeup #1; asize(self) = 22712 bytes
Wakeup #2; asize(self) = 24304 bytes
Wakeup #3; asize(self) = 25168 bytes
Wakeup #4; asize(self) = 26096 bytes
[...]
Wakeup #19; asize(self) = 39248 bytes

Also attached is similar output that includes a recursive memory inspection of the ActorTypeDispatcher instance after each wakeup; each wakeup calls createActor() with the same globalName. Note that the memory size information requires the Pympler package.

Is this behaviour expected? If so, is there a way to get the address of a named actor from within an ActorTypeDispatcher instance without calling createActor()? Or is there a more suitable pattern for giving other, less persistent actors access to a globally referenceable actor? My concern with simply storing the address of the global actor is that if it crashes (and is revived by some supervisory process), the stored address becomes invalid and the holder has no way of knowing.

leaktest.tar.gz

gerowam avatar Jan 31 '22 13:01 gerowam

Hi @gerowam, thanks for the report and the corresponding analysis. I can confirm that I see just over 800 bytes of consumption for each createActor() call with a globalName specified.

The increased memory consumption is a necessity of the distributed nature of the Actor environment. In line with your concern about storing the global address locally, the Thespian implementation must check with the Admin each time a "new" address is requested, and there must be local storage associated with this new address; any attempt to cache that information locally would suffer from the same out-of-date issue you noted.

Probably a better way to address this situation is to have an Actor that provides management capabilities for other Actors. In the following discussion, "Maker" refers to this managing actor, "Global" to the corresponding global actor, and "Client" to an actor wishing to communicate with the global actor. In this configuration, the Thespian globalName is not necessarily needed, and the system operates as described below:

  • Client needs to communicate with Global, so Client sends a (user-defined) "GetGlobalAddress" message to Maker. Maker itself may have a globalName that identifies it: because the functionality Maker provides is very limited, it is fairly safe and not expected to crash or otherwise fail, so the Client can issue a single createActor(Maker, globalName = "maker") call and save the resulting address on self.
  • When Maker receives the "GetGlobalAddress", it checks an internal dictionary (on self) to see if it has an address for Global. Initially, it does not, so it will then issue a createActor(Global) call (here no globalName is needed). It then adds the address obtained from that createActor to the internal dictionary and then returns the address to the Client in the response to the "GetGlobalAddress" message.
  • The Maker also records in another self dictionary the address of each Client that it gave the global address to.
  • Subsequent Client "GetGlobalAddress" calls can be handled directly by the Maker from the main self dictionary.
  • If the Global Actor crashes or otherwise exits, the Maker will handle the ChildActorExited message by removing the self dictionary entry containing the address of Global. It may also send a (user-defined) "GlobalAddressNoLongerValid" message to the Client actors registered in the secondary self dictionary. This step is optional but can make recovery faster, since it tells the Client actors to flush any local copies of the Global actor address they are maintaining.
  • The next time a Client issues a "GetGlobalAddress" to the Maker, the self dictionary will not contain an address for Global, so it will re-perform the createActor process for Global to get a new instance.
  • If the Client Actors have cached the Global actor address, and the "GlobalAddressNoLongerValid" is not used, the clients will eventually receive a PoisonMessage on their attempts to send to the (no longer running) Global. At that point they should discard their local address for Global and re-issue a "GetGlobalAddress" to Maker.

In the above scenario, Maker is a very simple implementation, but since Maker also acts as the parent actor for Global (or any number of Global actors), it will actively know when Global exits and another version needs to be created. The Client actors require a slightly longer protocol to get the address of Global from Maker, but they don't need any additional complexity. In this scenario, the growth of memory needed for actor address management should be limited to only those cases where the Global actor exits and needs to be re-created.

kquick avatar Feb 07 '22 07:02 kquick