erlangpl Thoughts and problems with connecting to nodes after erlangpl starts

Open arkgil opened this issue 7 years ago • 2 comments

As a follow-up to #11, here are a couple of things that in my opinion need to be considered:

1. The epl_tracer process needs to trap exits in order to handle exit messages from the remote tracer process (there is no spawn_monitor/4 API 😢).
2. The node name cannot be put into epl's config ETS table at startup; instead it should be part of epl_tracer's state and be retrieved by calling that process.
3. When epl_tracer sees that the remote process has exited, it should notify all subscribers and exit normally.
4. I don't know whether it is possible to query epmd for full node names. I know we can retrieve short node names via epmd's command-line interface, but maybe there is some internal API for doing that.
5. We probably need some "core" WebSocket handler, which won't be part of any plugin but will be used solely for selecting a node to connect to, disconnecting from a node, listing nodes, etc.
6. And of course there is a problem with the epl:command/2 API. If epl_tracer is currently dead, the calling process will crash with noproc, and if it is a WebSocket handler, the WebSocket connection will be closed. One way to solve this is to introduce a "proxy" process that is always available, monitors epl_tracer, and relays all command requests. The other solution is to Let It Crash™.
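To make the first three points concrete, here is a minimal sketch of what such a tracer process could look like. It is not the actual epl_tracer code: the module name `epl_tracer_sketch`, the `node_name/0` and `subscribe/1` functions, and the `{epl_tracer, {node_down, Node, Reason}}` message are all made up for illustration, and the remote tracer is stood in for by a process that just sleeps.

```erlang
-module(epl_tracer_sketch).
-behaviour(gen_server).

%% Hypothetical API, for illustration only.
-export([start_link/1, node_name/0, subscribe/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-record(state, {node, remote_pid, subscribers = []}).

start_link(Node) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, Node, []).

%% The node name lives in the process state, not in an ETS table.
node_name() ->
    gen_server:call(?MODULE, node_name).

subscribe(Pid) ->
    gen_server:call(?MODULE, {subscribe, Pid}).

init(Node) ->
    %% Trap exits so that the death of the linked remote process
    %% arrives as an {'EXIT', Pid, Reason} message.
    process_flag(trap_exit, true),
    %% Placeholder for the real remote tracer: a process that sleeps forever.
    RemotePid = spawn_link(Node, timer, sleep, [infinity]),
    {ok, #state{node = Node, remote_pid = RemotePid}}.

handle_call(node_name, _From, State) ->
    {reply, State#state.node, State};
handle_call({subscribe, Pid}, _From, State = #state{subscribers = Subs}) ->
    {reply, ok, State#state{subscribers = [Pid | Subs]}}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% The remote process died (or the node went down): tell every
%% subscriber, then stop with reason 'normal'.
handle_info({'EXIT', Pid, Reason}, State = #state{remote_pid = Pid}) ->
    Msg = {epl_tracer, {node_down, State#state.node, Reason}},
    lists:foreach(fun(Sub) -> Sub ! Msg end, State#state.subscribers),
    {stop, normal, State};
handle_info(_Info, State) ->
    {noreply, State}.
```

Because the process is registered under a single local name, this sketch only covers the one-node case. If the target node is unreachable, `spawn_link/4` still returns a pid and the exit message arrives with reason `noconnection`, so the same `handle_info/2` clause handles it.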

Mar 06 '17 22:03 arkgil

@michalslaski I'll probably need some help with those if we are to develop this feature.

Mar 07 '17 08:03 arkgil

@arkgil

1. Let's try to rely on the supervisor's restart handling for abnormal exits from the remote tracer process. Currently the epl_sup supervisor has 'permanent' children. If we change them to the 'transient' restart type and then handle the restart in epl_tracer:init/1, I think we should be good.
2. The erlangpl script has to be started with the node name as an argument, but we also need to consider multi-node clusters. Maybe spawning one epl_tracer per node in the cluster is the way to go. In that case we will add an epl_tracer_sup supervisor and let it spawn one epl_tracer worker per node.
3. I agree. Such a notification can be added to epl_tracer:init/1, assuming it is no longer 'permanent' but 'transient' instead.
4. Nodes in the cluster will sometimes not be on the local machine, so I don't think epmd is of any use to us.
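A rough sketch of the epl_tracer_sup idea, assuming a `simple_one_for_one` supervisor with `transient` children. The module name `epl_tracer_sup_sketch`, the `start_tracer/1` function, and the placeholder worker are hypothetical; in the real application the child would be epl_tracer itself.

```erlang
-module(epl_tracer_sup_sketch).
-behaviour(supervisor).

-export([start_link/0, start_tracer/1]).
-export([init/1]).
%% Placeholder worker, standing in for epl_tracer.
-export([tracer_start_link/1, tracer_init/2]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Start one tracer worker for the given node.
start_tracer(Node) ->
    supervisor:start_child(?MODULE, [Node]).

init([]) ->
    SupFlags = #{strategy => simple_one_for_one,
                 intensity => 5,
                 period => 10},
    %% 'transient': restart only after an abnormal exit. A worker that
    %% exits with reason 'normal' is not restarted.
    ChildSpec = #{id => epl_tracer,
                  start => {?MODULE, tracer_start_link, []},
                  restart => transient,
                  shutdown => 5000,
                  type => worker},
    {ok, {SupFlags, [ChildSpec]}}.

tracer_start_link(Node) ->
    proc_lib:start_link(?MODULE, tracer_init, [self(), Node]).

tracer_init(Parent, Node) ->
    proc_lib:init_ack(Parent, {ok, self()}),
    tracer_loop(Node).

tracer_loop(Node) ->
    receive
        stop  -> ok;               %% returns, so the process exits normally
        crash -> exit(crashed);    %% abnormal exit, triggers a restart
        _     -> tracer_loop(Node)
    end.
```

With this setup, a tracer that stops normally after its remote node goes away simply disappears from the supervisor, while a crashed tracer is restarted with the same node name as its start argument.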

Mar 09 '17 14:03 michalslaski