Mix error and then distribution disconnect
Cloned and then opened the Kaffe library with Expert.
For some reason it's warning about redefining the Mixfile module.
I also noticed below that it's talking about engine nodes with different names (implying multiple engine nodes):
15:22:45.489 [info] Engine build available at: /Users/m.hanberg/Library/Application Support/Expert/0.1.0/elixir-1.18.4-erts-16.1.2/0d51bdf02df5dc05939d63e7a1b05646/_build/dev_ns
15:22:45.526 [debug] sent notification server -> client $/progress
15:22:46.475 [debug] Node port message: ok
15:22:46.606 [debug] Node port message:
15:22:46.598 [notice] Application mix exited: :stopped
15:22:46.620 [debug] Node port message: warning: redefining module Kaffe.Mixfile (current version defined in memory)
  │
1 │ defmodule Kaffe.Mixfile do
  │ ~~~~~~~~~~~~~~~~~~~~~~~~~~
  │
  └─ /Users/m.hanberg/src/other/kaffe/mix.exs:1: Kaffe.Mixfile (module)
15:22:47.824 [info] Child of Supervisor :inet_gethost_native_sup started
Pid: #PID<0.201.0>
Start Call: :inet_gethost_native.init([])
15:22:47.824 [debug] Child :inet_gethost_native_sup of Supervisor :kernel_safe_sup started
Pid: #PID<0.200.0>
Start Call: :inet_gethost_native.start_link()
Restart: :temporary
Shutdown: 1000
Type: :worker
15:22:47.823 [warning] 'global' at node :"[email protected]" requested disconnect from node :"[email protected]" in order to prevent overlapping partitions
15:22:47.831 [warning] 'global' at node :"[email protected]" disconnected node :"[email protected]" in order to prevent overlapping partitions
15:22:47.831 [debug] sent notification server -> client $/progress
15:22:47.831 [warning] 'global' at node :"[email protected]" disconnected node :"[email protected]" in order to prevent overlapping partitions
15:22:47.831 [error] Process #PID<0.174.0> terminating
** (exit) {:badrpc, :nodedown}
(stdlib 6.2.2.2) gen_server.erl:2210: :gen_server.init_it/6
(stdlib 6.2.2.2) proc_lib.erl:329: :proc_lib.init_p_do_apply/3
Initial Call: XPExpert.Project.Node.init/1
Ancestors: [:"kaffe::supervisor", XPExpert.ProjectSupervisor, XPExpert.Supervisor, #PID<0.158.0>]
Message Queue Length: 0
Messages: []
Links: [#PID<0.171.0>]
Dictionary: []
Trapping Exits: false
Status: :running
Heap Size: 2586
Stack Size: 29
Reductions: 25109
15:22:47.831 [error] Child {XPExpert.Project.Node, "kaffe"} of Supervisor :"kaffe::supervisor" failed to start
** (exit) {:badrpc, :nodedown}
Start Call: XPExpert.Project.Node.start_link(%XPForge.Project{root_uri: "file:///Users/m.hanberg/src/other/kaffe", mix_exs_uri: "file:///Users/m.hanberg/src/other/kaffe/mix.exs", mix_project?: true, mix_env: nil, mix_target: nil, env_variables: %{}, project_module: nil, entropy: 13854})
Restart: :permanent
Shutdown: 5000
Type: :worker
15:22:47.832 [error] Task #PID<0.170.0> started from #PID<0.169.0> terminating
** (MatchError) no match of right hand side value: {:error, {:shutdown, {:failed_to_start_child, {XPExpert.Project.Node, "kaffe"}, {:badrpc, :nodedown}}}}
(xp_expert 0.1.0) lib/expert/state.ex:59: anonymous fn/1 in XPExpert.State.initialize/2
(elixir 1.17.3) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
Function: #Function<3.104070401/0 in XPExpert.State.initialize/2>
Args: []
15:22:47.838 [error] Child :undefined of Supervisor :expert_task_queue terminated
** (exit) an exception was raised:
** (MatchError) no match of right hand side value: {:error, {:shutdown, {:failed_to_start_child, {XPExpert.Project.Node, "kaffe"}, {:badrpc, :nodedown}}}}
(xp_expert 0.1.0) lib/expert/state.ex:59: anonymous fn/1 in XPExpert.State.initialize/2
(elixir 1.17.3) lib/task/supervised.ex:101: Task.Supervised.invoke_mfa/2
Pid: #PID<0.170.0>
Start Call: Task.Supervised.start_link/?
Restart: :temporary
Shutdown: 5000
Type: :worker
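(For reference: {:badrpc, :nodedown} is the generic result :rpc.call/4 gives back whenever the target node is unreachable, so the exit reason above only says the engine node was gone, not why. A minimal illustration, with a made-up node name:)

```elixir
# :rpc.call/4 returns {:badrpc, :nodedown} when the remote node is not
# connected or not alive; the node name here is made up.
:rpc.call(:"nonexistent@127.0.0.1", String, :upcase, ["hi"])
#=> {:badrpc, :nodedown}
```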
I ran pkill beam to kill any lingering BEAM instances and then started again, and it worked.
I think maybe there was a zombie Expert instance still around and the clustering was getting confused? Not sure how that could happen.
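If it happens again, a quick way to check for zombies before reaching for pkill is to ask the local epmd daemon which node names are still registered; a small sketch, runnable from any iex session (only useful if epmd is actually involved, which per the discussion below it may not be):

```elixir
# Ask the local epmd daemon for its registered node names.
case :net_adm.names() do
  # e.g. {:ok, [{~c"expert_f9361c3c", 54321}]} would reveal a leftover node
  {:ok, names} -> names
  # epmd isn't running at all
  {:error, :address} -> :no_epmd
end
```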
@mhanberg I think the warning: redefining module Kaffe.Mixfile warning has been around since this was Lexical and doesn't cause issues, but the distribution issue is intriguing; the stacktrace isn't particularly informative.
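For context, that's the stock Elixir module-redefinition warning: it fires whenever a module that is already loaded gets compiled again (here, because mix.exs is evaluated more than once in the same VM), and the newer definition simply replaces the older one. A throwaway illustration:

```elixir
defmodule Demo do
  def hello, do: :world
end

# Compiling the same module a second time in one VM emits exactly the
# warning seen above; the second definition replaces the first.
defmodule Demo do
  def hello, do: :world
end
#=> warning: redefining module Demo (current version defined in memory)
```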
By any chance did you have epmd running? A while back, when I was working on the epmdless PR, I saw a similar issue (I can't remember if it was the same kind of trace), and it went away after I ran killall epmd; I had a bunch of leftover instances that were still connected and spawning epmd, and that broke distribution.
It is unintuitive because we're telling the VM not to use epmd, but that still happened.
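(For anyone following along: "telling the VM not to use epmd" means booting with -start_epmd false and pointing -epmd_module at a replacement that resolves distribution ports itself. I don't know Expert's actual module, but a minimal fixed-port sketch of the callbacks, with made-up names and port, looks roughly like this:)

```elixir
defmodule FixedEpmd do
  # Hypothetical epmd replacement for illustration only: it assumes every
  # node listens for distribution on one well-known port, so no epmd
  # daemon is needed to look ports up.
  @dist_port 4370

  # No registration process is needed when there is no daemon to talk to.
  def start_link, do: :ignore

  # Normally epmd hands out the "creation" number; without it we invent one.
  def register_node(name, port), do: register_node(name, port, :inet)
  def register_node(_name, _port, _family), do: {:ok, :rand.uniform(3)}

  # Tell our own distribution listener which port to bind (OTP 23+).
  def listen_port_please(_name, _host), do: {:ok, @dist_port}

  # Resolve a remote node's distribution port without asking any daemon;
  # 5 is the distribution protocol version.
  def port_please(_name, _host), do: {:port, @dist_port, 5}

  # Name listing is an epmd feature we can't provide without the daemon.
  def names(_host), do: {:error, :no_epmd}
end
```

The module has to be compiled and on the code path before distribution starts, something like iex --erl "-start_epmd false -epmd_module Elixir.FixedEpmd -pa ebin" --sname demo. The unintuitive part is that other BEAM processes on the same machine using stock distribution will still spawn epmd, and stale daemons can evidently still get in the way.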