
Fate Flow: failed to connect to all addresses

Open silvanabc opened this issue 5 years ago • 3 comments

When I tried to start FATE Flow, I got the following error:

Traceback (most recent call last):
  File "/data/projects/fate/eggroll/python/eggroll/core/client.py", line 71, in sync_send
    response = _command_stub.call(request.to_proto())
  File "/data/projects/fate/common/python/venv/lib/python3.6/site-packages/grpc/_channel.py", line 565, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/data/projects/fate/common/python/venv/lib/python3.6/site-packages/grpc/_channel.py", line 467, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1592207686.863929759","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3528,"referenced_errors":[{"created":"@1592207669.027053760","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":399,"grpc_status":14}]}"
>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/projects/fate/python/fate_flow/fate_flow_server.py", line 94, in <module>
    session_utils.init_session_for_flow_server()
  File "/data/projects/fate/python/fate_flow/utils/session_utils.py", line 60, in init_session_for_flow_server
    options={"eggroll.session.processors.per.node": 1})
  File "/data/projects/fate/python/arch/api/session.py", line 112, in init
    RuntimeInstance.SESSION = builder.build_session()
  File "/data/projects/fate/python/arch/api/impl/based_2x/build.py", line 38, in build_session
    persistent_engine=self._persistent_engine, options=self._options)
  File "/data/projects/fate/python/arch/api/impl/based_2x/session.py", line 45, in build_session
    eggroll_session = build_eggroll_session(work_mode=work_mode, job_id=job_id, options=options)
  File "/data/projects/fate/python/arch/api/impl/based_2x/session.py", line 36, in build_eggroll_session
    return session_init(session_id=job_id, options=options)
  File "/data/projects/fate/eggroll/python/eggroll/core/session.py", line 32, in session_init
    er_session = ErSession(session_id=session_id, options=options)
  File "/data/projects/fate/eggroll/python/eggroll/core/session.py", line 113, in __init__
    self.__session_meta = self._cluster_manager_client.get_or_create_session(session_meta)
  File "/data/projects/fate/eggroll/python/eggroll/core/client.py", line 176, in get_or_create_session
    serdes_type=self.__serdes_type))
  File "/data/projects/fate/eggroll/python/eggroll/core/client.py", line 226, in __do_sync_request_internal
    serdes_type=serdes_type)
  File "/data/projects/fate/eggroll/python/eggroll/core/client.py", line 54, in simple_sync_send
    results = self.sync_send(inputs=[input], output_types=[output_type], endpoint=endpoint, command_uri=command_uri, serdes_type=serdes_type)
  File "/data/projects/fate/eggroll/python/eggroll/core/client.py", line 84, in sync_send
    raise CommandCallError(command_uri, endpoint, e)
eggroll.core.client.CommandCallError: ('Failed to call command: CommandURI(_uri=v1/cluster-manager/session/getOrCreateSession) to endpoint: <ip>:4670, caused by: ', <_Rendezvous of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1592207686.863929759","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3528,"referenced_errors":[{"created":"@1592207669.027053760","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":399,"grpc_status":14}]}"

Can anyone please help?

silvanabc avatar Jun 15 '20 09:06 silvanabc

I ran into the same problem when binding a model. Have you solved it? Thanks.

meijuanwang avatar Aug 21 '20 06:08 meijuanwang

How did you solve that? Thanks.

cnchenzz avatar Nov 20 '23 08:11 cnchenzz

Has this problem been solved? Any advice would be appreciated; I have been stuck on this same problem for a long time.

jiejielu-0309 avatar Jun 19 '24 10:06 jiejielu-0309

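"Has this problem been solved? Any advice would be appreciated; I have been stuck on this same problem for a long time."

Can you provide more detailed information? This error usually means that eggroll has not been deployed or is not configured correctly.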

sagewe avatar Jul 10 '24 07:07 sagewe
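
As a first check along those lines, the traceback shows the client trying to reach the eggroll cluster manager at `<ip>:4670`, and `StatusCode.UNAVAILABLE` with "failed to connect to all addresses" means no connection could be established to that address. A minimal sketch, assuming only the Python standard library, that tests whether anything is accepting TCP connections on that endpoint. The helper name `can_connect` and the host value are hypothetical; use the host and port from your own deployment.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it is not part of FATE or eggroll.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumed values: replace the host with the address your FATE Flow
    # configuration points at. 4670 is the port shown in the traceback above.
    host = "127.0.0.1"
    port = 4670
    if can_connect(host, port):
        print(f"{host}:{port} is accepting TCP connections")
    else:
        print(f"{host}:{port} is not reachable; check that eggroll is running "
              "and that the configured address and port are correct")
```

If this reports that the endpoint is not reachable, the problem lies below FATE Flow itself (the service is not started, it listens on a different address or port, or a firewall blocks it). If the port is reachable and the error persists, the configuration on the FATE Flow side is the next thing to compare against the eggroll deployment.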