FATE
hetero_sshe_lr fails at runtime
Running the hetero_sshe_lr component fails with an error. The computing engine is eggroll. Details are below:
eggroll error messages
egg_pair_bootstrap.sh: unrecognized option '--python-venv'
strace: attach: ptrace(PTRACE_ATTACH, 26592): Operation not permitted
fateflow schedule error messages
[ERROR] [2022-07-28 18:32:59,632] [202207281748184866520] [19823:140358513362752] - [_session.get_session_from_record] [line:397]: ('Failed to call command: CommandURI(_uri=v1/egg-pair/runTask) to endpoint: nodemanager:36203, caused by: ', <_Rendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses"
debug_error_string = "{"created":"@1659004379.508005998","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3876,"referenced_errors":[{"created":"@1659004379.507975323","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":395,"grpc_status":14}]}"
>)
fateflow error messages
[ERROR] [2022-07-28 18:32:30,224] [202207281748184866520] [9359:140566588012352] - [task_executor._run_] [line:243]: ('Failed to call command: CommandURI(_uri=v1/egg-pair/runTask) to endpoint: nodemanager:36203, caused by: ', <_Rendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Socket closed"
debug_error_string = "{"created":"@1659004330.266419735","description":"Error received from peer ipv4:192.167.0.3:36203","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"Socket closed","grpc_status":14}"
>)
Traceback (most recent call last):
File "/data/projects/fate/eggroll/python/eggroll/core/client.py", line 84, in sync_send
response = _command_stub.call(request.to_proto())
File "/opt/app-root/lib/python3.6/site-packages/grpc/_channel.py", line 604, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "/opt/app-root/lib/python3.6/site-packages/grpc/_channel.py", line 506, in _end_unary_response_blocking
raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Socket closed"
debug_error_string = "{"created":"@1659004330.266419735","description":"Error received from peer ipv4:192.167.0.3:36203","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"Socket closed","grpc_status":14}"
>
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/projects/fate/fateflow/python/fate_flow/worker/task_executor.py", line 195, in _run_
cpn_output = run_object.run(cpn_input)
File "/data/projects/fate/fate/python/federatedml/model_base.py", line 236, in run
self._run(cpn_input=cpn_input)
File "/data/projects/fate/fate/python/federatedml/model_base.py", line 314, in _run
this_data_output = func(*params)
File "/data/projects/fate/fate/python/federatedml/linear_model/bilateral_linear_model/hetero_sshe_logistic_regression/hetero_lr_guest.py", line 244, in fit
self.fit_binary(data_instances, validate_data)
File "/data/projects/fate/fate/python/federatedml/linear_model/bilateral_linear_model/hetero_sshe_logistic_regression/hetero_lr_guest.py", line 251, in fit_binary
self.fit_single_model(data_instances, validate_data)
File "/data/projects/fate/fate/python/federatedml/linear_model/bilateral_linear_model/hetero_sshe_linear_model.py", line 267, in fit_single_model
batch_labels = batch_data.mapValues(lambda x: np.array([x.label], dtype=self.label_type))
File "/data/projects/fate/fate/python/fate_arch/common/profile.py", line 318, in _fn
rtn = func(*args, **kwargs)
File "/data/projects/fate/fate/python/fate_arch/computing/eggroll/_table.py", line 91, in mapValues
return Table(self._rp.map_values(func))
File "/data/projects/fate/eggroll/python/eggroll/core/aspects.py", line 30, in wrapper
result = func(*args, **kwargs)
File "/data/projects/fate/eggroll/python/eggroll/roll_pair/roll_pair.py", line 798, in map_values
task_results = self._run_job(job=job)
File "/data/projects/fate/eggroll/python/eggroll/roll_pair/roll_pair.py", line 475, in _run_job
results.append(future.result())
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/opt/rh/rh-python36/root/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/data/projects/fate/eggroll/python/eggroll/core/datastructure/threadpool.py", line 51, in run
result = self.fn(*self.args, **self.kwargs)
File "/data/projects/fate/eggroll/python/eggroll/core/client.py", line 97, in sync_send
raise CommandCallError(command_uri, endpoint, e)
eggroll.core.client.CommandCallError: ('Failed to call command: CommandURI(_uri=v1/egg-pair/runTask) to endpoint: nodemanager:36203, caused by: ', <_Rendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Socket closed"
debug_error_string = "{"created":"@1659004330.266419735","description":"Error received from peer ipv4:192.167.0.3:36203","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"Socket closed","grpc_status":14}"
>)
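One check that may help narrow this down is whether the endpoint named in the logs (nodemanager:36203) still accepts TCP connections from the machine or container where fateflow runs. Below is a minimal sketch using only the Python standard library. The helper name `can_connect` is hypothetical and not part of FATE or eggroll, and the host and port are simply copied from the log lines above, so they may differ between sessions.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds.

    Hypothetical helper for illustration only; it is not a FATE or
    eggroll API. It checks TCP reachability, not gRPC health.
    """
    try:
        # create_connection resolves the hostname and opens the socket;
        # the context manager closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, calling `can_connect("nodemanager", 36203)` from the fateflow side shows whether anything is listening on that port at that moment. A False result would be consistent with the "failed to connect to all addresses" message, but it does not by itself explain why the egg_pair process stopped.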