FATE
BaseLine model issue: num row of input(XXX) not equals to num row of output(0)
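Describe the bug During training I ran into another strange Python error.
Expected behavior The error occurs in the LocalBaseline step: "OSError: num row of input(39942) not equals to num row of output(0)". Here 39942 is the number of rows in the input training set.
The previous step in the DAG is HeteroFeatureBinning, and its output data looks normal.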
The error output is:
(venv) [root@vm_0_2_centos local_lr_0]# cat ERROR.log
[ERROR] [2023-07-07 13:49:00,655] [202307071342154512590] [13008:139763075864384] - [task_executor.run] [line:243]: num row of input(39942) not equals to num row of output(0)
Traceback (most recent call last):
  File "/data/projects/fate/fateflow/python/fate_flow/worker/task_executor.py", line 195, in run
    cpn_output = run_object.run(cpn_input)
  File "/data/projects/fate/fate/python/federatedml/model_base.py", line 236, in run
    self._run(cpn_input=cpn_input)
  File "/data/projects/fate/fate/python/federatedml/model_base.py", line 314, in _run
    this_data_output = func(*params)
  File "/data/projects/fate/fate/python/federatedml/util/io_check.py", line 38, in _func
    f"num row of input({input_count}) not equals to num row of output({output_count})")
OSError: num row of input(39942) not equals to num row of output(0)
Desktop (please complete the following information):
- OS: CentOS 7
- Version: Ansible 1.8.1
Screenshot below:
The log indicates that the code called the assert_io_num_rows_equal function in this script: https://github.com/FederatedAI/FATE/blob/master/python/federatedml/util/io_check.py
What exactly is this comparing?
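Judging from the traceback alone, the check appears to be a decorator that counts the rows of the table passed into a component function and compares that with the row count of the table it returns. The following is a minimal, self-contained sketch of that idea, not the FATE source: `FakeTable`, `is_table`, `check_io_num_rows_equal`, `broken_predict`, `ok_predict`, and `error_message` are hypothetical names made up for illustration, and the real function may differ in detail.

```python
import functools


class FakeTable:
    """Hypothetical stand-in for a distributed table; only count() is modeled."""

    def __init__(self, rows):
        self._rows = list(rows)

    def count(self):
        return len(self._rows)


def is_table(obj):
    # Assumption: in this sketch, only FakeTable instances count as tables.
    return isinstance(obj, FakeTable)


def check_io_num_rows_equal(func):
    """Raise OSError if the first table argument and the table result differ in row count."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        input_count = None
        # Use the first table-like argument as the reference input.
        for arg in list(args) + list(kwargs.values()):
            if is_table(arg):
                input_count = arg.count()
                break
        result = func(*args, **kwargs)
        # Only compare when both an input table and an output table exist.
        if input_count is not None and is_table(result):
            output_count = result.count()
            if input_count != output_count:
                raise OSError(
                    f"num row of input({input_count}) not equals to "
                    f"num row of output({output_count})"
                )
        return result

    return wrapper


@check_io_num_rows_equal
def broken_predict(data):
    # Simulates a predict step that silently returns an empty table.
    return FakeTable([])


@check_io_num_rows_equal
def ok_predict(data):
    # Simulates a predict step that returns one score per input row.
    return FakeTable([0.5] * data.count())


def error_message(func, table):
    """Return the OSError message raised by func(table), or None if it succeeds."""
    try:
        func(table)
    except OSError as exc:
        return str(exc)
    return None
```

Under these assumptions, `error_message(broken_predict, FakeTable(range(39942)))` reproduces the message from ERROR.log. That would be consistent with the component returning an empty output table rather than with a problem in the input data.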
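My guess is that something went wrong at scoring time after the model finished training, so y_predict came back completely empty. However, I don't know where to find the underlying Python log; the logs in the logs folder are too high-level.
With the same data, boosting ran successfully. For LR I forgot to tune the batch size, so it got somewhat stuck. Only the local baseline fails, and I don't know why. The preceding steps are missing-value imputation and standardization; the federated LR and SBT components in the parallel branches both run through.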
+1