
BaseLine model issue: num row of input(XXX) not equals to num row of output(0)

Open tigflanker opened this issue 2 years ago • 3 comments

Describe the bug: I ran into another strange Python problem during training.

Expected behavior: Specifically, the LocalBaseline step fails with this error: “OSError: num row of input(39942) not equals to num row of output(0)”. The 39942 here is the number of rows in the input training set.

The previous step in the DAG is HeteroFeatureBinning, and its output data looks normal.

The error output is:
(venv) [root@vm_0_2_centos local_lr_0]# cat ERROR.log
[ERROR] [2023-07-07 13:49:00,655] [202307071342154512590] [13008:139763075864384] - [task_executor.run] [line:243]: num row of input(39942) not equals to num row of output(0)
Traceback (most recent call last):
  File "/data/projects/fate/fateflow/python/fate_flow/worker/task_executor.py", line 195, in run
    cpn_output = run_object.run(cpn_input)
  File "/data/projects/fate/fate/python/federatedml/model_base.py", line 236, in run
    self._run(cpn_input=cpn_input)
  File "/data/projects/fate/fate/python/federatedml/model_base.py", line 314, in _run
    this_data_output = func(*params)
  File "/data/projects/fate/fate/python/federatedml/util/io_check.py", line 38, in _func
    f"num row of input({input_count}) not equals to num row of output({output_count})")
OSError: num row of input(39942) not equals to num row of output(0)

Desktop (please complete the following information):

  • OS: CentOS 7
  • Version Ansible 1.8.1

Screenshots: 微信截图_20230707142030 微信截图_20230707141911 微信截图_20230707141846

tigflanker, Jul 07 '23 06:07

The log indicates that the code called the assert_io_num_rows_equal function in this script: https://github.com/FederatedAI/FATE/blob/master/python/federatedml/util/io_check.py

What exactly is this comparing?
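Going only by the traceback (the wrapper is named `_func` and raises `OSError` with an input count and an output count), the check is probably a decorator that counts the rows passed into a component function and the rows it returns, then fails when the two differ. The sketch below is a guess at that pattern, not the actual FATE source: the name `assert_rows_equal_sketch` is hypothetical, and plain Python lists stand in for FATE tables.

```python
import functools


def assert_rows_equal_sketch(func):
    """Hypothetical stand-in for an input/output row-count check (not FATE code)."""

    @functools.wraps(func)
    def _func(*args, **kwargs):
        # Use the first list-like argument as "the input table" (assumption).
        input_count = None
        for arg in list(args) + list(kwargs.values()):
            if isinstance(arg, list):
                input_count = len(arg)
                break

        result = func(*args, **kwargs)

        # Compare only when both the input and the output are list-like.
        if input_count is not None and isinstance(result, list):
            output_count = len(result)
            if input_count != output_count:
                raise OSError(
                    f"num row of input({input_count}) not equals to "
                    f"num row of output({output_count})"
                )
        return result

    return _func


@assert_rows_equal_sketch
def predict_ok(rows):
    # Returns one prediction per input row, so the check passes.
    return [0 for _ in rows]


@assert_rows_equal_sketch
def predict_empty(rows):
    # Returns no predictions at all, so the check raises OSError.
    return []
```

Under this reading, output(0) would mean the wrapped function returned an empty table for 39942 input rows, so the decorator is only reporting the problem and the cause lies inside the wrapped function.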

tigflanker, Jul 07 '23 06:07

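My guess is that something went wrong at the scoring stage after the model finished training, so y_predict came back completely empty. But I don't know where to find the raw Python logs; the ones in the logs folder are too high-level.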

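On the same data, boosting ran successfully. For LR I forgot to tune the batch size, so it is somewhat stuck. Only the local baseline fails, and I don't know why. The upstream steps are missing-value imputation and standardization; the federated LR and SBT in the parallel branches both run through.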

微信截图_20230713111612

tigflanker, Jul 10 '23 09:07

+1

Edwin-Xu, Apr 17 '24 05:04