
Exception when uploading 1,000,000 (100w) rows of data

Open songsong124 opened this issue 1 year ago • 2 comments

Version 1.7.2: uploading 1,000,000 rows of data raises an exception.

are connecting to the correct HDFS RPC port
Traceback (most recent call last):
  File "/data/projects/fate/fate/python/fate_arch/storage/hdfs/_table.py", line 77, in _put_all
    writer.write(hdfs_utils.serialize(k, v))
  File "pyarrow/io.pxi", line 283, in pyarrow.lib.NativeFile.write
  File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
OSError: HDFS Write failed, errno: 255 (Unknown error 255) Please check that you are connecting to the correct HDFS RPC port

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pyarrow/io.pxi", line 262, in pyarrow.lib.NativeFile.flush
  File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
OSError: HDFS Flush failed, errno: 255 (Unknown error 255) Please check that you are connecting to the correct HDFS RPC port

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/projects/fate/fateflow/python/fate_flow/worker/task_executor.py", line 195, in run
    cpn_output = run_object.run(cpn_input)
  File "/data/projects/fate/fateflow/python/fate_flow/components/_base.py", line 149, in run
    method(cpn_input)
  File "/data/projects/fate/fateflow/python/fate_flow/components/upload.py", line 203, in _run
    data_table_count = self.save_data_table(job_id, name, namespace, storage_engine, head)
  File "/data/projects/fate/fateflow/python/fate_flow/components/upload.py", line 233, in save_data_table
    self.upload_file(input_file, head, job_id, input_feature_count)
  File "/data/projects/fate/fateflow/python/fate_flow/components/upload.py", line 338, in upload_file
    table.put_all(data)
  File "/data/projects/fate/fate/python/fate_arch/storage/_table.py", line 122, in put_all
    self._put_all(kv_list, **kwargs)
  File "/data/projects/fate/fate/python/fate_arch/storage/hdfs/_table.py", line 79, in _put_all
    counter = counter + 1
  File "pyarrow/io.pxi", line 132, in pyarrow.lib.NativeFile.close
  File "pyarrow/error.pxi", line 99, in pyarrow.lib.check_status
OSError: HDFS Flush failed, errno: 255 (Unknown error 255) Please check that you are connecting to the correct HDFS RPC port

songsong124, Feb 21 '24 03:02

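The Spark engine in version 1.7 appears to have a problem in the data upload stage. It was fixed officially in a later release. I'm not sure exactly which version contains the fix, but version 1.11 at least no longer has this problem.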

hust-suwb, Feb 28 '24 07:02

Add the parameter dfs.client.block.write.replace-datanode-on-failure.policy with the value NEVER to hdfs-site.xml.

That resolved the upload problem above. My setup is a single-node Hadoop/Spark deployment, and all workflows now run normally.
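
For anyone applying the same change, here is a minimal sketch of what the resulting hdfs-site.xml entry could look like, written as a small Python helper that adds or updates the property. The helper name `set_hdfs_property` and the sample file contents (including the `dfs.replication` entry) are assumptions made for illustration; only the property key and the value NEVER come from this thread. As far as I understand, the default policy tries to swap in a replacement datanode when a write pipeline fails, which a single-node cluster cannot provide, so treat this as a workaround for small clusters rather than a general recommendation.

```python
import xml.etree.ElementTree as ET

# Property key and value taken from the comment above.
POLICY_KEY = "dfs.client.block.write.replace-datanode-on-failure.policy"


def set_hdfs_property(xml_text: str, name: str, value: str) -> str:
    """Return xml_text with <property> `name` set to `value`.

    Hypothetical helper for illustration: it updates the property if it
    already exists and appends a new <property> element otherwise.
    """
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == name:
            val = prop.find("value")
            if val is None:
                val = ET.SubElement(prop, "value")
            val.text = value
            break
    else:
        # No existing entry was found, so append a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Assumed sample content; a real hdfs-site.xml will contain other entries.
SAMPLE = """<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>"""

updated = set_hdfs_property(SAMPLE, POLICY_KEY, "NEVER")
print(updated)
```

In practice the same `<property>` element can simply be added by hand inside `<configuration>`. Because the key is a `dfs.client` setting, the edited file most likely needs to be the one visible to the process doing the upload, followed by a restart of the affected services.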

songsong124, Feb 28 '24 07:02