
Add specific directory to store data to upload

Open asdfsx opened this issue 1 year ago • 2 comments

System information

  • FATE Flow version: 2.0.0

Describe the feature and the current behavior/state.

Add a specific directory for storing data to upload. The directory can be confined, and fate-flow would search this directory for the data to upload.

Any Other info.

I installed fate-flow with python setup.py install, which placed it in /data/projects/fate/venv/lib/python3.8/site-packages/. Then I uploaded data with the command flow data upload -c examples/upload/upload_host.json. The content of upload_host.json is:

{
  "file": "examples/data/breast_hetero_host.csv",
  "head": true,
  "partitions": 16,
  "extend_sid": true,
  "meta": {
    "delimiter": ",",
    "match_id_name": "id"
  },
  "namespace": "experiment",
  "name": "breast_hetero_host"
}

But the upload job failed. In fateboard, the job's log shows that fate-flow searches for the data file in /data/projects/fate/venv/lib/python3.8/site-packages/fate_flow-2.0.0-py3.8.egg:

[ERROR][2024-01-16 03:51:53,303][1518][_wraps.run][line:87]: Traceback (most recent call last):
  File "/data/projects/fate/venv/lib/python3.8/site-packages/fate_flow-2.0.0-py3.8.egg/fate_flow/components/entrypoint/cli.py", line 90, in execute
    io_meta = execute_component(task_config)
  File "/data/projects/fate/venv/lib/python3.8/site-packages/fate_flow-2.0.0-py3.8.egg/fate_flow/components/entrypoint/cli.py", line 121, in execute_component
    component.execute(config, outputs)
  File "/data/projects/fate/venv/lib/python3.8/site-packages/fate_flow-2.0.0-py3.8.egg/fate_flow/components/cpn.py", line 40, in execute
    return self.callback(config, outputs)
  File "/data/projects/fate/venv/lib/python3.8/site-packages/fate_flow-2.0.0-py3.8.egg/fate_flow/components/components/upload.py", line 36, in upload
    upload_data(config, outputs)
  File "/data/projects/fate/venv/lib/python3.8/site-packages/fate_flow-2.0.0-py3.8.egg/fate_flow/components/components/upload.py", line 46, in upload_data
    data = upload_object.run(
  File "/data/projects/fate/venv/lib/python3.8/site-packages/fate_flow-2.0.0-py3.8.egg/fate_flow/components/components/upload.py", line 203, in run
    data_table_count = self.save_data_table(job_id)
  File "/data/projects/fate/venv/lib/python3.8/site-packages/fate_flow-2.0.0-py3.8.egg/fate_flow/components/components/upload.py", line 215, in save_data_table
    input_feature_count = self.get_count(input_file)
  File "/data/projects/fate/venv/lib/python3.8/site-packages/fate_flow-2.0.0-py3.8.egg/fate_flow/components/components/upload.py", line 307, in get_count
    with open(input_file, "r", encoding="utf-8") as fp:
FileNotFoundError: [Errno 2] No such file or directory: '/data/projects/fate/venv/lib/python3.8/site-packages/fate_flow-2.0.0-py3.8.egg/fate_flow/examples/data/breast_hetero_host.csv'
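
The traceback suggests the relative "file" value was joined onto the installed fate_flow package directory. One possible workaround, sketched below, is to rewrite the upload config so that "file" is an absolute path before running flow data upload. This assumes fate-flow leaves absolute paths untouched, which nothing in this thread confirms; the helper name absolutize_upload_conf and the output file name are made up for illustration.

```python
import json
from pathlib import Path


def absolutize_upload_conf(conf_path, out_path, base_dir=None):
    """Write a copy of an upload config whose "file" entry is absolute.

    Hypothetical helper, not part of fate-flow. A relative "file" is
    resolved against base_dir (default: the current working directory).
    """
    conf = json.loads(Path(conf_path).read_text(encoding="utf-8"))
    base = Path(base_dir) if base_dir is not None else Path.cwd()

    data_file = Path(conf["file"])
    if not data_file.is_absolute():
        # Anchor the path where the data actually lives instead of
        # letting it be resolved inside site-packages.
        data_file = (base / data_file).resolve()
    conf["file"] = str(data_file)

    Path(out_path).write_text(json.dumps(conf, indent=2), encoding="utf-8")
    return conf
```

For example, calling absolutize_upload_conf("examples/upload/upload_host.json", "upload_host_abs.json") from the directory that contains examples/ would produce a config that could then be passed to flow data upload -c upload_host_abs.json.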

asdfsx avatar Jan 16 '24 06:01 asdfsx

By the way, I think the config file could also be separated from the project. Right now fate_flow_server searches for the config file in the directory {FATE_FLOW_PROJECT}/conf. Why not add an environment variable such as FATEFLOW_CONFPATH to specify where the config is stored? Or add a parameter for fate_flow_server, such as --config, to specify the config when fate_flow starts up.
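
To make the suggestion concrete, here is a rough sketch of the lookup order I have in mind: an explicit --config argument first, then the FATEFLOW_CONFPATH environment variable, then the current {FATE_FLOW_PROJECT}/conf default. Neither the flag nor the variable exists in fate-flow today as far as I know; both names, and the function resolve_conf_dir, are only proposals.

```python
import argparse
from pathlib import Path


def resolve_conf_dir(argv, environ, project_dir):
    """Pick the config directory: --config, then env var, then default.

    Proposal only: --config and FATEFLOW_CONFPATH are hypothetical and
    are not options that fate_flow_server is known to support.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--config", default=None)
    # parse_known_args ignores any other server arguments in argv.
    args, _unknown = parser.parse_known_args(argv)

    if args.config:
        return Path(args.config)

    env_value = environ.get("FATEFLOW_CONFPATH")
    if env_value:
        return Path(env_value)

    # Fall back to the current behavior: {FATE_FLOW_PROJECT}/conf.
    return Path(project_dir) / "conf"
```

At startup this could be called as resolve_conf_dir(sys.argv[1:], os.environ, project_dir), so existing deployments that set neither option keep working unchanged.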

asdfsx avatar Jan 17 '24 01:01 asdfsx

Thank you for your feedback. Currently, examples need to be manually pulled to the local directory. We will also optimize the configuration directory you mentioned in the future.

zhihuiwan avatar Jan 17 '24 02:01 zhihuiwan