CI case test_lightgbm fails intermittently

Open jinchihe opened this issue 6 years ago • 4 comments

Step #2: ________________________________ test_lightgbm _________________________________
Step #2: [gw6] linux -- Python 3.7.5 /usr/local/bin/python
Step #2:
Step #2:     def test_lightgbm():
Step #2:         file_dir = os.path.dirname(__file__)
Step #2:         notebook_rel_path = "../../../examples/lightgbm/distributed-training.ipynb"
Step #2:         notebook_abs_path = os.path.normpath(
Step #2:             os.path.join(file_dir, notebook_rel_path))
Step #2:         expected_messages = [
Step #2:             "Copying gs://fairing-lightgbm/regression-example/regression.train.weight",
Step #2:             "[LightGBM] [Info] Finished initializing network",  # dist training setup
Step #2:             "[LightGBM] [Info] Iteration:10, valid_1 l2 : 0.2",
Step #2:             "[LightGBM] [Info] Finished training",
Step #2:             "Prediction mean: 0.5",
Step #2:             ", count: 500"
Step #2:         ]
Step #2: >       run_notebook_test(notebook_abs_path, expected_messages)
Step #2:
Step #2: tests/integration/gcp/test_running_in_notebooks.py:50:
Step #2: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Step #2: tests/integration/helpers.py:18: in run_notebook_test
Step #2:     output_path = execute_notebook(notebook_path, parameters=parameters)
Step #2: tests/integration/helpers.py:14: in execute_notebook
Step #2:     parameters=parameters)
Step #2: /usr/local/lib/python3.7/site-packages/papermill/execute.py:108: in execute_notebook
Step #2:     raise_for_execution_errors(nb, output_path)
Step #2: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Step #2:
Step #2: nb = {'cells': [{'cell_type': 'code', 'metadata': {'inputHidden': True, 'hide_input': True}, 'execution_count': None, 'sour...nd_time': '2019-11-22T07:14:42.232762', 'duration': 60.111942, 'exception': True}}, 'nbformat': 4, 'nbformat_minor': 2}
Step #2: output_path = '/tmp/tmp6ga1cg6y/out.ipynb'
Step #2:
Step #2:     def raise_for_execution_errors(nb, output_path):
...
>           raise error
Step #2: E           papermill.exceptions.PapermillExecutionError:
Step #2: E           ---------------------------------------------------------------------------
Step #2: E           Exception encountered at "In [8]":
Step #2: E           ---------------------------------------------------------------------------
Step #2: E           RuntimeError                              Traceback (most recent call last)
Step #2: E           <ipython-input-8-0e4bf631467b> in <module>
Step #2: E           ----> 1 lightgbm.execute(config=predict_params, docker_registry=DOCKER_REGISTRY)
Step #2: E
Step #2: E           /usr/local/lib/python3.7/site-packages/kubeflow_fairing-0.7.0.1-py3.7.egg/kubeflow/fairing/frameworks/lightgbm.py in execute(config, docker_registry, base_image, namespace, stream_log, cores_per_worker, memory_per_worker, pod_spec_mutators)
Step #2: E               315         config['machine_list_file'] = "mlist.txt"
Step #2: E               316     output_map = generate_context_files(
Step #2: E           --> 317         config, config_file_name, num_machines)
Step #2: E               318
Step #2: E               319     preprocessor = BasePreProcessor(
Step #2: E
Step #2: E           /usr/local/lib/python3.7/site-packages/kubeflow_fairing-0.7.0.1-py3.7.egg/kubeflow/fairing/frameworks/lightgbm.py in generate_context_files(config, config_file_name, num_machines)
Step #2: E               253                                                      MLIST_FILE_NAME)]
Step #2: E               254     entrypoint_file_name = _generate_entrypoint(
Step #2: E           --> 255         copy_files_before, copy_files_after, config_in_docker, init_cmds, copy_patitioned_files)
Step #2: E               256     output_map[entrypoint_file_name] = ENTRYPOINT
Step #2: E               257     output_map[utils.__file__] = os.path.join(
Step #2: E
Step #2: E           /usr/local/lib/python3.7/site-packages/kubeflow_fairing-0.7.0.1-py3.7.egg/kubeflow/fairing/frameworks/lightgbm.py in _generate_entrypoint(copy_files_before, copy_files_after, config_file, init_cmds, copy_patitioned_files)
Step #2: E               123
Step #2: E               124     # copying files that are common to all workers
Step #2: E           --> 125     buf.extend(_get_commands_for_file_ransfer(copy_files_before))
Step #2: E               126
Step #2: E               127     buf.append("echo 'All files are copied!'")
Step #2: E
Step #2: E           /usr/local/lib/python3.7/site-packages/kubeflow_fairing-0.7.0.1-py3.7.egg/kubeflow/fairing/frameworks/lightgbm.py in _get_commands_for_file_ransfer(files_map)
Step #2: E                92             cmds.append(storage_obj.copy_cmd(k, v))
Step #2: E                93         else:
Step #2: E           ---> 94             raise RuntimeError("Remote file {} does't exist".format(k))
Step #2: E                95     return cmds
Step #2: E                96
Step #2: E
Step #2: E           RuntimeError: Remote file gs://kubeflow-ci-fairing/lightgbm/example/model_2019_11_22_07_13_48.txt does't exist
Step #2:
Step #2: /usr/local/lib/python3.7/site-packages/papermill/execute.py:192: PapermillExecutionError

jinchihe avatar Nov 22 '19 07:11 jinchihe

Issue-Label Bot is automatically applying the label kind/bug to this issue, with a confidence of 0.99. Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback!

Links: app homepage, dashboard and code for this bot.

issue-label-bot[bot] avatar Nov 22 '19 07:11 issue-label-bot[bot]

@abhi-g Any idea for the problem? Thanks.

jinchihe avatar Jan 03 '20 02:01 jinchihe

I'll take a look at this.

abhi-g avatar Jan 03 '20 03:01 abhi-g

/area engprod
/priority p2

jtfogarty avatar Jan 15 '20 22:01 jtfogarty