A2ML cannot report a SIGKILL that occurs during task handling

Open holyketzer opened this issue 3 years ago • 0 comments

For example, the import data task https://app.auger.ai/admin/cluster_tasks/295252, run with a large dataset, ends with the worker process receiving SIGKILL:

[2021-02-24 10:34:41,824: INFO/MainProcess] Received task: a2ml.tasks_queue.tasks_hub_api.import_data_task[a211b108-2684-40e6-8ee6-dd82520e46ae]
[2021-02-24 10:34:43,864: WARNING/ForkPoolWorker-346] [warning] SOCKS support in urllib3 requires the installation of optional dependencies: specifically, PySocks. For more information, see https://urllib3.readthedocs.io/en/latest/contrib.html#socks-proxies
[2021-02-24 10:34:44,539: WARNING/ForkPoolWorker-346] [warning] FileType Enum is Deprecated in > 1.0.39. Use strings instead.
[2021-02-24 10:34:45,182: WARNING/ForkPoolWorker-346] [warning] Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
[2021-02-24 10:35:36,728: INFO/ForkPoolWorker-346] <azureml.core.authentication.ServicePrincipalAuthentication object at 0x7f4e0d4c54d0>
[2021-02-24 10:35:38,392: INFO/ForkPoolWorker-346] [azure] Convert file s3://auger-att-v89xvf/workspace/projects/william_test/files/d2JnHtSFuTq4k3Eyd5exBr-skill_level_vectors_pca_64-fe0c55.feather to parquet format.
[2021-02-24 10:37:34,563: ERROR/MainProcess] Process 'ForkPoolWorker-346' pid:14051 exited with 'signal 9 (SIGKILL)'
[2021-02-24 10:37:34,575: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL).')
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/billiard/pool.py", line 1267, in mark_as_worker_lost
    human_status(exitcode)),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL).

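A2ML should have some way to report this error back to the Hub. Currently the failure is only logged by the worker's main process (the WorkerLostError above) and never reaches the Hub.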
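A minimal sketch of what such reporting could look like, assuming the failure is handled in the worker's main process (the forked child is already dead and cannot report anything itself). The function names, the payload fields, and the `send` callback are hypothetical and not part of A2ML or the Hub API; the only thing taken from the log above is the `signal 9 (SIGKILL)` message format.

```python
import re

# Matches the "signal 9 (SIGKILL)" fragment seen in the WorkerLostError message.
_SIGNAL_RE = re.compile(r"signal (\d+) \((\w+)\)")


def build_failure_report(task_id, exc):
    """Build a plain dict describing a failed task (hypothetical payload shape)."""
    message = str(exc)
    match = _SIGNAL_RE.search(message)
    return {
        "task_id": task_id,
        "status": "error",
        "error_type": type(exc).__name__,
        "error_message": message,
        # None when the failure was not caused by a signal.
        "signal_number": int(match.group(1)) if match else None,
        "signal_name": match.group(2) if match else None,
    }


def report_failure(task_id, exc, send):
    """Build the report and pass it to `send`, a caller-supplied callable
    standing in for whatever mechanism delivers results to the Hub."""
    report = build_failure_report(task_id, exc)
    send(report)
    return report
```

One place this might be called from is a receiver for Celery's `task_failure` signal, which passes `task_id` and `exception` to its handlers. Whether that signal is dispatched for a lost worker depends on the Celery version, so this would need to be verified against the version A2ML runs on.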

holyketzer, Feb 24 '21 10:02