Allow silencing of `WARNING - Compute Failed`

Open PGijsbers opened this issue 3 years ago • 0 comments

Related to https://github.com/dask/distributed/issues/1932, but that issue asks to silence a different API component and to silence all warnings.

I have a setup where I expect some jobs to fail, but I do not want this to be output to the log. Consider the MWE:

from dask.distributed import Client, LocalCluster, as_completed
import random

def do_experiment(i) -> float:
  if i > 0.5:
    raise ValueError("This is something that occurs sometimes")
  return i

cluster = LocalCluster(processes=False)
client = Client(cluster)

futures = as_completed(
  client.map(do_experiment, [0] * 5)
)

for i, future in enumerate(futures):
  try:
    result = future.result()
    print("Job success:", result)
  except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
    print("Job had error.")

  futures.add(
    client.submit(
      do_experiment, random.random()
    )
  )
  
  if i > 10:
    break  # Normally would be interrupted after a certain time
    
futures.clear()
client.close()
client.shutdown()


which produces the following output (macOS, Python 3.10.5, dask.distributed 2022.8.0):

Job success: 0
Job success: 0
Job success:2022-08-12 14:41:26,675 - distributed.worker - WARNING - Compute Failed
Key:       do_experiment-990420e7f85eb60a2e92f76e330bcd0d
Function:  do_experiment
args:      (0.838782283860812)
kwargs:    {}
Exception: "ValueError('This is something that occurs sometimes')"

2022-08-12 14:41:26,675 - distributed.worker - WARNING - Compute Failed
Key:       do_experiment-91fed211d8e1b35468b228563d70f176
Function:  do_experiment
args:      (0.6422373489856902)
kwargs:    {}
Exception: "ValueError('This is something that occurs sometimes')"

 0
Job success: 0
Job success: 0
Job had error.
Job had error.
Job success: 0.20617520191356742
Job success: 0.025953682383621612
2022-08-12 14:41:26,701 - distributed.worker - WARNING - Compute Failed
Key:       do_experiment-8b23c41059644c713488b7d31780303f
Function:  do_experiment
args:      (0.8186249161860835)
kwargs:    {}
Exception: "ValueError('This is something that occurs sometimes')"

2022-08-12 14:41:26,701 - distributed.worker - WARNING - Compute Failed
Key:       do_experiment-38c379a6a3cfcbce4df7137a338b1b58
Function:  do_experiment
args:      (0.5759990828414355)
kwargs:    {}
Exception: "ValueError('This is something that occurs sometimes')"

2022-08-12 14:41:26,707 - distributed.worker - WARNING - Compute Failed
Key:       do_experiment-1742f19a29e84c4e6999d86463fbf76c
Function:  do_experiment
args:      (0.8893344882046301)
kwargs:    {}
Exception: "ValueError('This is something that occurs sometimes')"

Job success: 0.031046945090379863
Job had error.
Job had error.

I don't need to see these errors at the time they occur in the worker, because I can handle them as I process my futures. The only workaround I see is the `silence_logs` option of the `LocalCluster` object, but I do not want to silence all warnings, only those that are caused by my code and that I can handle while processing the completed futures. If there are ever warnings about cluster health or anything else dask-related, I would still prefer to see them.
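
To make the request concrete, here is a rough sketch of the kind of selective silencing I mean, written with the standard `logging` module only. It assumes the warning is emitted on the `distributed.worker` logger and that its message starts with `Compute Failed`, as the output above suggests. `DropComputeFailed` is a hypothetical name of my own, not part of dask.

```python
import logging


class DropComputeFailed(logging.Filter):
    """Reject log records whose message starts with 'Compute Failed'."""

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() applies any %-style arguments before the text is inspected.
        return not record.getMessage().startswith("Compute Failed")


# Assumption: the warning is logged directly on the "distributed.worker" logger,
# as the log lines in the output above suggest.
worker_logger = logging.getLogger("distributed.worker")
worker_logger.addFilter(DropComputeFailed())
```

With `LocalCluster(processes=False)` the workers share the client's process, so a filter installed this way should apply to them. With separate worker processes it would presumably have to be installed inside each worker, for example through `client.run`. This depends on the message text staying the same, so it is a stopgap rather than a supported option.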

PGijsbers, Aug 12 '22 12:08