luigi icon indicating copy to clipboard operation
luigi copied to clipboard

Incorrect return type in on_failure() causes batch emails to silently fail

Open Gollum999 opened this issue 4 years ago • 1 comments

Environment:

  • Python 3.7
  • luigi 2.8.3

The docs for Task.on_failure state:

The returned value of this method is json encoded and sent to the scheduler as the expl argument. Its string representation will be used as the body of the error email sent out if any.

However, this is a bit misleading - based on the current behavior, it appears to assume that the return value is already a string. If that is not the case, batch emails will silently fail to send. (Per-worker emails still seem to work fine.)

This should reproduce the error (running against a scheduler that has batch_emails enabled):

class BugTask(luigi.Task):
    def on_failure(self, exception):
        print('failed!', exception)  # Whoops, implicitly returning None

    def run(self):
        raise RuntimeError('ERROR')

I think there are two problems contributing to this:

  1. BatchNotifier._format_expl assumes that expl is already a str when it calls expl.rstrip().
    1. As far as I can tell, expl ultimately comes from here.
  2. BatchNotifier.send_email wraps everything in a try...finally block, swallowing the AttributeError from the above call to rstrip. EDIT: I'm forgetting my Python. finally won't suppress the error, so this problem lies somewhere else (potentially in our logging config).

Gollum999 avatar Sep 25 '20 20:09 Gollum999

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If closed, you may revisit when your time allows and reopen! Thank you for your contributions.

stale[bot] avatar Jan 09 '22 02:01 stale[bot]