
Exceptions in output()/complete() emit DEPENDENCY_MISSING events

Open Gollum999 opened this issue 4 years ago • 3 comments

Environment:

  • Python 3.7
  • luigi 2.8.3

As the title states, if an exception is raised inside Task.output() or Task.complete(), the worker ends up emitting a DEPENDENCY_MISSING event, which seems incorrect given the event's name and the behavior of other event types. I would expect a BROKEN_TASK event, or possibly just a regular FAILURE.

In my case, I am trying to add specialized error handling for scheduling failures, which according to worker.py should include errors in Task.complete() and Task.deps() (and by default, Task.output() and Task.requires() by association). However, I currently cannot distinguish via the Event API unexpected errors in complete() from missing external dependencies, which in our case are common and expected.
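One possible stopgap, sketched here under stated assumptions: because the DEPENDENCY_MISSING handler receives only the task, a handler can call complete() again itself and see whether it raises. The helper name `classify_dependency_missing` and the two stand-in classes are hypothetical and exist only for illustration; they are not part of luigi.

```python
import traceback


def classify_dependency_missing(task):
    """Guess why DEPENDENCY_MISSING fired for `task`.

    Returns ("error", traceback_text) if task.complete() raises,
    otherwise ("missing", None). Hypothetical helper, not a luigi API.
    """
    try:
        task.complete()
    except Exception:
        # complete() itself is broken: treat as an unexpected error.
        return "error", traceback.format_exc()
    # complete() ran without raising: assume a genuinely missing dependency.
    return "missing", None


class MissingExternal:
    """Hypothetical stand-in for an external task whose output is absent."""

    def complete(self):
        return False


class BrokenComplete:
    """Hypothetical stand-in for a task whose complete() raises."""

    def complete(self):
        raise RuntimeError("boom")


if __name__ == "__main__":
    print(classify_dependency_missing(MissingExternal())[0])
    print(classify_dependency_missing(BrokenComplete())[0])
```

With luigi, the helper could be called from a function registered with `@luigi.Task.event_handler(luigi.Event.DEPENDENCY_MISSING)`. The trade-off is that complete() runs a second time, which may be expensive and may not reproduce a transient error, so this does not replace fixing the event type in the worker.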

If this was an intentional choice, I'd be curious to hear the reasoning as I find it to be quite unintuitive.

(It's also worth noting that the documentation around event types is almost completely nonexistent, so there was a lot of experimentation and digging through source code before finding this behavior, but that is a separate issue.)

Gollum999 avatar Oct 12 '20 20:10 Gollum999

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If closed, you may revisit when your time allows and reopen! Thank you for your contributions.

stale[bot] avatar Jan 09 '22 02:01 stale[bot]

What about this issue? Is this a bug? The situation is even worse when one of the Range* tasks requires a WrapperTask: any exception raised in the WrapperTask's requires() then emits DEPENDENCY_MISSING instead of BROKEN_TASK or FAILURE. It also prevents inspecting the exception, because the DEPENDENCY_MISSING event handler does not receive an exception argument!

mateka avatar Apr 09 '24 12:04 mateka

The line emitting DEPENDENCY_MISSING is ten years old. I don't think anyone remembers the reasoning. The person who wrote it is still at Spotify, but they don't depend so much on Luigi anymore. If you think BROKEN_TASK would be preferable, I suggest contributing a PR. Most users would probably agree with you, and it is unlikely that any code would depend on getting DEPENDENCY_MISSING for an exception.

lallea avatar Apr 18 '24 12:04 lallea