Provide a way to retrieve the list of failures for one activity
Is your feature request related to a problem? Please describe. When an activity is retried more than once, there is currently no way to access its earlier failures. Both the SDK and the UI expose only the last failure.
Describe the solution you'd like It would be nice to have access to the activity failure history.
Describe alternatives you've considered
Additional context
Customers have asked how to find out, from inside the activity code, what the last failure was (either a server-side TimeoutFailure, in which case it would be helpful to know which timeout, or a retryable ApplicationFailure thrown by the activity). I think this would be a useful feature.
Right now from both, the SDK and the UI, it is only possible to see the last failure.
From inside an activity, it's not possible to see the failure of the previous attempt (unless the activity uses a Client to describe the workflow and look at pendingActivities.lastFailure). To get the last failure in the Activity Context (like this), the Worker would need a Failure returned in PollActivityTaskQueueResponse (or a list of all previous Failures).
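As a rough illustration of the workaround mentioned above, here is a minimal sketch of picking out lastFailure for one activity from a workflow description. It assumes the description has already been fetched with a Client and converted to a plain dictionary using the JSON field names pendingActivities, activityId, and lastFailure; the helper name and the example data are hypothetical, not part of any SDK.

```python
from typing import Any, Mapping, Optional


def last_failure_for_activity(
    describe_response: Mapping[str, Any], activity_id: str
) -> Optional[Mapping[str, Any]]:
    """Return the lastFailure recorded for a pending activity, or None.

    Hypothetical helper: `describe_response` is assumed to be a dict-shaped
    copy of a workflow description, not an SDK object.
    """
    for pending in describe_response.get("pendingActivities", []):
        if pending.get("activityId") == activity_id:
            return pending.get("lastFailure")
    return None


# Made-up example data, shaped like the assumed description.
example_response = {
    "pendingActivities": [
        {
            "activityId": "1",
            "attempt": 3,
            "lastFailure": {"message": "unique constraint violated"},
        }
    ]
}

failure = last_failure_for_activity(example_response, "1")
```

Note that this only ever yields the most recent failure, which is the limitation this issue is about: earlier failures are not present in the description.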
cc @mjameswh
I stumbled upon this problem recently. I had an activity failure due to a bug in my code. Temporal, as expected, retried the activity. However, only the first failure contained a clue about the bug. The remaining retry attempts failed with a DB constraint violation, because the first failure left the system in a semi-consistent state (a unique ID was taken during the first attempt). This made the problem harder to debug, because I had to dig through worker logs to find the original error.
I think it would have helped if Temporal kept the details for the first failure too.
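To make the suggestion concrete, here is a minimal sketch of a record that retains the first failure alongside the most recent one. The class and field names are hypothetical and only illustrate the idea; they do not describe how Temporal stores failures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AttemptFailure:
    attempt: int
    message: str


@dataclass
class FailureHistory:
    """Hypothetical record keeping the first and the latest failure."""

    first: Optional[AttemptFailure] = None
    last: Optional[AttemptFailure] = None

    def record(self, attempt: int, message: str) -> None:
        failure = AttemptFailure(attempt, message)
        if self.first is None:
            # The first failure is set once and never overwritten.
            self.first = failure
        self.last = failure


history = FailureHistory()
history.record(1, "bug in activity code")
history.record(2, "db constraint violation")
history.record(3, "db constraint violation")
```

In the scenario above, `history.first` would still hold the attempt that revealed the bug, while `history.last` holds the latest constraint violation.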