sdk-core

[Feature Request] Can we have more descriptive worker side error log?

Open KMilhan opened this issue 3 years ago • 3 comments

Is your feature request related to a problem? Please describe.

After we changed the workflow/activity code and redeployed it, we expected the running workflows to pick up the new code. Instead, we saw a burst of the following message:

  2022-09-16T04:51:36.050908Z ERROR temporal_sdk_core::worker::workflow::managed_run: Error in run machines, error: RunUpdateErr { source: Nondeterminism("Complete workflow machine does not handle this event: HistoryEvent(id: 66, Some(ActivityTaskScheduled))"), complete_resp_chan: Some(Sender { inner: Some(Inner { state: State { is_complete: false, is_closed: false, is_rx_task_set: true, is_tx_task_set: false } }) }) }

I wanted to understand the log and decide what to do next, but there was no additional log output, and nothing in the line above hinted at what to check. I wish the description of the error log from the worker side is more verbose.

Describe the solution you'd like

More descriptive error logs from worker side

KMilhan avatar Sep 16 '22 05:09 KMilhan

Yes, this is a non-determinism issue. See https://docs.temporal.io/workflows (specifically https://docs.temporal.io/workflows/#code-changes-can-cause-non-deterministic-behavior).
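As a rough illustration of why the log above appears, here is a toy model of replay in Rust. It is only a sketch and does not use the real sdk-core types: `EventKind`, `Command`, `Nondeterminism`, and `check_replay` are hypothetical names invented for this example. The idea is that the recorded history says an activity was scheduled (event 66), while the redeployed code now issues a different command at that point.

```rust
use std::fmt;

/// Simplified stand-ins for recorded history event kinds.
/// These are hypothetical and are NOT the real sdk-core types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventKind {
    ActivityTaskScheduled,
    TimerStarted,
    WorkflowExecutionCompleted,
}

/// Simplified stand-ins for commands produced by workflow code during replay.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Command {
    ScheduleActivity,
    StartTimer,
    CompleteWorkflow,
}

impl Command {
    /// The history event this command is expected to have produced.
    fn expected_event(self) -> EventKind {
        match self {
            Command::ScheduleActivity => EventKind::ActivityTaskScheduled,
            Command::StartTimer => EventKind::TimerStarted,
            Command::CompleteWorkflow => EventKind::WorkflowExecutionCompleted,
        }
    }
}

/// Describes the first point where replayed code diverged from history.
#[derive(Debug, PartialEq, Eq)]
pub struct Nondeterminism {
    pub event_id: u64,
    pub recorded: EventKind,
    pub command: Command,
}

impl fmt::Display for Nondeterminism {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "{:?} command does not match recorded event (id: {}, {:?})",
            self.command, self.event_id, self.recorded
        )
    }
}

/// Pairs each recorded event with the command the current code produces
/// and reports the first mismatch. Only the overlapping prefix is compared.
pub fn check_replay(
    history: &[(u64, EventKind)],
    commands: &[Command],
) -> Result<(), Nondeterminism> {
    for (&(event_id, recorded), &command) in history.iter().zip(commands) {
        if command.expected_event() != recorded {
            return Err(Nondeterminism { event_id, recorded, command });
        }
    }
    Ok(())
}
```

Calling `check_replay(&[(66, EventKind::ActivityTaskScheduled)], &[Command::CompleteWorkflow])` returns an error for event 66, which mirrors the shape of the reported log line: the old code scheduled an activity there, and the new code tries to complete the workflow instead.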

I wish the description of the error log from the worker side is more verbose.

Any specific suggestions on what you would like to see here? Would a link to those docs be sufficient?

This error is written from our core engine, so I am transferring the issue there for some more discussion.

cretz avatar Sep 16 '22 12:09 cretz

To clarify here, we need contextual state on our Core logs. We need workflow ID, run ID, workflow type, etc.
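One way to picture the request is an error wrapper that carries the run's identifying fields, so any log line built from it names the affected workflow. This is a minimal sketch using only the Rust standard library. `RunContext`, `ContextualError`, and `with_context` are hypothetical names for illustration and say nothing about how Core actually implements this.

```rust
use std::fmt;

/// Identifying state for one workflow run (hypothetical type for illustration).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RunContext {
    pub workflow_id: String,
    pub run_id: String,
    pub workflow_type: String,
}

/// Wraps any error together with the run it occurred in.
#[derive(Debug)]
pub struct ContextualError<E> {
    pub context: RunContext,
    pub source: E,
}

impl<E: fmt::Display> fmt::Display for ContextualError<E> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Put the context first so it is visible even if the line is truncated.
        write!(
            f,
            "workflow_id={} run_id={} workflow_type={}: {}",
            self.context.workflow_id,
            self.context.run_id,
            self.context.workflow_type,
            self.source
        )
    }
}

/// Attaches run context to the error side of a result; `Ok` values pass through.
pub fn with_context<T, E>(
    result: Result<T, E>,
    context: &RunContext,
) -> Result<T, ContextualError<E>> {
    result.map_err(|source| ContextualError {
        context: context.clone(),
        source,
    })
}
```

With a wrapper like this, an operator reading the worker log can tell which workflow ID, run ID, and workflow type hit the nondeterminism error without correlating timestamps by hand.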

cretz avatar Oct 10 '22 16:10 cretz

Will be fixed by #410

cretz avatar Oct 11 '22 22:10 cretz

There's more info available by default now.

Sushisource avatar Oct 18 '22 21:10 Sushisource