
Java SDK is not reporting workflow_failures from the code path via the FailWorkflowExceptionTypes

Open jlacefie opened this issue 1 year ago • 1 comments

Expected Behavior

Workflow Failed Cloud Metrics and Java SDK Metrics should "closely" match each other

Actual Behavior

Workflow Failed metrics differ between Temporal Cloud and SDK metrics.

Steps to Reproduce the Problem

  • A non-determinism error (NDE) should typically be retried through the workflow task (WFT). I'm not sure what exactly happened to these workflows, but we can force the workflow execution to fail by setting WorkflowImplementationOptions with .setFailWorkflowExceptionTypes(NonDeterministicException.class).
  • This allowed me to reproduce the discrepancy between the SDK's workflow_failed metric and the server/cloud's temporal_cloud_v0_workflow_failed_count metric, which seems to be the main concern.
  • It appears the Java SDK does not report workflow failures on the code path taken via FailWorkflowExceptionTypes.
  • The only place where the SDK reports workflow failures is https://github.com/temporalio/sdk-java/blob/v1.23.2/temporal-sdk/src/main/java/io/temporal/internal/replay/ReplayWorkflowExecutor.java#L97, which is not reachable when a workflow fails with WorkflowExecutionException from here: https://github.com/temporalio/sdk-java/blob/v1.23.2/temporal-sdk/src/main/java/io/temporal/internal/replay/ReplayWorkflowRunTaskHandler.java#L2[…]60
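
For reference, the worker-side configuration used to force the failure in the steps above might look roughly like the sketch below. The class name `FailOnNdeRegistration` and the `register` helper are hypothetical and only for illustration; the worker and the workflow implementation class are assumed to come from the existing application, and the sketch assumes the `temporal-sdk` dependency is on the classpath.

```java
import io.temporal.worker.NonDeterministicException;
import io.temporal.worker.Worker;
import io.temporal.worker.WorkflowImplementationOptions;

// Hypothetical helper class, not part of the SDK.
public final class FailOnNdeRegistration {
  private FailOnNdeRegistration() {}

  // Options that make a NonDeterministicException fail the workflow execution
  // instead of failing (and retrying) the workflow task.
  static WorkflowImplementationOptions failOnNdeOptions() {
    return WorkflowImplementationOptions.newBuilder()
        .setFailWorkflowExceptionTypes(NonDeterministicException.class)
        .build();
  }

  // Registers an existing workflow implementation class with these options.
  // Both arguments are assumed to be supplied by the application.
  static void register(Worker worker, Class<?> workflowImplType) {
    worker.registerWorkflowImplementationTypes(failOnNdeOptions(), workflowImplType);
  }
}
```

With a workflow registered this way, a non-determinism error during replay should fail the execution, which is the point where the SDK and Cloud failure counts can then be compared.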

Specifications

  • Version:
  • Platform: Temporal Cloud

jlacefie, Jun 10 '24 13:06