sdk-java
Java SDK does not report workflow_failures on the code path that goes through FailWorkflowExceptionTypes
Expected Behavior
Workflow Failed Cloud Metrics and Java SDK Metrics should "closely" match each other
Actual Behavior
Workflow Failed metrics differ between Temporal Cloud and SDK metrics.
Steps to Reproduce the Problem
- A NonDeterministicException (NDE) should typically be retried at the Workflow Task (WFT) level. I'm not sure exactly what happened to the affected workflows, but the Workflow Execution can be forced to fail by setting WorkflowImplementationOptions with .setFailWorkflowExceptionTypes(NonDeterministicException.class).
- This allowed me to reproduce the discrepancy between the SDK's workflow_failed metric and the server/Cloud metric temporal_cloud_v0_workflow_failed_count, which seems to be the main concern.
- It appears the Java SDK is not reporting workflow_failures on the code path that goes through FailWorkflowExceptionTypes.
- The only place the SDK reports workflow_failures is https://github.com/temporalio/sdk-java/blob/v1.23.2/temporal-sdk/src/main/java/io/temporal/internal/replay/ReplayWorkflowExecutor.java#L97, which is not reachable when a workflow fails with WorkflowExecutionException from here: https://github.com/temporalio/sdk-java/blob/v1.23.2/temporal-sdk/src/main/java/io/temporal/internal/replay/ReplayWorkflowRunTaskHandler.java#L2[…]60
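
For reference, the worker setup used to force the failure can be sketched roughly as follows. This is a minimal sketch, not the exact reproduction code: the class names `ReproWorker`, `ReproWorkflow`, `ReproWorkflowImpl` and the task queue `repro-task-queue` are hypothetical, the worker is assumed to connect to a local Temporal service rather than Temporal Cloud, and the code change that actually triggers the non-determinism (as well as the metrics scope setup) is omitted.

```java
import io.temporal.client.WorkflowClient;
import io.temporal.serviceclient.WorkflowServiceStubs;
import io.temporal.worker.NonDeterministicException;
import io.temporal.worker.Worker;
import io.temporal.worker.WorkerFactory;
import io.temporal.worker.WorkflowImplementationOptions;
import io.temporal.workflow.WorkflowInterface;
import io.temporal.workflow.WorkflowMethod;

public class ReproWorker {

  // Hypothetical workflow interface, used only for illustration.
  @WorkflowInterface
  public interface ReproWorkflow {
    @WorkflowMethod
    void run();
  }

  // Hypothetical implementation. The real reproduction needs a body that is
  // changed between runs so that replay hits a NonDeterministicException.
  public static class ReproWorkflowImpl implements ReproWorkflow {
    @Override
    public void run() {
      // Workflow logic omitted.
    }
  }

  public static void main(String[] args) {
    // Assumption: a Temporal service is reachable on the default local address.
    WorkflowServiceStubs service = WorkflowServiceStubs.newLocalServiceStubs();
    WorkflowClient client = WorkflowClient.newInstance(service);
    WorkerFactory factory = WorkerFactory.newInstance(client);
    Worker worker = factory.newWorker("repro-task-queue");

    // Fail the Workflow Execution on NonDeterministicException instead of
    // failing (and retrying) the Workflow Task.
    WorkflowImplementationOptions options =
        WorkflowImplementationOptions.newBuilder()
            .setFailWorkflowExceptionTypes(NonDeterministicException.class)
            .build();

    worker.registerWorkflowImplementationTypes(options, ReproWorkflowImpl.class);
    factory.start();
  }
}
```

With this option set, a workflow that hits an NDE ends in a Failed state on the server side, which is the case where the SDK-side failure metric appears not to be incremented.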
Specifications
- Version:
- Platform: Temporal Cloud