VDK_ATTEMPT_ID not accessible from within data job code or not changing when a job pod is restarted
What is the feature request? What problem does it solve?
By definition, the attempt_id should change on every job execution attempt (i.e., every time a job is retried in full):
attempt_id An identifier to be associated with the current VDK
execution attempt. If left empty it will be auto-generated.
An instance of a running Data Job deployment is called an
execution. Data Job execution can run a Data Job one or more
times. Each distinct run would be a single attempt.
Default value is: ''.
However, if a user wants to utilize the attempt_id to make their data job idempotent, they cannot, because it is not accessible from within the data job.
Additionally, the id does not appear to be updated once the Control Service sets it at deploy time. In practice it is the same as the execution_id, which it should not be.
Suggested solution
The logic of setting the attempt_id is revised and the id is exposed as part of the job_input.
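A minimal sketch of how a data job step might read the id if it were exposed this way. The `StubJobInput` class and the `pa__attempt_id` key are hypothetical and exist only so the example runs without VDK installed; the method name is loosely modeled on the execution properties a step can already read from job_input, and the real interface may differ.

```python
import uuid


class StubJobInput:
    """Hypothetical stand-in for the job_input object a step receives."""

    def __init__(self, execution_id: str):
        self._execution_id = execution_id
        # Assumption: a new id is generated for each attempt of the same execution.
        self._attempt_id = uuid.uuid4().hex

    def get_execution_properties(self) -> dict:
        return {
            "pa__execution_id": self._execution_id,
            "pa__attempt_id": self._attempt_id,  # hypothetical key
        }


def run(job_input) -> dict:
    """Step entry point: read both ids so written rows can be tagged with them."""
    props = job_input.get_execution_properties()
    return {
        "execution_id": props["pa__execution_id"],
        "attempt_id": props["pa__attempt_id"],
    }


# Two attempts of one execution: same execution id, different attempt ids.
first = run(StubJobInput("exec-1"))
retry = run(StubJobInput("exec-1"))
```

The point of the sketch is only the contract: the execution id stays fixed across retries while the attempt id does not.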
Additional context
N/A
However, if a user wants to utilize the attempt_id to make their data job idempotent, they cannot, because it is not accessible from within the data job.
Can you please explain the real use case (not an abstract one) that requires opening this issue?
Sure, we have a client that currently uses the execution_id as a composite primary key for one of their database tables and does data de-duplication. However, if a Platform Error is raised, the job is re-executed, but the execution_id remains the same. This has led to data issues, where the primary key remains the same, but the value of a random column has changed. Their data job logic cannot properly handle such situations, and having a dynamic attempt_id means they can use it instead of the execution_id.
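To make the scenario concrete, here is a rough sketch of how a per-attempt id could be used to keep such a table consistent. It uses an in-memory SQLite database in place of the client's database; the table, column names, and the `write_attempt` helper are all hypothetical, and the attempt ids are passed in by hand because the job cannot currently read one.

```python
import sqlite3


def write_attempt(conn, execution_id, attempt_id, rows):
    """Drop rows left by earlier attempts of this execution, then write the new ones."""
    with conn:  # single transaction: commits on success, rolls back on error
        conn.execute(
            "DELETE FROM events WHERE execution_id = ? AND attempt_id <> ?",
            (execution_id, attempt_id),
        )
        conn.executemany(
            "INSERT OR REPLACE INTO events "
            "(record_id, execution_id, attempt_id, value) VALUES (?, ?, ?, ?)",
            [(record_id, execution_id, attempt_id, value) for record_id, value in rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    " record_id TEXT, execution_id TEXT, attempt_id TEXT, value REAL,"
    " PRIMARY KEY (record_id, execution_id))"
)

# First attempt writes two rows before a platform error interrupts it.
write_attempt(conn, "exec-1", "attempt-1", [("r1", 0.11), ("r2", 0.22)])
# The retry keeps the execution id but produces a different random value.
write_attempt(conn, "exec-1", "attempt-2", [("r1", 0.57)])

rows = conn.execute(
    "SELECT record_id, attempt_id, value FROM events ORDER BY record_id"
).fetchall()
```

After the retry, only rows tagged with the latest attempt remain, so a leftover partial write from the failed attempt cannot sit next to the new data under the same primary key.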