[BUG] Exception thrown when waiting for execution to finish
Describe the bug
The following exception is sometimes thrown when executing a workflow with a dynamic task that spins up multiple tasks:
elif e.code() == grpc.StatusCode.NOT_FOUND:
> raise FlyteEntityNotExistException() from e
E flytekit.exceptions.user.FlyteEntityNotExistException: USER:EntityNotExist: error=None, cause=<_InactiveRpcError of RPC that terminated with:
E status = StatusCode.NOT_FOUND
E details = "failed to fetch object: rpc error: code = NotFound desc = request failed with status code 404. Body: {"code":5,"message":"object 's3://union-compute-us-east-1-delvebio/metadata/propeller/clinical-pipeline-development-f76e3bd80dbf548ae8dc/n1/data/0/n3/0/outputs.pb' not found","details":[]}"
E debug_error_string = "UNKNOWN:Error received from peer {created_time:"2024-05-10T12:35:04.72984-07:00", grpc_status:5, grpc_message:"failed to fetch object: rpc error: code = NotFound desc = request failed with status code 404. Body: {\"code\":5,\"message\":\"object \'s3://union-compute-us-east-1-delvebio/metadata/propeller/clinical-pipeline-development-f76e3bd80dbf548ae8dc/n1/data/0/n3/0/outputs.pb\' not found\",\"details\":[]}"}"
E >
Expected behavior
No exception when waiting for execution
Additional context to reproduce
execution = remote.execute(
    name,
    inputs=inputs,
    version=version or None,
    **kwargs,
)
remote.wait(execution)
Screenshots
No response
Are you sure this issue hasn't been raised already?
- [X] Yes
Have you read the Code of Conduct?
- [X] Yes
The FlyteEntityNotExistException indicates that the system could not find a specific object, most likely the dynamic task's outputs (the outputs.pb file named in the error) at the expected location in S3. The object may not have been created yet, or it may have been deleted before it was retrieved. This is similar to the problems discussed in Flyte GitHub issue #3639, where dynamic tasks sometimes lead to missing entities and cause failures when the system expects those entities to exist.
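If the missing object is only transiently absent (an assumption, not something the traceback confirms), one possible stopgap is to retry the wait call a few times before giving up. The sketch below is a generic retry helper, not part of flytekit; the name `retry_on` and its parameters are hypothetical, and it does not fix the underlying cause.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_on(
    fn: Callable[[], T],
    exc_types: Tuple[Type[BaseException], ...],
    attempts: int = 5,
    delay_s: float = 2.0,
) -> T:
    """Call fn(), retrying when it raises one of exc_types.

    Hypothetical helper: re-raises the last exception once
    `attempts` calls have failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except exc_types:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            time.sleep(delay_s)  # fixed pause before the next attempt
    raise AssertionError("unreachable")


# Possible usage with the snippet from this issue (not executed here;
# assumes `remote` and `execution` exist as in the report):
#   from flytekit.exceptions.user import FlyteEntityNotExistException
#   execution = retry_on(
#       lambda: remote.wait(execution),
#       (FlyteEntityNotExistException,),
#   )
```

Only the "not found" exception is retried, so other failures still surface immediately. If the error persists after every attempt, the object is probably missing for good rather than late, and retrying will not help.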
Would you like more detailed information on potential causes and solutions for this issue?
This is an AI-generated response.