[Core feature] Pyflyte 'task with different structure' should show delta
Motivation: Why do you think this is important?
It's difficult as a user when I try to register Flyte inventory and get back
details: task with different structure already exists with id resource_type:TASK project:"system" domain:"production" name:"task-name" version:"1234"
but I can't tell what's changed in my local version of the task.
Since we already store the registered task template and have the current serialized version of the task template we're trying to register, it would be super slick if flytekit (pyflyte) could compute what parameters in the template changed across registration and highlight that to the end user. (same for workflows and launch plans)
Goal: What should the final outcome look like, ideally?
Should be straightforward to identify what has changed across registration versions when the back-end blocks a re-registration attempt
Describe alternatives you've considered
Current error response code user experience
Propose: Link/Inline OR Additional context
Much like we use json marshalling to compute the hash, we could just json serialization to compute the delta here.
No response
Are you sure this issue hasn't been raised already?
- [X] Yes
Have you read the Code of Conduct?
- [X] Yes
cc @mhotan
🤖
Thank you for your feature request. I understand the need for a more detailed error message when there's a conflict in task structure during registration. This would indeed improve the user experience by making it easier to identify what has changed across registration versions.
Currently, when there is a conflict in task structure during registration, the Flyte codebase generates an error message using the CompileError struct and its associated methods in the errors package. The error message could be something like MismatchingInterfaces and is generated by calling a specific function like NewMismatchingInterfacesErr("", ""). You can find more details about this in the compiler_error_test.go file.
The task registration process involves creating a TaskTemplate message as defined in the tasks.proto file. This template includes various fields like id, type, metadata, interface, custom, target, task_type_version, security_context, extended_resources, and config that define the task.
The comparison of task structures during the registration process is not explicitly handled in the Flyte codebase based on the provided context. However, the serialization of task templates for the purpose of comparison during the registration process is handled using the Protocol Buffers (protobuf) library. The protobuf library provides methods like XXX_Marshal, XXX_Unmarshal, XXX_Merge, XXX_Size, and XXX_DiscardUnknown for serializing and deserializing the data. You can find more details about this in the tasks.pb.go file.
To implement your feature request, we could enhance the error handling to include a comparison of the current and previous versions of the task template. This could be done by serializing both versions using the protobuf library and then comparing the serialized data. If there are any differences, these could be highlighted in the error message. This would require changes to the CompileError struct and its associated methods to include the comparison results in the error message.
I hope this provides some insight into how your feature request could be implemented. If you have any further questions or need more clarification, please let me know.
Sources
- flytepropeller/pkg/compiler/errors/compiler_error_test.go
- flyteidl/protos/flyteidl/core/tasks.proto
- flyteidl/gen/pb-go/flyteidl/core/tasks.pb.go
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.
Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.
If you want to continue the conversation, start your reply with @dosu-bot.
You can assign this to me, pointers on how to get started is very much appreciated!
Hi, @katrogan
Wanna making sure of few things before starting a PR. Thank you~
1. diff() on what?
compute what parameters in the template changed across registration
By parameters, did you mean like the task / workflow / launch plan source code itself (different code structure but with the same id) OR in container.args dict from registration request bad rpc response e.g.,
Input Request:
{
"id": {
"resourceType": "TASK",
"project": "flytesnacks",
"domain": "development",
"name": "workflow.say_hello",
"version": "1234"
},
"spec": {
"template": {
"id": {
"resourceType": "TASK",
"project": "flytesnacks",
"domain": "development",
"name": "workflow.say_hello",
"version": "1234"
},
"type": "python-task",
"metadata": {
"runtime": {
"type": "FLYTE_SDK",
"version": "1.10.7",
"flavor": "python"
},
"retries": {}
},
"interface": {
"inputs": {},
"outputs": {
"variables": {
"o0": {
"type": {
"simple": "STRING"
},
"description": "o0"
}
}
}
},
"container": {
"image": "basics:v1",
"args": [
"pyflyte-fast-execute",
"--additional-distribution",
"s3://my-s3-bucket/flytesnacks/development/M4GSLZXCTPK46HIL57X6NW6SBM======/fast229fe274ee6e0cb90581d25bcf670186.tar.gz",
"--dest-dir",
"/root",
"--",
"pyflyte-execute",
"--inputs",
"{{.input}}",
"--output-prefix",
"{{.outputPrefix}}",
"--raw-output-data-prefix",
"{{.rawOutputDataPrefix}}",
"--checkpoint-path",
"{{.checkpointOutputPrefix}}",
"--prev-checkpoint",
"{{.prevCheckpointPrefix}}",
"--resolver",
"flytekit.core.python_auto_container.default_task_resolver",
"--",
"task-module",
"workflow",
"task-name",
"say_hello"
],
"resources": {}
}
},
"description": {
"longDescription": {
"format": "DESCRIPTION_FORMAT_RST"
}
}
}
}
2. Where to diff()?
Should we handle delta computation in backend flyteadmin (language-agnostic) or in flytekit (python)?
Some tradeoff to consider:
proto filebreaking change?sdkgeneralizability?
cc @Future-Outlier
I think it'd be hard to diff the source code although that would be spectacular, but the diff check performed on the server doesn't even evaluate the source code so in this case it's not super relevant
Ideally we should diff across the task spec (like you dumped!) since that's used to compute the digest that ultimately determines whether the structure has changes across (re-)registration
- Great question, I like the idea of having a language agnostic implementation! We can use something like https://github.com/flyteorg/flyte/blob/962d45827646f106265630a0fb1d9ae74968bc32/flyteidl/protos/flyteidl/admin/workflow.proto#L89 and extend the error message to include structured data about what in the spec has changed
For example, this where we initialize the error message in flyteadmin code: https://github.com/flyteorg/flyte/blob/962d45827646f106265630a0fb1d9ae74968bc32/flyteadmin/pkg/errors/errors.go#L112-L126
2. extend the error message to include structured data about what in the spec has changed
What if we just show diff as string, in order to prevent modify too much task proto buff?
@austin362667 can we close this out?
Certainly! Thank you, Katrina.
thank you @austin362667 for all your work in improving this!!