flyte icon indicating copy to clipboard operation
flyte copied to clipboard

[Core feature] Pyflyte 'task with different structure' should show delta

Open katrogan opened this issue 1 year ago • 8 comments

Motivation: Why do you think this is important?

It's difficult as a user when I try to register Flyte inventory and get back

details: task with different structure already exists with id resource_type:TASK project:"system" domain:"production" name:"task-name" version:"1234" 

but I can't tell what's changed in my local version of the task.

Since we already store the registered task template and have the current serialized version of the task template we're trying to register, it would be super slick if flytekit (pyflyte) could compute what parameters in the template changed across registration and highlight that to the end user. (same for workflows and launch plans)

Goal: What should the final outcome look like, ideally?

Should be straightforward to identify what has changed across registration versions when the back-end blocks a re-registration attempt

Describe alternatives you've considered

Current error response code user experience

Propose: Link/Inline OR Additional context

Much like we use json marshalling to compute the hash, we could just json serialization to compute the delta here.

No response

Are you sure this issue hasn't been raised already?

  • [X] Yes

Have you read the Code of Conduct?

  • [X] Yes

katrogan avatar Jan 23 '24 01:01 katrogan

cc @mhotan

katrogan avatar Jan 23 '24 01:01 katrogan

🤖

Thank you for your feature request. I understand the need for a more detailed error message when there's a conflict in task structure during registration. This would indeed improve the user experience by making it easier to identify what has changed across registration versions.

Currently, when there is a conflict in task structure during registration, the Flyte codebase generates an error message using the CompileError struct and its associated methods in the errors package. The error message could be something like MismatchingInterfaces and is generated by calling a specific function like NewMismatchingInterfacesErr("", ""). You can find more details about this in the compiler_error_test.go file.

The task registration process involves creating a TaskTemplate message as defined in the tasks.proto file. This template includes various fields like id, type, metadata, interface, custom, target, task_type_version, security_context, extended_resources, and config that define the task.

The comparison of task structures during the registration process is not explicitly handled in the Flyte codebase based on the provided context. However, the serialization of task templates for the purpose of comparison during the registration process is handled using the Protocol Buffers (protobuf) library. The protobuf library provides methods like XXX_Marshal, XXX_Unmarshal, XXX_Merge, XXX_Size, and XXX_DiscardUnknown for serializing and deserializing the data. You can find more details about this in the tasks.pb.go file.

To implement your feature request, we could enhance the error handling to include a comparison of the current and previous versions of the task template. This could be done by serializing both versions using the protobuf library and then comparing the serialized data. If there are any differences, these could be highlighted in the error message. This would require changes to the CompileError struct and its associated methods to include the comparison results in the error message.

I hope this provides some insight into how your feature request could be implemented. If you have any further questions or need more clarification, please let me know.

Sources


This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

dosubot[bot] avatar Jan 23 '24 01:01 dosubot[bot]

You can assign this to me, pointers on how to get started is very much appreciated!

hampusrosvall avatar Feb 16 '24 06:02 hampusrosvall

Hi, @katrogan

Wanna making sure of few things before starting a PR. Thank you~

1. diff() on what?

compute what parameters in the template changed across registration

By parameters, did you mean like the task / workflow / launch plan source code itself (different code structure but with the same id) OR in container.args dict from registration request bad rpc response e.g.,

Input Request:

{ 
  "id": { 
    "resourceType": "TASK", 
    "project": "flytesnacks",
    "domain": "development",
    "name": "workflow.say_hello",
    "version": "1234"
  },
  "spec": {
    "template": {
      "id": {
        "resourceType": "TASK",
        "project": "flytesnacks",
        "domain": "development",
        "name": "workflow.say_hello",
        "version": "1234"
      },
      "type": "python-task",
      "metadata": {
        "runtime": {
          "type": "FLYTE_SDK",
          "version": "1.10.7",
          "flavor": "python"
        },
        "retries": {}
      },
      "interface": {
        "inputs": {},
        "outputs": {
          "variables": {
            "o0": {
              "type": {
                "simple": "STRING"
              },
              "description": "o0"
            }
          }
        }
      },
      "container": {
        "image": "basics:v1",
        "args": [
          "pyflyte-fast-execute",
          "--additional-distribution",
          "s3://my-s3-bucket/flytesnacks/development/M4GSLZXCTPK46HIL57X6NW6SBM======/fast229fe274ee6e0cb90581d25bcf670186.tar.gz",
          "--dest-dir",
          "/root",
          "--",
          "pyflyte-execute",
          "--inputs",
          "{{.input}}",
          "--output-prefix",
          "{{.outputPrefix}}",
          "--raw-output-data-prefix",
          "{{.rawOutputDataPrefix}}",
          "--checkpoint-path",
          "{{.checkpointOutputPrefix}}",
          "--prev-checkpoint",
          "{{.prevCheckpointPrefix}}",
          "--resolver",
          "flytekit.core.python_auto_container.default_task_resolver",
          "--",
          "task-module",
          "workflow",
          "task-name",
          "say_hello"
        ],
        "resources": {}
      }
    },
    "description": {
      "longDescription": {
        "format": "DESCRIPTION_FORMAT_RST"
      }
    }
  }
}

2. Where to diff()?

Should we handle delta computation in backend flyteadmin (language-agnostic) or in flytekit (python)?

Some tradeoff to consider:

  • proto file breaking change?
  • sdk generalizability?

cc @Future-Outlier

austin362667 avatar Feb 20 '24 08:02 austin362667

I think it'd be hard to diff the source code although that would be spectacular, but the diff check performed on the server doesn't even evaluate the source code so in this case it's not super relevant

Ideally we should diff across the task spec (like you dumped!) since that's used to compute the digest that ultimately determines whether the structure has changes across (re-)registration

  1. Great question, I like the idea of having a language agnostic implementation! We can use something like https://github.com/flyteorg/flyte/blob/962d45827646f106265630a0fb1d9ae74968bc32/flyteidl/protos/flyteidl/admin/workflow.proto#L89 and extend the error message to include structured data about what in the spec has changed

For example, this where we initialize the error message in flyteadmin code: https://github.com/flyteorg/flyte/blob/962d45827646f106265630a0fb1d9ae74968bc32/flyteadmin/pkg/errors/errors.go#L112-L126

katrogan avatar Feb 20 '24 17:02 katrogan

2. extend the error message to include structured data about what in the spec has changed

What if we just show diff as string, in order to prevent modify too much task proto buff?

austin362667 avatar Feb 21 '24 07:02 austin362667

@austin362667 can we close this out?

katrogan avatar May 14 '24 18:05 katrogan

Certainly! Thank you, Katrina.

austin362667 avatar May 15 '24 03:05 austin362667

thank you @austin362667 for all your work in improving this!!

katrogan avatar May 16 '24 17:05 katrogan