[backend] Can't run a data-passed pipeline many times with the same pipeline function name. Error: This step is in Error state with this message: Error (exit code 1): key unsupported: cannot get key for artifact location, because it is invalid

Open Shuai-Xie opened this issue 2 years ago • 2 comments

Environment

  • Kubeflow Pipelines Standalone on a local cluster.
  • KFP version: 1.7.0
  • KFP SDK version: 1.8.2

Steps to reproduce

Dear developers, I am trying to run the same pipeline multiple times with different arguments (line and count in the official data_pass example below).

Initially, the main pipeline function was named pipeline_func. The pipeline ran successfully the first time, but the second run failed with the error in the title. The error looks the same as in the closed issue https://github.com/kubeflow/pipelines/issues/5948 and also seems related to https://github.com/argoproj/argo-workflows/issues/6497.

To narrow down the problem, I changed the main pipeline function name several times. Here is what happened.

# (1) change the pipeline function name, success
# pipeline_func → a-new-pipeline-func-please
a-new-pipeline-func-please-s8tb5-2208969941   # launched pipeline pods
a-new-pipeline-func-please-s8tb5-2839087706

# (2) use the new name, run again, error
a-new-pipeline-func-please-snp4h-1210744786

# (3) change the pipeline function name again, success
# a-new-pipeline-func-please → a-new-pipeline-func-please-again
a-new-pipeline-func-please-again-wfq8v-351450082
a-new-pipeline-func-please-again-wfq8v-766454711

# (4) use the new name, run again, success
a-new-pipeline-func-please-again-pcbzd-2732903275
a-new-pipeline-func-please-again-pcbzd-98665068

# (5) use the new name, run again, error
a-new-pipeline-func-please-again-kg4w4-25440416
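
To make the pattern in the listing above easier to see, the pod names can be grouped by the workflow they belong to. This is only a minimal sketch: it assumes each pod name is the workflow name followed by a hyphen and a numeric suffix, which matches the names shown here but is not guaranteed for every Argo version. The helper name group_pods_by_workflow is made up for illustration.

```python
from collections import defaultdict

# Pod names copied from the listing above.
POD_NAMES = [
    "a-new-pipeline-func-please-s8tb5-2208969941",
    "a-new-pipeline-func-please-s8tb5-2839087706",
    "a-new-pipeline-func-please-snp4h-1210744786",
    "a-new-pipeline-func-please-again-wfq8v-351450082",
    "a-new-pipeline-func-please-again-wfq8v-766454711",
    "a-new-pipeline-func-please-again-pcbzd-2732903275",
    "a-new-pipeline-func-please-again-pcbzd-98665068",
    "a-new-pipeline-func-please-again-kg4w4-25440416",
]


def group_pods_by_workflow(pod_names):
    """Group pod names by workflow name.

    Assumption: pod name == "<workflow name>-<numeric suffix>", so the
    workflow name is everything before the last hyphen.
    """
    groups = defaultdict(list)
    for pod in pod_names:
        workflow, _, _suffix = pod.rpartition("-")
        groups[workflow].append(pod)
    return dict(groups)


groups = group_pods_by_workflow(POD_NAMES)
for workflow, pods in groups.items():
    print(f"{workflow}: {len(pods)} pod(s)")
```

In this listing, the runs reported as successful show two pods each, while the runs reported as errors show only one. Whether that reflects where the run failed or simply what was pasted is not clear from the listing alone.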

Here is the official data_pass example.

from kfp.components import InputPath, OutputPath, func_to_container_op
import kfp

kfp_client = kfp.Client()

@func_to_container_op
def repeat_line(line: str, count: int, output_text_path: OutputPath(str)):
    with open(output_text_path, 'w') as writer:
        for i in range(count):
            writer.write(line + '\n')


@func_to_container_op
def print_text(input_text_path: InputPath()):
    with open(input_text_path, 'r') as reader:
        for line in reader:
            print(line, end='')

def a_new_pipeline_func_please_again(line: str = 'Hello', count: int = 10):
    repeat_line_task = repeat_line(line=line, count=count)
    print_text(repeat_line_task.output)


if __name__ == '__main__':
    kfp_client.create_run_from_pipeline_func(
        a_new_pipeline_func_please_again,
        arguments={
            'line': "bbb",
            'count': 5,
        },
    )

Expected result

(two images, not reproduced here)

Regarding the artifact repository, I found some clues in the kubeflow namespace:

$ kubectl get cm -n kubeflow
NAME                                         DATA   AGE
inferenceservice-config                      9      87d
istio-ca-root-cert                           1      160d
kfp-launcher                                 1      150d
kfserving-config                             1      87d
kfserving-models-web-app-config-mtgm8bbd98   1      87d
metadata-grpc-configmap                      2      150d
ml-pipeline-ui-configmap                     1      150d
pipeline-install-config                      15     150d
workflow-controller-configmap                3      150d

$ kubectl get cm workflow-controller-configmap -n kubeflow -o yaml
apiVersion: v1
data:
  artifactRepository: |     # the artifactRepository configuration.
    archiveLogs: true
    s3:
      endpoint: "minio-service.kubeflow:9000"
      bucket: "mlpipeline"
      # keyFormat is a format pattern to define how artifacts will be organized in a bucket.
      # It can reference workflow metadata variables such as workflow.namespace, workflow.name,
      # pod.name. Can also use strftime formating of workflow.creationTimestamp so that workflow
      # artifacts can be organized by date. If omitted, will use `{{workflow.name}}/{{pod.name}}`,
      # which has potential for have collisions, because names do not guarantee they are unique
      # over the lifetime of the cluster.
      # Refer to https://kubernetes.io/docs/concepts/overview/working-with-objects/names/.
      #
      # The following format looks like:
      # artifacts/my-workflow-abc123/2018/08/23/my-workflow-abc123-1234567890
      # Adding date into the path greatly reduces the chance of {{pod.name}} collision.
      keyFormat: "artifacts/{{workflow.name}}/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{pod.name}}"
      # insecure will disable TLS. Primarily used for minio installs not configured with TLS
      insecure: true
      accessKeySecret:
        name: mlpipeline-minio-artifact
        key: accesskey
      secretKeySecret:
        name: mlpipeline-minio-artifact
        key: secretkey
  containerRuntimeExecutor: docker
  executor: |
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 0.01
        memory: 32Mi
      limits:
        cpu: 0.5
        memory: 512Mi
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"artifactRepository":"archiveLogs: true\ns3:\n  endpoint: \"minio-service.kubeflow:9000\"\n  bucket: \"mlpipeline\"\n  # keyFormat is a format pattern to define how artifacts will be organized in a bucket.\n  # It can reference workflow metadata variables such as workflow.namespace, workflow.name,\n  # pod.name. Can also use strftime formating of workflow.creationTimestamp so that workflow\n  # artifacts can be organized by date. If omitted, will use `{{workflow.name}}/{{pod.name}}`,\n  # which has potential for have collisions, because names do not guarantee they are unique\n  # over the lifetime of the cluster.\n  # Refer to https://kubernetes.io/docs/concepts/overview/working-with-objects/names/.\n  #\n  # The following format looks like:\n  # artifacts/my-workflow-abc123/2018/08/23/my-workflow-abc123-1234567890\n  # Adding date into the path greatly reduces the chance of {{pod.name}} collision.\n  keyFormat: \"artifacts/{{workflow.name}}/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{pod.name}}\"\n  # insecure will disable TLS. Primarily used for minio installs not configured with TLS\n  insecure: true\n  accessKeySecret:\n    name: mlpipeline-minio-artifact\n    key: accesskey\n  secretKeySecret:\n    name: mlpipeline-minio-artifact\n    key: secretkey\n","containerRuntimeExecutor":"docker","executor":"imagePullPolicy: IfNotPresent\nresources:\n  requests:\n    cpu: 0.01\n    memory: 32Mi\n  limits:\n    cpu: 0.5\n    memory: 512Mi\n"},"kind":"ConfigMap","metadata":{"annotations":{},"labels":{"application-crd-id":"kubeflow-pipelines"},"name":"workflow-controller-configmap","namespace":"kubeflow"}}
  creationTimestamp: "2021-11-17T03:05:35Z"
  labels:
    application-crd-id: kubeflow-pipelines
  name: workflow-controller-configmap
  namespace: kubeflow
  resourceVersion: "16984168"
  selfLink: /api/v1/namespaces/kubeflow/configmaps/workflow-controller-configmap
  uid: d0b5c55e-ed8a-4b37-835e-6a2b4d3c402c
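
Since the error message mentions the artifact key, it may help to see what the keyFormat above expands to. The following is a simplified illustration of the template substitution, not Argo's actual implementation; the function render_key and its parameters are hypothetical, and only the variables that appear in this keyFormat are handled.

```python
import re
from datetime import datetime, timezone

# keyFormat copied from the ConfigMap above.
KEY_FORMAT = (
    "artifacts/{{workflow.name}}/{{workflow.creationTimestamp.Y}}/"
    "{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{pod.name}}"
)


def render_key(key_format, workflow_name, pod_name, created):
    """Expand {{...}} placeholders in key_format (illustrative only)."""
    values = {
        "workflow.name": workflow_name,
        "pod.name": pod_name,
        # strftime-style fields of the workflow creation timestamp
        "workflow.creationTimestamp.Y": created.strftime("%Y"),
        "workflow.creationTimestamp.m": created.strftime("%m"),
        "workflow.creationTimestamp.d": created.strftime("%d"),
    }

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise ValueError(f"unsupported template variable: {name}")
        return values[name]

    return re.sub(r"\{\{\s*([^{}]+?)\s*\}\}", substitute, key_format)


# Values taken from the example in the ConfigMap comment.
example_key = render_key(
    KEY_FORMAT,
    workflow_name="my-workflow-abc123",
    pod_name="my-workflow-abc123-1234567890",
    created=datetime(2018, 8, 23, tzinfo=timezone.utc),
)
print(example_key)
# artifacts/my-workflow-abc123/2018/08/23/my-workflow-abc123-1234567890
```

With this format, two runs produce different keys as long as their workflow names or pod names differ, so comparing the key stored for a failing step against this pattern could be one way to check whether the stored artifact location is malformed.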

Please help me. Many thanks!


Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

Shuai-Xie, Apr 16 '22 07:04

Maybe it's because the @pipeline decorator is missing?

Linchin, Apr 21 '22 22:04

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot], May 03 '24 07:05

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

github-actions[bot], May 24 '24 07:05