
[backend] Cannot get MLMD objects from Metadata store. Cannot find context (Kubeflow v1.8.0 and v1.10.0)


Environment

  • How did you deploy Kubeflow Pipelines (KFP)? Following the guide at https://github.com/kubeflow/manifests/tree/v1.10-branch
  • KFP version: v2.2.0
  • KFP SDK version: kfp 2.4.0, kfp-pipeline-spec 0.2.2, kfp-server-api 2.0.3

Steps to reproduce

Here is a pipeline issue. I tried both Kubeflow v1.8.0 and v1.10.0, and both show the same problem. I have read the related issues and tried many of the suggested fixes, but nothing works. It seems like an MLMD problem (possibly related to the database?). I followed the example to build a simple pipeline; the code was attached as a screenshot in the original issue.
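Since the reporter's code was only posted as a screenshot, here is an illustrative reconstruction of the kind of minimal KFP v2 "hello world" pipeline the example guides walk through; the component and pipeline names are made up, not taken from the issue.

```python
# Minimal KFP v2 pipeline sketch; names here are illustrative.
status = "no-kfp"
try:
    from kfp import compiler, dsl
except ImportError:
    # The kfp SDK is not installed in this environment; `pip install kfp`
    # (2.x) would be needed to actually compile the pipeline.
    print("kfp SDK not available")
else:
    @dsl.component
    def say_hello(name: str) -> str:
        hello_text = f"Hello, {name}!"
        print(hello_text)
        return hello_text

    @dsl.pipeline(name="hello-pipeline")
    def hello_pipeline(recipient: str = "World") -> str:
        hello_task = say_hello(name=recipient)
        return hello_task.output

    # Compiling produces the pipeline.yaml that is then uploaded via the UI.
    compiler.Compiler().compile(hello_pipeline, package_path="pipeline.yaml")
    status = "compiled"
```

A pipeline this simple should run end to end on a healthy installation, which is why the "Cannot get MLMD objects" failure points at the metadata stack rather than the pipeline itself.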

The compiled pipeline.yaml is shown in the screenshots attached to the original issue.
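The screenshots are not recoverable here, but for reference a compiled KFP v2 pipeline spec (IR YAML) roughly has the following shape; the field values below are illustrative, not taken from the reporter's file.

```yaml
# Rough shape of a compiled KFP v2 pipeline spec (illustrative values only).
pipelineInfo:
  name: hello-pipeline
schemaVersion: 2.1.0
sdkVersion: kfp-2.4.0
components:
  comp-say-hello:
    executorLabel: exec-say-hello
deploymentSpec:
  executors:
    exec-say-hello:
      container:
        image: python:3.9
root:
  dag:
    tasks:
      say-hello:
        componentRef:
          name: comp-say-hello
```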

Then I uploaded the YAML file as a pipeline (screenshot attached in the original issue).

When I create a run of the pipeline, it fails with "Cannot get MLMD objects from Metadata store." Clicking Details shows the error in the attached screenshots.


Impacted by this bug? Give it a 👍.

ZDowney926 avatar Aug 09 '24 10:08 ZDowney926

https://github.com/kubeflow/manifests/tree/v1.10-branch is still under development. I recommend using https://github.com/kubeflow/manifests/tree/v1.9-branch or the v1.9.0 tag.

rimolive avatar Aug 14 '24 19:08 rimolive

I'm encountering the same issue, but from installing only Kubeflow Pipelines as detailed here. Will investigate the full installation.

ESKYoung avatar Aug 14 '24 21:08 ESKYoung

I have the same problem.

nparkstar avatar Sep 01 '24 14:09 nparkstar

I solved my issue at last. I did a fresh install on a new machine, and the problem has not appeared since.

But I think the following instruction is wrong: the "Install individual components" section of https://github.com/kubeflow/manifests/tree/v1.9.0.

I could not connect to the central dashboard after installing Kubeflow according to those instructions.

It succeeded after installing with the command below.

while ! kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done

Thanks,

nparkstar avatar Sep 06 '24 06:09 nparkstar

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Nov 05 '24 07:11 github-actions[bot]

Hi, I am also facing the same issue after a KFP upgrade. Could anyone help? Details: kfp-pipeline 2.3.0, kfp-server-api 2.0.3; Kubeflow: I have tried 1.8 and 1.9. Error: Cannot find context with {"typeName":"system.PipelineRun","contextName":"9d29166b-68e1-433f-b506-62a70f0e1a13"}: Cannot find specified context

It is a blocker for us to upgrade to kfp v2.

shivanibhargove avatar Dec 12 '24 11:12 shivanibhargove

@rimolive I can confirm the issue.

juliusvonkohout avatar Dec 12 '24 17:12 juliusvonkohout

I think it would help a lot if you sent the status of the pods in the kubeflow and related namespaces (auth, admission controller, etc.), as well as the Kubernetes system pods. These pods are almost like a chain, each depending on another to work, and you must find the broken link.
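The diagnostics being asked for can be collected along these lines; the namespace list below assumes a default Kubeflow manifests install and may differ in your cluster.

```shell
# Sketch: list pod status across the Kubeflow-related namespaces
# (namespace names are assumptions based on a default install).
status="no-kubectl"
if command -v kubectl >/dev/null 2>&1; then
  for ns in kubeflow istio-system auth cert-manager kube-system; do
    echo "--- pods in $ns ---"
    kubectl get pods -n "$ns" -o wide || true
  done
  status="listed"
else
  echo "kubectl not found on PATH"
fi
echo "$status"
```

Pods stuck in CrashLoopBackOff or Pending in any of these namespaces (especially the metadata-grpc deployment in kubeflow) are the usual "broken link" behind the MLMD error.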

922tech avatar Dec 13 '24 05:12 922tech

Any update on a fix? We are not able to run pipelines because of this issue.

shivanibhargove avatar Dec 20 '24 07:12 shivanibhargove

I got a pipeline to finish, but there are still errors with ml-metadata (screenshot attached in the original issue). There are also no clear errors in the ml-metadata deployments, so I think I have to examine the database directly. Maybe I will know more in a few weeks. @rimolive

CC @kubeflow/release-team

juliusvonkohout avatar Jan 06 '25 16:01 juliusvonkohout

It seems to be fixed now. With very large Azure OIDC info it can happen that you exceed the gRPC server's message size limit. But most of the problems reported here and in https://github.com/kubeflow/pipelines/issues/8733#issuecomment-2624991816 seem to come from unclean installations, or are probably fixed in KFP 2.4.0 or on the master branch. If you upgrade to KF 1.9.1, clean up your istio-system namespace. Some users here also had problems with the launcher and driver images, which had no proper versioning until 2.4.0 (https://github.com/kubeflow/manifests/pull/2953). So please create separate issues, each focused on a single problem. The master branch is probably also affected by https://github.com/kubeflow/manifests/issues/2970, and we hope to resolve that soon.

/close

juliusvonkohout avatar Jan 31 '25 16:01 juliusvonkohout

@juliusvonkohout: Closing this issue.

In response to this:

It seems to be fixed now. With very large Azure OIDC info it can happen that you exceed the gRPC server's message size limit. But most of the problems reported here and in https://github.com/kubeflow/pipelines/issues/8733#issuecomment-2624991816 seem to come from unclean installations, or are probably fixed in KFP 2.4.0 or on the master branch. If you upgrade to KF 1.9.1, clean up your istio-system namespace. Some users here also had problems with the launcher and driver images, which had no proper versioning until 2.4.0 (https://github.com/kubeflow/manifests/pull/2953). So please create separate issues, each focused on a single problem. The master branch is probably also affected by https://github.com/kubeflow/manifests/issues/2970, and we hope to resolve that soon.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

google-oss-prow[bot] avatar Jan 31 '25 16:01 google-oss-prow[bot]