[backend] Cannot get MLMD objects from Metadata store: received initial metadata size exceeds limit

Open vkatrychenko opened this issue 9 months ago • 3 comments

Environment

  • How did you deploy Kubeflow Pipelines (KFP)? deployed kubeflow using k8s manifests (standalone kubeflow components): https://github.com/kubeflow/manifests
  • KFP version: kfp 2.4.1
  • KFP SDK version: kfp-sdk: 2.12.1, 2.11.0
  • MySQL server: 8.0.21 on Azure flexible MySQL (also tried self-hosted MySQL v8.0.26)

Steps to reproduce

Deploy the latest Kubeflow version (v1.10.0-rc.2) with dex/oauth2-proxy auth. Set up the Microsoft auth method for dex. Then try to run a pipeline.

When running pipelines under a static kf user, everything works fine.

Expected result

No errors should appear under pipeline-ui.

Materials and Reference

The issue appears to be caused by the large Azure OIDC info, according to this thread: https://github.com/kubeflow/pipelines/issues/8733#issuecomment-2627771197

I tried downgrading and upgrading the gRPC ML Metadata service, but it did not help.

Do you have a workaround for this?

BTW: we saw this error in v1.9.x as well


Impacted by this bug? Give it a 👍.

vkatrychenko, Mar 11 '25 15:03

For context: everything is working fine in kubeflow 1.7.1.

vkatrychenko, Mar 14 '25 10:03

Fixed by adding the argument to metadata-grpc-deployment:

spec:
  ...
  template:
    ...
    spec:
      containers:
        - args:
            - '--grpc_channel_arguments=grpc.max_metadata_size=16384'  # 16 KB instead of the default 8 KB
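
For deployments managed with kustomize, the same argument could be appended through an overlay instead of editing the Deployment by hand. The following is only a sketch under assumptions: the `../base` resource path is hypothetical, and the patch assumes the metadata gRPC container is the first container in the pod and already has an `args` list (appending with `/-` fails otherwise).

```yaml
# kustomization.yaml (hypothetical overlay; adjust the resource path to your layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base  # hypothetical path to the manifests that define metadata-grpc-deployment
patches:
  - target:
      group: apps
      version: v1
      kind: Deployment
      name: metadata-grpc-deployment
    # JSON 6902 patch: append one argument to the first container's existing args list
    patch: |-
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --grpc_channel_arguments=grpc.max_metadata_size=16384
```

Keeping the change in an overlay means it survives re-applying the upstream manifests, which a manual edit of the live Deployment would not.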

NJ3rsey, Jun 09 '25 13:06

@hbelmiro @HumairAK should we make this the default, maybe even 32k? See also https://github.com/kubeflow/manifests/tree/master/common/oauth2-proxy#known-issues

juliusvonkohout, Jun 12 '25 15:06

We appear to have a user running into the same issue. Agree with @juliusvonkohout that 16kb should be the default.

droctothorpe, Jul 17 '25 18:07

Confirmed the fix worked for us. Tysm @NJ3rsey!

droctothorpe, Jul 17 '25 18:07