[bug] Widen urllib3 to 2.** causes GCP component failures

Open rafcis02 opened this issue 8 months ago • 5 comments

Environment

  • How do you deploy Kubeflow Pipelines (KFP)? Google VertexAI
  • KFP SDK version: 2.13.0

Steps to reproduce

Using google_cloud_pipeline_components.v1.dataproc.DataprocPySparkBatchOp as a component causes a failure.

That's because the allowed urllib3 version range was widened in https://github.com/kubeflow/pipelines/pull/11819, and newer versions introduce breaking changes. The code at https://github.com/kubeflow/pipelines/blob/fe51dfd7920e45291fc9c5bc30162c12b4bb626b/components/google-cloud/google_cloud_pipeline_components/container/v1/dataproc/utils/dataproc_util.py#L76-L81 uses urllib3.util.retry.Retry, which in the newest version (2.4.0) no longer accepts the method_whitelist constructor argument: https://urllib3.readthedocs.io/en/2.4.0/reference/urllib3.util.html#urllib3.util.Retry

Expected result

The aforementioned component should work with the latest version of urllib3, and the other components should be checked to make sure they are not affected. Alternatively, the aforementioned PR should be reverted so that urllib3 <2.0.0 is used.

Materials and reference

Labels

/area components


Impacted by this bug? Give it a 👍.

rafcis02 avatar Apr 30 '25 10:04 rafcis02

Hey @rafcis02, thanks for raising this issue. I do not think we're going to revert the PR that introduced the change, as most of the components that live in this repo have been abandoned rather than maintained by their contributors. I also do not believe our CI tests them. That being said, you're more than welcome to contribute a fix for this component.

zazulam avatar Apr 30 '25 11:04 zazulam

Thanks @zazulam for the response, but I see that there are still new releases of those components: https://github.com/kubeflow/pipelines/pull/11869

@chensun I see you made the latest release of those GCP components, 2.20, which causes the issue. With 2.19 everything works fine. I think that's because the kfp constraint was extended from <2.11.0 to <2.13.0, which pulled in urllib3 2.4, which is incompatible with the component code. Maybe just adding a urllib3<2.0.0 constraint there would solve the issue, so that the resulting Docker image contains a version compatible with the component code.

rafcis02 avatar Apr 30 '25 12:04 rafcis02

I would assume that updating the parameters passed to the Retry constructor would be better than pinning to a lower version, if this component is going to continue living in this repo.
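For illustration, here is a minimal sketch of what such a parameter update could look like. urllib3 1.26 introduced allowed_methods as the replacement for method_whitelist, and 2.x removed the old name, so the sketch picks whichever keyword the installed Retry accepts. The helper names (retry_methods_kwarg, make_retry) and the example values for total, status_forcelist and backoff_factor are assumptions made up for this example; they are not taken from dataproc_util.py.

```python
import inspect


def retry_methods_kwarg(retry_cls):
    """Return the keyword that retry_cls uses for the set of retryable HTTP methods.

    Hypothetical helper: prefers the newer ``allowed_methods`` name and falls
    back to the legacy ``method_whitelist`` name.
    """
    params = inspect.signature(retry_cls).parameters
    if "allowed_methods" in params:
        return "allowed_methods"
    if "method_whitelist" in params:
        return "method_whitelist"
    raise TypeError(f"{retry_cls!r} accepts neither allowed_methods nor method_whitelist")


def make_retry(retry_cls, methods, **kwargs):
    """Instantiate retry_cls, passing ``methods`` under whichever keyword it supports."""
    kwargs[retry_methods_kwarg(retry_cls)] = frozenset(methods)
    return retry_cls(**kwargs)


if __name__ == "__main__":
    try:
        from urllib3.util.retry import Retry
    except ImportError:
        # urllib3 is not installed; the helpers above do not depend on it.
        Retry = None

    if Retry is not None:
        # Example values only; the real component may use different settings.
        retry = make_retry(
            Retry,
            ["GET", "POST"],
            total=5,
            status_forcelist=[429, 503],
            backoff_factor=1,
        )
        print(retry)
```

If support for urllib3 older than 1.26 is not needed, simply renaming the keyword to allowed_methods in the Retry call would be the smaller change.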

zazulam avatar Apr 30 '25 13:04 zazulam

> Thanks @zazulam for the response, but I see that there are still new releases of those components: #11869
>
> @chensun I see you made the latest release of those GCP components, 2.20, which causes the issue. With 2.19 everything works fine. I think that's because the kfp constraint was extended from <2.11.0 to <2.13.0, which pulled in urllib3 2.4, which is incompatible with the component code. Maybe just adding a urllib3<2.0.0 constraint there would solve the issue, so that the resulting Docker image contains a version compatible with the component code.

If that's the case, then you could pin the kfp SDK to an earlier version to work around the issue. We can track this bug and fix it in the next gcpc release.

chensun avatar Jun 03 '25 16:06 chensun

Thanks @chensun. I needed to pin google-cloud-pipeline-components to 2.19, as the issue comes from the Docker image in which that SDK is installed.

rafcis02 avatar Jun 09 '25 07:06 rafcis02