[bug] Widen urllib3 to 2.** causes GCP component failures
Environment
- How do you deploy Kubeflow Pipelines (KFP)? Google VertexAI
- KFP SDK version: 2.13.0
Steps to reproduce
Using google_cloud_pipeline_components.v1.dataproc.DataprocPySparkBatchOp as a component causes a failure.
That's because the allowed urllib3 version range was widened in https://github.com/kubeflow/pipelines/pull/11819, and the newer versions introduce breaking changes.
The code at https://github.com/kubeflow/pipelines/blob/fe51dfd7920e45291fc9c5bc30162c12b4bb626b/components/google-cloud/google_cloud_pipeline_components/container/v1/dataproc/utils/dataproc_util.py#L76-L81 uses urllib3.util.retry.Retry, which in the newest version (2.4.0) no longer accepts the method_whitelist constructor argument:
https://urllib3.readthedocs.io/en/2.4.0/reference/urllib3.util.html#urllib3.util.Retry
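For anyone who wants to confirm this in their own image, here is a minimal sketch of a check. The helper names are made up for illustration and the argument values are placeholders, not the ones from dataproc_util.py. As far as I know, method_whitelist was deprecated in urllib3 1.26 in favor of allowed_methods and removed in 2.0, so the call should raise a TypeError on 2.x.

```python
def legacy_kwarg_error(retry_cls):
    """Call retry_cls with the pre-2.0 keyword.

    Returns the TypeError message if the keyword is rejected,
    or None if the class still accepts it.
    """
    try:
        # Placeholder values; only the keyword name matters here.
        retry_cls(total=3, method_whitelist=["GET", "POST"])
    except TypeError as exc:
        return str(exc)
    return None


def check_installed_urllib3():
    """Report the installed urllib3 version and whether method_whitelist fails."""
    # Imported lazily so legacy_kwarg_error can be used without urllib3 installed.
    import urllib3
    from urllib3.util.retry import Retry

    return urllib3.__version__, legacy_kwarg_error(Retry)
```

Running check_installed_urllib3() inside the component's container should show which urllib3 version was actually installed and whether the old keyword is still accepted there.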
Expected result
The aforementioned component should work with the latest version of urllib3, and other components should be checked to make sure they are not affected. Alternatively, the aforementioned PR could be reverted so that urllib3 <2.0.0 is used.
Materials and reference
Labels
/area components
Impacted by this bug? Give it a 👍.
Hey @rafcis02, thanks for raising this issue. I do not think we're going to revert the PR that introduced the change, as most of the components that live within the repo have been abandoned rather than maintained by their contributors. I also do not believe our CI handles testing for them. That being said, you're more than welcome to contribute a fix for this component.
Thanks @zazulam for the response, but I see that there are still new releases of those components: https://github.com/kubeflow/pipelines/pull/11869
@chensun I see you did the latest release of those GCP components, 2.20, which causes the issue. With 2.19 everything works fine. I think that's because the constraint for kfp was extended from <2.11.0 to <2.13.0, which pulled in urllib3 2.4, and that is incompatible with the component's code. Maybe just adding a urllib3<2.0.0 constraint there would solve the issue, so that the output Docker image contains a version compatible with the component's code.
I would assume updating the parameters sent to the Retry instantiation would be best rather than forcing the pin to a lower version if this component is going to continue living in this repo.
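A minimal sketch of what that could look like, assuming the only incompatibility is the method_whitelist keyword (renamed to allowed_methods, as far as I know from urllib3 1.26 onward). The helper names and the retry values below are hypothetical and not taken from dataproc_util.py; the idea is just to pick whichever keyword the installed Retry class accepts.

```python
import inspect


def methods_kwarg_name(retry_cls):
    """Return the keyword retry_cls accepts for the list of retryable HTTP methods."""
    params = inspect.signature(retry_cls.__init__).parameters
    # Prefer the newer name when both exist (urllib3 1.26.x accepts both).
    for name in ("allowed_methods", "method_whitelist"):
        if name in params:
            return name
    raise TypeError(
        f"{retry_cls!r} accepts neither allowed_methods nor method_whitelist"
    )


def build_retry(retry_cls, methods, **kwargs):
    """Instantiate retry_cls, passing methods under the keyword it supports."""
    kwargs[methods_kwarg_name(retry_cls)] = methods
    return retry_cls(**kwargs)


def make_urllib3_retry():
    """Example use with the real class; the values are illustrative only."""
    # Imported lazily so the helpers above do not require urllib3.
    from urllib3.util.retry import Retry

    return build_retry(
        Retry,
        ["GET", "POST"],
        total=5,
        status_forcelist=[429, 503],
        backoff_factor=1,
    )
```

If the component only needs to support urllib3 1.26 and later, simply renaming the argument to allowed_methods at the call site would be simpler than this version-agnostic helper.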
If that's the case, then you could pin the kfp SDK to an earlier version to work around the issue. We can track this bug and fix it in the next gcpc release.
Thanks @chensun. I needed to pin google-cloud-pipeline-components to 2.19, as the issue comes from the Docker image where that SDK is installed.