pipelines icon indicating copy to clipboard operation
pipelines copied to clipboard

[sdk] Can't use create_component_from_func with pip packages when running as non-root

Open skogsbrus opened this issue 3 years ago • 11 comments

Environment

  • KFP version: 1.7 (KF 1.4)
  • KFP SDK version: 1.6.6
  • All dependencies version:
kfp                              1.6.6
kfp-pipeline-spec                0.1.13
kfp-server-api                   1.7.1

Steps to reproduce

Background:

  • Due to security concerns it's a bad idea to run containers as root.
  • For composability and maintenance, it's a good idea to define small & modular KFP components.

With this in mind, I wish to report that create_components_from_func does not work as expected when the container is run as a non-root user and when the packages_to_install parameter is used to add some runtime dependencies.

To reproduce, see attached pipeline definition at the bottom.

When this pipeline is run, the following output is seen in Kubeflow:

WARNING: The directory '/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/.local'
Check the permissions.
WARNING: The directory '/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/.local'
Check the permissions.
Error: exit status 1

Expected result

The correct behaviour here would be for packages to be installed in a location that's writable by non-root users. As a direct consequence, that location would also to have to be added to PYTHONPATH.

With the attached pipeline definition, kfp.components._python_op._get_packages_to_install_command today produces the following yaml:

      - sh
      - -c
      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location
        'tqdm' || PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location
        'tqdm' --user) && "$0" "$@"
      - sh
      - -ec
      - |
        program_path=$(mktemp)
        printf "%s" "$0" > "$program_path"
        python3 -u "$program_path" "$@"
      - |
        def hello_world():
            import tqdm
            print("Hello world!")

        import argparse
        _parser = argparse.ArgumentParser(prog='Hello world', description='')
        _parsed_args = vars(_parser.parse_args())

        _outputs = hello_world(**_parsed_args)

I propose that kfp.components._python_op._get_packages_to_install_command is changed to instead output:

      - sh
      - -c
      - (PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location
        'tqdm' || PIP_DISABLE_PIP_VERSION_CHECK=1 PYTHONUSERBASE=/tmp/pip python3 -m pip install --quiet --no-warn-script-location --cache-dir /tmp/pip-cache
        'tqdm' --user) && "$0" "$@"
      - sh
      - -ec
      - |
        PIP_CUSTOM_FOLDER=$(realpath /tmp/pip/lib/*/site-packages)
        program_path=$(mktemp)
        printf "%s" "$0" > "$program_path"
        PYTHONPATH=$PYTHONPATH:$PIP_CUSTOM_FOLDER python3 -u "$program_path" "$@"
      - |
        def hello_world():
            import tqdm
            print("Hello world")

        import argparse
        _parser = argparse.ArgumentParser(prog='Hello world', description='')
        _parsed_args = vars(_parser.parse_args())

        _outputs = hello_world(**_parsed_args)

This change would accomplish two things:

  • Allow non-root users to install pip packages on the fly
  • Allow non-root users to install packages from cache

A side note: I don't have the historical context of why KFP first tries to install packages as root and on failure as the current user with --user, IMO doing it with --user from the beginning would make more sense. But might be missing something :)

If you agree with the structure of my proposal, I can work on the change - seems like a pretty small fix.

Thanks!

Materials and Reference

Pipeline definition

import argparse
import kfp
import kubernetes

def hello_world():
    import tqdm
    print("Hello world!")

def hello_world_op():
    return kfp.components.create_component_from_func(func=hello_world, packages_to_install=['tqdm'])()

def pipeline():
    component = hello_world_op()
    user_sc = kubernetes.client.models.V1SecurityContext(run_as_user=1234)
    component.set_security_context(user_sc)

def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--yaml', type=str, required=True)
    return parser.parse_args()

def main():
    args = get_args()
    kfp.compiler.Compiler().compile(
        pipeline_func=pipeline,
        package_path=args.yaml)

if __name__ == "__main__":
    main()
    print(kfp.__version__)

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

skogsbrus avatar Feb 23 '22 12:02 skogsbrus

With KFP 1.8.11 (compared with 1.6.6) the _get_packages_to_install_command has unfortunately been refactored to be performed inline in _func_to_component_spec, making this much harder to monkeypatch.

skogsbrus avatar Mar 07 '22 16:03 skogsbrus

cc @connor-mccarthy, seems like it gets to do with pip_install info.

ji-yaqi avatar Mar 24 '22 22:03 ji-yaqi

@skogsbrus, as a short term fix, can you try adding "--no-cache-dir" as the first element in your packages_to_install array and see if this resolves the issue for you? I have not tested, but I suspect this might work.

connor-mccarthy avatar Mar 24 '22 23:03 connor-mccarthy

@connor-mccarthy I'll be setting up a fork shortly to circumvent this issue, I can confirm then.

But AFAIK that won't work either because a non-root user still does not have permission to write to the default install directory unless the image has explicitly been built with this in mind. The reason why my suggested fix works is because it installs to /tmp

skogsbrus avatar Mar 25 '22 08:03 skogsbrus

Also note that the warning and error are separate:

Can't write to cache (but not fatal)

WARNING: The directory '/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.

Can't write to install dir:

ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/.local'
Check the permissions.

skogsbrus avatar Mar 25 '22 10:03 skogsbrus

Thanks, @skogsbrus, for clarifying the warning v error. If you manage to remedy this issue, please feel to submit a PR! And perhaps include some notes on pros/cons and alternative solutions considered in this issue to help a reviewer.

connor-mccarthy avatar Mar 25 '22 17:03 connor-mccarthy

I've got a fix for this now, but I'll need to go through an approval process at work due to the CLA before I can contribute.

skogsbrus avatar Apr 01 '22 11:04 skogsbrus

FYI the contribution process has been stuck for a while and I haven't been able to prioritize this. Will update once I am able to contribute.

skogsbrus avatar Aug 30 '22 11:08 skogsbrus

Contribution process has been completed. I'll try to pick this up when I find the time (note: might take a while). In the meantime, feel free to ping me if there's anything I can clarify

skogsbrus avatar Sep 23 '22 11:09 skogsbrus

@skogsbrus did you ever contribute this? I'm hitting a variant of this issue on KFP 2.0.5

gregsheremeta avatar Feb 09 '24 18:02 gregsheremeta

Thanks for the reminder. I no longer work with Kubeflow and never got to contributing this, but I can share the patches that I made to get it working (approval was given, although quite some time after this issue was raised).

This diff is an extract from the patches I made on top of 3d2508e56500a56a9423d45ecae03256b05aa8dd.

Highlights:

  • Default to internal Docker registry
  • Allow installation of pip packages when container is executed by non-root user

Hope it helps.

diff --git a/sdk/python/kfp/components/_python_op.py b/sdk/python/kfp/components/_python_op.py
index b3665a7a4..e871ed81c 100644
--- a/sdk/python/kfp/components/_python_op.py
+++ b/sdk/python/kfp/components/_python_op.py
@@ -48,6 +48,7 @@ import docstring_parser
 T = TypeVar('T')
 
 # InputPath(list) or InputPath('JsonObject')
+DOCKER_REGISTRY = "<replace-with-your-own-docker-registry>"
 
 
 class InputPath:
@@ -152,7 +153,7 @@ def _parent_dirs_maker_that_returns_open_file(mode: str, encoding: str = None):
     return make_parent_dirs_and_return_path
 
 
-default_base_image_or_builder = 'python:3.7'
+default_base_image_or_builder = f'{DOCKER_REGISTRY}/python:3.9-slim'
 
 
 def _python_function_name_to_component_name(name):
@@ -518,7 +519,7 @@ def _func_to_component_spec(func,
 
     Args:
         func: Required. The function to be converted
-        base_image: Optional. Docker image to be used as a base image for the python component. Must have python 3.5+ installed. Default is python:3.7
+        base_image: Optional. Docker image to be used as a base image for the python component.
                     Note: The image can also be specified by decorating the function with the @python_component decorator. If different base images are explicitly specified in both places, an error is raised.
         extra_code: Optional. Python source code that gets placed before the function code. Can be used as workaround to define types used in function signature.
         packages_to_install: Optional. List of [versioned] python packages to pip install before executing the user function.
@@ -541,6 +542,10 @@ def _func_to_component_spec(func,
             base_image = default_base_image_or_builder
             if isinstance(base_image, Callable):
                 base_image = base_image()
+        else:
+            unspecified_origin = base_image.count('/') == 0
+            if unspecified_origin:
+                base_image = f"{DOCKER_REGISTRY}/{base_image}"
 
     packages_to_install = packages_to_install or []
 
@@ -764,11 +769,11 @@ _outputs = {func_name}(**_parsed_args)
 
     package_preinstallation_command = []
     if packages_to_install:
-        package_install_command_line = 'PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location {}'.format(
+        package_install_command_line = 'PYTHONUSERBASE=/tmp/pip PIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --user --quiet --no-warn-script-location {}'.format(
             ' '.join([repr(str(package)) for package in packages_to_install]))
         package_preinstallation_command = [
             'sh', '-c',
-            '({pip_install} || {pip_install} --user) && "$0" "$@"'.format(
+            '{pip_install} && "$0" "$@"'.format(
                 pip_install=package_install_command_line)
         ]
 
@@ -781,6 +786,8 @@ _outputs = {func_name}(**_parsed_args)
                 # Writing the program code to a file.
                 # This is needed for Python to show stack traces and for `inspect.getsource` to work (used by PyTorch JIT and this module for example).
                 textwrap.dedent('''\
+                    CUSTOM_PYTHONPATH="${PYTHONPATH}:$(realpath /tmp/pip/lib/*/site-packages 2>/dev/null)" || CUSTOM_PYTHONPATH=$PYTHONPATH
+                    export PYTHONPATH=$CUSTOM_PYTHONPATH
                     program_path=$(mktemp)
                     printf "%s" "$0" > "$program_path"
                     python3 -u "$program_path" "$@"
@@ -828,7 +835,7 @@ def func_to_component_text(func,
 
     Args:
         func: The python function to convert
-        base_image: Optional. Specify a custom Docker container image to use in the component. For lightweight components, the image needs to have python 3.5+. Default is python:3.7
+        base_image: Optional. Specify a custom Docker container image to use in the component.
         extra_code: Optional. Extra code to add before the function code. Can be used as workaround to define types used in function signature.
         packages_to_install: Optional. List of [versioned] python packages to pip install before executing the user function.
         modules_to_capture: Optional. List of module names that will be captured (instead of just referencing) during the dependency scan. By default the :code:`func.__module__` is captured. The actual algorithm: Starting with the initial function, start traversing dependencies. If the dependency.__module__ is in the modules_to_capture list then it's captured and it's dependencies are traversed. Otherwise the dependency is only referenced instead of capturing and its dependencies are not traversed.
@@ -1005,6 +1012,9 @@ def create_component_from_func(
     '''Converts a Python function to a component and returns a task factory
     (a function that accepts arguments and returns a task object).
 
     Args:
         func: The python function to convert
         base_image: Optional. Specify a custom Docker container image to use in the component. For lightweight components, the image needs to have python 3.5+. Default is the python image corresponding to the current python environment.

skogsbrus avatar Feb 12 '24 21:02 skogsbrus

I posted PR #10538 which addresses this in a different way. Rather than trying to get python to use something other /.local and /.cache, just make /.local and /.cache writeable by doing EmptyDir mounts to those paths. Less of an invasive change than editing the SDK, and seems to work.

gregsheremeta avatar Mar 04 '24 22:03 gregsheremeta

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar May 22 '24 07:05 github-actions[bot]