[RuntimeEnv] RAY_RUNTIME_ENV_CREATE_WORKING_DIR not works for the env_vars

Open fyrestone opened this issue 1 year ago • 11 comments

What happened + What you expected to happen

Set an env var in the runtime env:

"env_vars": {
    "PYTHONPATH": "${RAY_RUNTIME_ENV_CREATE_WORKING_DIR}:${RAY_RUNTIME_ENV_CREATE_WORKING_DIR}/bazel-bin"
}

Then the worker will get

'PYTHONPATH': '/tmp/ray/session_2024-04-10_08-50-17_849822_3886368/runtime_resources/working_dir_files/_ray_pkg_d4bd26c1dfbb01fe::/bazel-bin

Only the first occurrence is expanded correctly; the following ones expand to empty strings.
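For comparison, here is a minimal sketch of what Python's standard os.path.expandvars does with a repeated variable. The variable name DEMO_WORKING_DIR and the path /tmp/demo_pkg are hypothetical stand-ins, and this is not Ray's own expansion code; it only shows the result one would expect if every occurrence were expanded.

```python
import os

# Hypothetical stand-in for RAY_RUNTIME_ENV_CREATE_WORKING_DIR and its value.
os.environ["DEMO_WORKING_DIR"] = "/tmp/demo_pkg"

# Same shape as the PYTHONPATH value in the report: the variable appears twice.
template = "${DEMO_WORKING_DIR}:${DEMO_WORKING_DIR}/bazel-bin"

# os.path.expandvars substitutes every occurrence of a defined variable.
expanded = os.path.expandvars(template)
print(expanded)
```

With plain expandvars, both occurrences are replaced, so the second path ends in /tmp/demo_pkg/bazel-bin rather than /bazel-bin.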

Versions / Dependencies

2.10.0

Reproduction script

Please try it by yourself.

Issue Severity

High: It blocks me from completing my task.

fyrestone avatar Apr 10 '24 15:04 fyrestone

from typing import Optional
import logging

logger = logging.getLogger(__name__)


def doc_url(fragment: str) -> str:
    """Generate the full documentation URL for a given fragment."""
    base_url = "https://www.pantsbuild.org/docs/"
    return f"{base_url}{fragment}"


def softwrap(text: str) -> str:
    """Soft wrap text for logging."""
    return ' '.join(line.strip() for line in text.strip().splitlines())


class FileContent:
    def __init__(self, path: str, content: bytes):
        self.path = path
        self.content = content


class MyClass:
    def __init__(self, args: list[str]):
        self.args = args

    def check_and_warn_if_python_version_configured(self, config: Optional[FileContent]) -> bool:
        """Determine if we can dynamically set `--python-version` and warn if not."""
        configured = []

        if config and b"python_version" in config.content:
            configured.append(
                softwrap(
                    f"""
                    `python_version` in {config.path} (which is used because of either config
                    discovery or the `[mypy].config` option)
                    """
                )
            )

        if "--py2" in self.args:
            configured.append("`--py2` in the `--mypy-args` option")

        if any(arg.startswith("--python-version") for arg in self.args):
            configured.append("`--python-version` in the `--mypy-args` option")

        if configured:
            formatted_configured = " and you set ".join(configured)
            logger.warning(
                softwrap(
                    f"""
                    You set {formatted_configured}. Normally, Pants would automatically set this
                    for you based on your code's interpreter constraints
                    ({doc_url('python-interpreter-compatibility')}). Instead, it will
                    use what you set.

                    (Allowing Pants to automatically set the option allows Pants to partition your
                    targets by their constraints, so that, for example, you can run MyPy on
                    Python 2-only code and Python 3-only code at the same time. It also allows Pants
                    to leverage MyPy's cache, making subsequent runs of MyPy very fast.
                    In the future, this feature may no longer work.)
                    """
                )
            )

        return bool(configured)

naresheslavath60 avatar May 22 '24 04:05 naresheslavath60

May I know why you want to set PYTHONPATH? working_dir should already add itself to PYTHONPATH: https://github.com/ray-project/ray/blob/723d6d9f05082540bf075e9f060bfbcd28b85620/python/ray/_private/runtime_env/working_dir.py#L193

rynewang avatar Jul 23 '24 21:07 rynewang

This issue is being discussed at: https://ray.slack.com/archives/C01DLHZHRBJ/p1721722954048529

shaikhismail avatar Jul 23 '24 21:07 shaikhismail

Set some PATHs relative to the working dir.

fyrestone avatar Jul 24 '24 09:07 fyrestone

I'm running into a similar issue. I used bazel to build the docker image. Let's say the name of the image built is ismail-rayserve-apps.

Reproduction

Error: kubectl describe rayservice rayservice-sample-bazel

serveConfigV2:
  applications:
  - name: math_app
    import_path: rayserve_apps.bin.runfiles._main.rubrik.ismail.rayserve_apps.conditional_dag.serve_dag
    route_prefix: /calc
    runtime_env:
      env_vars: {"PYTHONPATH": "/"}
    ...
Traceback (most recent call last):
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/application_state.py", line 1042, in build_serve_application
    app = call_app_builder_with_args_if_necessary(import_attr(import_path), args)
  File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/_private/utils.py", line 1191, in import_attr
    module = importlib.import_module(module_name)
  File "/home/ray/anaconda3/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 972, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 984, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'rayserve_apps'

        Status:        DEPLOY_FAILED

This happens even though the config above sets PYTHONPATH to /.

Location of conditional_dag.py in docker image

I also found that the conditional_dag.py is at /rayserve-apps.bin.runfiles/_main/rubrik/ismail/rayserve-apps:

(2024-08-08 16:01:35)Ismail.Shaikh@AMER-C91Y2P0MQV:~/work (bazel-rayserve)$ docker run -it ismail-rayserve-apps /bin/bash
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
(base) ray@8e4e2c90921d:~$
(base) ray@8e4e2c90921d:~$ find / -iname "conditional_dag.py"
/rayserve-apps.bin.runfiles/_main/rubrik/ismail/rayserve-apps/conditional_dag.py

Source Files and directory structure

ls rayserve-apps/
BUILD.bazel    __pycache__        conditional_dag.py     ray-service.sample.bazel.yaml

BUILD.bazel

load("@io_bazel_rules_docker//container:container.bzl", "container_image", "container_push")

# Add a py_binary rule if you depend on other python files
py_library(
    name = "ray_test_app",
    srcs = ["conditional_dag.py"],  
)

py_image(
    name = "rayserve-apps",
    base = "@image_raybase_2_11_0//image",  # This targets the base image
    srcs = [":ray_test_app"],  # The py_binary target above
    deps = [
        ":ray_test_app",
    ],
    main = "conditional_dag.py",
)

Assume that bazel build //path_to_target:rayserve-apps, plus some additional logic that loads the Bazel-built image into Docker, eventually produced the ismail-rayserve-apps Docker image.

ray-service.sample.bazel.yaml

apiVersion: ray.io/v1
kind: RayService
metadata:
  name: rayservice-sample-bazel
spec:
  # serveConfigV2 takes a yaml multi-line scalar, which should be a Ray Serve multi-application config. See https://docs.ray.io/en/latest/serve/multi-app.html.
  serveConfigV2: |
    applications:
      - name: math_app
        import_path: rayserve_apps.bin.runfiles._main.rubrik.ismail.rayserve_apps.conditional_dag.serve_dag
        route_prefix: /calc
        runtime_env:
          env_vars: {"PYTHONPATH": "/"}
        ...

conditional_dag.py

You can assume that the code for this was taken from the working_dir zip given in https://raw.githubusercontent.com/ray-project/kuberay/v1.1.1/ray-operator/config/samples/ray-service.sample.yaml

Suggested Resolution

1. Set PYTHONPATH in py_image()

py_image(
    name = "rayserve-apps",
    base = "@image_raybase_2_11_0//image",  # This targets the base image
    srcs = [":ray_test_app"],  # The py_binary target above
    deps = [
        ":ray_test_app",
    ],
    env = {
        "PYTHONPATH": "/rayserve-apps.bin.runfiles/_main"
    },
    main = "rubrik.ismail.rayserve_apps.conditional_dag",
)

Questions

  1. Is using runtime_env.env_vars: {"PYTHONPATH": "/"} in the RayService YAML the correct way to resolve this problem? I ask because I don't see it being set in the worker pod:
kubectl exec -it rsample-bazel-raycluster-8djxx-worker-small-group-jpqsg -- bash
Defaulted container "ray-worker" out of: ray-worker, wait-gcs-ready (init)
(base) ray@rsample-bazel-raycluster-8djxx-worker-small-group-jpqsg:~$
(base) ray@rsample-bazel-raycluster-8djxx-worker-small-group-jpqsg:~$ echo $PYTHONPATH

(base) ray@rsample-bazel-raycluster-8djxx-worker-small-group-jpqsg:~$
  2. Are there any other approaches than the suggested resolution above?

shaikhismail avatar Aug 09 '24 06:08 shaikhismail

@shaikhismail I think the problem you are encountering is about the dots in the file names. See:

The import path

rayserve_apps.bin.runfiles._main.rubrik.ismail.rayserve_apps.conditional_dag.serve_dag

The file path

/rayserve-apps.bin.runfiles/_main/rubrik/ismail/rayserve-apps/conditional_dag.py

Because of the dots in rayserve-apps.bin.runfiles, Python does not treat rayserve_apps.bin.runfiles as a single package name when importing. Instead, it looks for a package named rayserve_apps in /, which is not found. I understand this layout comes from Bazel, so maybe we can work around it by:

  1. Removing the dotted prefix from the import path: import_path: _main.rubrik.ismail.rayserve_apps.conditional_dag.serve_dag
  2. Adding env_vars: {"PYTHONPATH": "/rayserve-apps.bin.runfiles"}

Alternatively, you can set your PYTHONPATH to the directory that actually contains the module, /rayserve-apps.bin.runfiles/_main/rubrik/ismail/rayserve-apps, and set import_path to just conditional_dag.serve_dag.
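The effect of the dots can be reproduced locally with the standard library alone. The sketch below builds a temporary directory tree whose names (demo_apps.bin.runfiles, demo_pkg, demo_dag) are hypothetical stand-ins for the Bazel layout above; try_import is an illustrative helper, not a Ray API.

```python
import importlib
import os
import sys
import tempfile


def try_import(module_name, search_root):
    """Return True if module_name imports with search_root on sys.path."""
    sys.path.insert(0, search_root)
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False
    finally:
        sys.path.remove(search_root)


# Build <root>/demo_apps.bin.runfiles/_main/demo_pkg/demo_dag.py
root = tempfile.mkdtemp()
runfiles = os.path.join(root, "demo_apps.bin.runfiles")
pkg_dir = os.path.join(runfiles, "_main", "demo_pkg")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "demo_dag.py"), "w") as f:
    f.write("serve_dag = 'ok'\n")

# Dots in the import path are package separators, so Python looks for a
# top-level package called "demo_apps", which does not exist under root.
dotted_ok = try_import("demo_apps.bin.runfiles._main.demo_pkg.demo_dag", root)

# Pointing the search path at the dotted directory itself avoids the problem.
rooted_ok = try_import("_main.demo_pkg.demo_dag", runfiles)

print(dotted_ok, rooted_ok)
```

The first import fails and the second succeeds, which matches the two-step workaround: shorten the import path and put the dotted directory on PYTHONPATH.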

Please let me know if this works.

Re PYTHONPATH not being set in the pod's bash: yes, it is only set for the Python process we are going to start, not for the whole pod.
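The same scoping can be seen with any child process. This is a generic sketch using subprocess, with a hypothetical variable name DEMO_PYTHONPATH; it does not use Ray and only illustrates why a shell opened separately in the pod would not see the variable.

```python
import os
import subprocess
import sys

# Pass the variable only to the child process (hypothetical variable name).
child_env = {**os.environ, "DEMO_PYTHONPATH": "/"}

result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('DEMO_PYTHONPATH'))"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)

child_value = result.stdout.strip()                # what the child process sees
parent_value = os.environ.get("DEMO_PYTHONPATH")   # what this process sees

print(child_value, parent_value)
```

The child reports the value while the launching process, like a separate kubectl exec shell, has nothing set.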

rynewang avatar Aug 13 '24 01:08 rynewang

Yes @rynewang, this afternoon, before I saw your comment, I had approached it precisely the way you describe, and it worked for me. I came back to Slack to mention what worked and then saw your comment. Thanks for taking the time to respond.

shaikhismail avatar Aug 13 '24 03:08 shaikhismail

For the issue from @fyrestone, I think it's a matter of clarifying the scope of each env var. RAY_RUNTIME_ENV_CREATE_WORKING_DIR is only set while Ray creates runtime envs, e.g. when it runs pip install or conda install; it is intended to support cases where the installation needs to read config files from the working dir. For the running process itself, it should be OK to just use $PWD, as it's set to the working dir. That is to say:

ray.init(runtime_env={"env_vars": {"PYTHONPATH": "$PYTHONPATH:$PWD/bazel-bin"}, "working_dir": "/your/dir"})

It's worth noting that $ variable expansion happens when we run the process, not when we create the envs. This means that if you have a pip or conda field along with env_vars, the env_vars won't take effect during pip install or conda install, which is why we provide RAY_RUNTIME_ENV_CREATE_WORKING_DIR. For use in the worker process, $PWD should suffice.

To clarify:

[Image: env vars in runtime envs (1)]

rynewang avatar Aug 13 '24 03:08 rynewang

In my test, os.environ["PWD"] is the working directory; however, the result of expandvars is not correct for either ${PWD} or $PWD. The PWD variable may be updated after expandvars runs.
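The ordering hazard described here can be sketched without Ray. The directories /launch/dir and /working/dir are hypothetical, and the code only illustrates the hypothesis that expansion runs before PWD is updated; it is not a trace of Ray's actual code path.

```python
import os

saved_pwd = os.environ.get("PWD")  # remember the real value so we can restore it

# Hypothetical directory the worker was launched from.
os.environ["PWD"] = "/launch/dir"
early = os.path.expandvars("$PWD/bazel-bin")  # expansion happens first

# PWD is updated to the working dir only afterwards.
os.environ["PWD"] = "/working/dir"
late = os.path.expandvars("$PWD/bazel-bin")   # what a later expansion would give

# Restore the original environment.
if saved_pwd is None:
    os.environ.pop("PWD", None)
else:
    os.environ["PWD"] = saved_pwd

print(early)
print(late)
```

If expansion really does run before the update, the env var keeps the stale launch directory even though os.environ["PWD"] read inside the worker looks correct.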

fyrestone avatar Aug 24 '24 10:08 fyrestone

Can you share a repro, @fyrestone? I think we expand vars when we run the worker Python process.

rynewang avatar Aug 24 '24 20:08 rynewang

import ray

ray.init(runtime_env={"working_dir": ".", "env_vars": {"MY_ENV": "$PWD"}})


@ray.remote
def get_my_env():
    import os
    return os.environ["MY_ENV"], os.getcwd()


print(ray.get(get_my_env.remote()))

The os.environ["MY_ENV"] should be the cwd in the worker.

fyrestone avatar Aug 27 '24 10:08 fyrestone

@rynewang I seem to be hitting this same issue on ray 2.44.

The problem I am trying to solve is that I am uploading a bazel runfiles directory as the working_dir and I have a secondary directory within that I need to be able to import from, e.g. external/lib_name, which is within the working_dir I uploaded -- and I need to import lib_name in my python process. When running from within bazel itself this is not a problem because bazel adds a ton of different subdirectories that it creates to the pythonpath, including this external/ dir, but I'm unable to replicate this within ray.

To solve this I tried the same things shown above and none of them work: ${RAY_RUNTIME_ENV_CREATE_WORKING_DIR} is blank, ${PYTHONPATH} is blank, and ${PWD} refers to the wrong directory (it seems to be the root directory the ray worker was launched from rather than the working dir).

What I'm trying to do is essentially this: PYTHONPATH=$PYTHONPATH:$PYTHONPATH/external
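One possible workaround, sketched under the assumption that the worker's current directory is the uploaded working_dir (which this thread does not confirm for every Ray version), is to extend sys.path from inside the Python process instead of relying on env var expansion. The helper name add_subdirs_to_sys_path is hypothetical and would need to run in the task or actor before importing lib_name.

```python
import os
import sys


def add_subdirs_to_sys_path(subdirs, base=None):
    """Prepend base/<subdir> to sys.path for each subdir; return the paths added."""
    base = os.getcwd() if base is None else base
    added = []
    for sub in subdirs:
        path = os.path.join(base, sub)
        if path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added


# Make <cwd>/external importable, mirroring PYTHONPATH=...:<working_dir>/external.
added = add_subdirs_to_sys_path(["external"])
print(added)
```

Because the path is computed at call time from os.getcwd(), it does not depend on when or whether $PWD or ${RAY_RUNTIME_ENV_CREATE_WORKING_DIR} is expanded.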

JayThomason avatar Mar 28 '25 03:03 JayThomason