Library search path related issue in latest miniconda env

Open bpkroth opened this issue 11 months ago • 4 comments

As described in https://github.com/microsoft/MLOS/pull/699, there appears to be a problem with the latest miniconda devcontainer image. Somewhere in the stack, additional environments added to it end up searching the base Debian image's /usr/lib path for libc.so instead of the libraries installed with conda (e.g., /opt/conda/lib or /opt/conda/envs/$env_name/lib).

Rolling back to 0.203.6-3 fixes the issue as does explicitly setting LD_LIBRARY_PATH=/opt/conda/lib.
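To check where a process actually picked up its shared libraries, one option on Linux is to read /proc/self/maps from inside the affected interpreter. The following is a minimal sketch, not part of the original report: the function names are made up for illustration, and the /opt/conda prefix is simply the path mentioned above.

```python
from pathlib import Path


def mapped_libraries(maps_text: str) -> set[str]:
    """Return the shared-object paths named in /proc/<pid>/maps content."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and ".so" in fields[5]:
            paths.add(fields[5].strip())
    return paths


def outside_prefix(paths: set[str], prefix: str) -> list[str]:
    """List the libraries that were NOT loaded from under `prefix`."""
    root = prefix.rstrip("/") + "/"
    return sorted(p for p in paths if not p.startswith(root))


def report_current_process(prefix: str = "/opt/conda") -> list[str]:
    """Linux only: inspect the libraries mapped into this interpreter."""
    maps_text = Path("/proc/self/maps").read_text()
    return outside_prefix(mapped_libraries(maps_text), prefix)
```

Calling report_current_process() after importing the packages in question lists every mapped library that lives outside the conda prefix, which should make a stray /usr/lib entry easy to spot.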

bpkroth avatar Mar 06 '24 18:03 bpkroth

Thanks for reporting! @gauravsaini04 can you prioritize investigating the issue? thanks!

samruddhikhandale avatar Mar 06 '24 18:03 samruddhikhandale

FYI, I made a smaller reproduction here: https://github.com/bpkroth/miniconda-devcontainer-test

I think it might actually have something to do with pandas being installed from conda and pyarrow being installed from pip.

In the original example this was unintentional, but here I did it explicitly.

Anyways, still unclear why the base devcontainer change should affect that.

bpkroth avatar Mar 07 '24 19:03 bpkroth

Thanks for reporting! @gauravsaini04 can you prioritize investigating the issue? thanks!

sure @samruddhikhandale, searching for cause of the issue

gauravsaini04 avatar Mar 08 '24 10:03 gauravsaini04

Thanks @bpkroth for providing us with more information. The repro repo is pretty helpful: the conda env update -n base -v -f environment.yml command definitely fails with the latest image.

I think it might actually have something to do with pandas being installed from conda and pyarrow being installed from pip.

The only change that went into the latest version of the miniconda image is https://github.com/devcontainers/images/pull/976, which did something similar to ^: the cryptography installation was moved from conda to pip. I am not confident that's the root cause. However, @gauravsaini04, I can think of a few ways to debug the root cause:

  1. Can you use https://github.com/bpkroth/miniconda-devcontainer-test for repro, and instead of using the base image, can you build the miniconda image with reverted https://github.com/devcontainers/images/pull/976 changes?
  2. Can you check if there was a newer release of continuumio/miniconda3 which could have caused this?

samruddhikhandale avatar Mar 27 '24 01:03 samruddhikhandale

@prathameshzarkar9 has been working on this issue. He has been testing it with the following scenarios, as you suggested:

Scenario #1 : Test with the following environment.yml in the test project by @bpkroth,

channels:
  - defaults
dependencies:
  - python
  - pandas
  - pytest
  - pip
  - pyarrow
- Sub-scenario #1 : Test with the old miniconda upstream image i.e. in 0.203.6-3
- Sub-scenario #2 : Test with the latest miniconda upstream image i.e. 0.203.7

Scenario #2 : Test with the following environment.yml,

channels:
  - defaults
dependencies:
 - pip:
   - python
   - pandas
   - pytest
   - pip
   - pyarrow
- Sub-scenario #1 : Testing with the old miniconda image 
- Sub-scenario #2 : Testing with the new miniconda image

Scenario #3 : Testing the default environment.yml i.e.

channels:
  - defaults
dependencies:
  - python
  - pandas
  - pytest
  - pip
  - pip:
    - pyarrow
- Sub-scenario #1 : Testing with old miniconda image
- Sub-scenario #2 : Testing with new miniconda image

If all these scenarios fail with a similar error message, that points to the upstream miniconda image released just after 0.203.6-3 (i.e. the one present in our 0.203.7 image) as the source of the issue.

gauravsaini04 avatar May 22 '24 04:05 gauravsaini04

Hi @gauravsaini04 @samruddhikhandale ,

I have tested all the scenarios mentioned above.


Scenario 1 - sub scenario 1:

using mcr.microsoft.com/devcontainers/miniconda:0.203.6-3 to build miniconda test project with following environment.yml

channels:
  - defaults
dependencies:
  - python
  - pandas
  - pytest
  - pip
  - pyarrow

Set the variable: LD_LIBRARY_PATH=/opt/conda/lib

Executed: test.sh

Output:
conda run -n base pytest test_exec_example.py
============================= test session starts ==============================
platform linux -- Python 3.11.9, pytest-7.4.0, pluggy-1.0.0
rootdir: /tmp/conda-tmp
collected 1 item

test_exec_example.py . [100%]

============================== 1 passed in 0.38s ===============================

Scenario 1 - sub scenario 2:

using mcr.microsoft.com/devcontainers/miniconda:0.203.7-3 to build miniconda test project with following environment.yml

channels:
  - defaults
dependencies:
  - python
  - pandas
  - pytest
  - pip
  - pyarrow

Set the variable: LD_LIBRARY_PATH=/opt/conda/lib

Executed: test.sh

Output:
conda run -n base pytest test_exec_example.py
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-7.4.0, pluggy-1.0.0
rootdir: /tmp/conda-tmp
collected 1 item

test_exec_example.py . [100%]

============================== 1 passed in 0.39s ===============================

Scenario 2 - sub scenario 1:

using mcr.microsoft.com/devcontainers/miniconda:0.203.6-3 to build miniconda test project with following environment.yml

channels:
  - defaults
dependencies:
 - pip:
   - python
   - pandas
   - pytest
   - pip
   - pyarrow

Set the variable: LD_LIBRARY_PATH=/opt/conda/lib

Executed: test.sh

Output:
conda env update -n base -v -f environment.yml
Warning: you have pip-installed dependencies in your environment file, but you do not list pip itself as one of your conda dependencies. Conda may not use the correct pip to install your packages, and they may end up in the wrong place. Please add an explicit pip dependency. I'm adding one for you, but still nagging you.
Installing pip dependencies: ...working... Ran pip subprocess with arguments:
['/opt/conda/bin/python', '-m', 'pip', 'install', '-U', '-r', '/tmp/conda-tmp/condaenv.ht5xghx7.requirements.txt', '--exists-action=b']
Pip subprocess output:

Pip subprocess error:
ERROR: Could not find a version that satisfies the requirement python (from versions: none)
ERROR: No matching distribution found for python

failed

CondaEnvException: Pip failed

Scenario 2 - sub scenario 2:

using mcr.microsoft.com/devcontainers/miniconda:0.203.7-3 to build miniconda test project with following environment.yml

channels:
  - defaults
dependencies:
 - pip:
   - python
   - pandas
   - pytest
   - pip
   - pyarrow

Set the variable: LD_LIBRARY_PATH=/opt/conda/lib

Executed: test.sh

Output:
conda env update -n base -v -f environment.yml
Warning: you have pip-installed dependencies in your environment file, but you do not list pip itself as one of your conda dependencies. Conda may not use the correct pip to install your packages, and they may end up in the wrong place. Please add an explicit pip dependency. I'm adding one for you, but still nagging you.
Installing pip dependencies: ...working... Ran pip subprocess with arguments:
['/opt/conda/bin/python', '-m', 'pip', 'install', '-U', '-r', '/tmp/conda-tmp/condaenv.ht5xghx7.requirements.txt', '--exists-action=b']
Pip subprocess output:

Pip subprocess error:
ERROR: Could not find a version that satisfies the requirement python (from versions: none)
ERROR: No matching distribution found for python

failed

CondaEnvException: Pip failed

Scenario 3 - sub scenario 1:

using mcr.microsoft.com/devcontainers/miniconda:0.203.6-3 to build miniconda test project with following environment.yml

channels:
  - defaults
dependencies:
  - python
  - pandas
  - pytest
  - pip
  - pip:
    - pyarrow

Set the variable: LD_LIBRARY_PATH=/opt/conda/lib

Executed: test.sh

Output: done

To activate this environment, use conda activate base

To deactivate an active environment, use conda deactivate

conda run -n base pytest test_exec_example.py
============================= test session starts ==============================
platform linux -- Python 3.11.9, pytest-7.4.0, pluggy-1.0.0
rootdir: /tmp/conda-tmp
collected 1 item

test_exec_example.py F [100%]

=================================== FAILURES ===================================
______________________________ test_exec_example _______________________________

def test_exec_example():
    env = {}
    env["PATH"] = os.environ["PATH"]
    env["foo"] = "bar"
    result = run(["./example.py"],
        capture_output=True,
        text=True,
        check=False,
        env=env,
        cwd=os.path.realpath(os.path.dirname(__file__))
    )
    result.check_returncode()

test_exec_example.py:15:


self = CompletedProcess(args=['./example.py'], returncode=1, stdout='', stderr='Traceback (most recent call last):\n File "/...quired by /opt/conda/lib/python3.11/site-packages/pandas/_libs/window/aggregations.cpython-311-x86_64-linux-gnu.so)\n')

def check_returncode(self):
    """Raise CalledProcessError if the exit code is non-zero."""
    if self.returncode:
        raise CalledProcessError(self.returncode, self.args, self.stdout,
                                 self.stderr)

E subprocess.CalledProcessError: Command '['./example.py']' returned non-zero exit status 1.

/opt/conda/lib/python3.11/subprocess.py:502: CalledProcessError
=========================== short test summary info ============================
FAILED test_exec_example.py::test_exec_example - subprocess.CalledProcessErro...
============================== 1 failed in 0.32s ===============================

ERROR conda.cli.main_run:execute(125): conda run pytest test_exec_example.py failed. (See above for error)
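One detail in the test source shown above is worth keeping in mind when reading these results: it builds the child's environment from an empty dict and copies only PATH (plus foo), and subprocess.run uses the env mapping as the complete environment of the new process. A variable exported in the calling shell, LD_LIBRARY_PATH included, is therefore not passed on to ./example.py. Whether that accounts for the failure here is not established by this thread. The sketch below only demonstrates the mechanism; child_value and minimal_env are hypothetical helper names, not part of the repro project.

```python
import os
import subprocess
import sys


def child_value(var: str, env: dict[str, str]) -> str:
    """Run a fresh interpreter with exactly `env` and report what it sees for `var`."""
    code = f"import os; print(os.environ.get({var!r}, '<unset>'))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
        env=env,  # replaces the inherited environment entirely
    )
    return result.stdout.strip()


def minimal_env(forward: tuple[str, ...] = ()) -> dict[str, str]:
    """Mimic the test: start empty, copy PATH, then copy any names in `forward`."""
    env = {"PATH": os.environ.get("PATH", "")}
    for name in forward:
        if name in os.environ:
            env[name] = os.environ[name]
    return env
```

With LD_LIBRARY_PATH set in the parent, child_value("LD_LIBRARY_PATH", minimal_env()) reports it as unset, while minimal_env(forward=("LD_LIBRARY_PATH",)) passes it through.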

Scenario 3 - sub scenario 2:

using mcr.microsoft.com/devcontainers/miniconda:0.203.7-3 to build miniconda test project with following environment.yml

channels:
  - defaults
dependencies:
  - python
  - pandas
  - pytest
  - pip
  - pip:
    - pyarrow

Set the variable: LD_LIBRARY_PATH=/opt/conda/lib

Executed: test.sh

Output: done

To activate this environment, use conda activate base

To deactivate an active environment, use conda deactivate

conda run -n base pytest test_exec_example.py
============================= test session starts ==============================
platform linux -- Python 3.12.3, pytest-7.4.0, pluggy-1.0.0
rootdir: /tmp/conda-tmp
collected 1 item

test_exec_example.py F [100%]

=================================== FAILURES ===================================
______________________________ test_exec_example _______________________________

def test_exec_example():
    env = {}
    env["PATH"] = os.environ["PATH"]
    env["foo"] = "bar"
    result = run(["./example.py"],
        capture_output=True,
        text=True,
        check=False,
        env=env,
        cwd=os.path.realpath(os.path.dirname(__file__))
    )
    result.check_returncode()

test_exec_example.py:15:


self = CompletedProcess(args=['./example.py'], returncode=1, stdout='', stderr='Traceback (most recent call last):\n File "/...quired by /opt/conda/lib/python3.12/site-packages/pandas/_libs/window/aggregations.cpython-312-x86_64-linux-gnu.so)\n')

def check_returncode(self):
    """Raise CalledProcessError if the exit code is non-zero."""
    if self.returncode:
        raise CalledProcessError(self.returncode, self.args, self.stdout,
                                 self.stderr)

E subprocess.CalledProcessError: Command '['./example.py']' returned non-zero exit status 1.

/opt/conda/lib/python3.12/subprocess.py:502: CalledProcessError
=========================== short test summary info ============================
FAILED test_exec_example.py::test_exec_example - subprocess.CalledProcessErro...
============================== 1 failed in 0.34s ===============================

ERROR conda.cli.main_run:execute(125): conda run pytest test_exec_example.py failed. (See above for error)


Hence, from the above tests, we can conclude two things:

  1. We need to set the environment variable LD_LIBRARY_PATH=/opt/conda/lib in the miniconda project Dockerfile.
  2. Anyone using this image has to make sure that their environment.yml lists all packages as conda dependencies and none under the pip section.
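The second point can be checked mechanically. As a rough sketch, assuming the dependencies list from environment.yml has already been parsed into Python objects (the YAML parsing step is left out, and pip_dependencies is a hypothetical helper name):

```python
def pip_dependencies(dependencies: list) -> list[str]:
    """Return every package listed under a nested `pip:` entry."""
    found: list[str] = []
    for item in dependencies:
        # Conda packages are plain strings; the pip section is a one-key mapping.
        if isinstance(item, dict):
            found.extend(item.get("pip", []))
    return found


# The dependency lists from Scenario 1 and Scenario 3 above.
all_conda = ["python", "pandas", "pytest", "pip", "pyarrow"]
mixed = ["python", "pandas", "pytest", "pip", {"pip": ["pyarrow"]}]
```

An empty result for a given environment.yml means nothing is installed through pip; for the Scenario 3 file the helper returns the pyarrow entry.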

Please find the PR created for these changes: PR #1075

prathameshzarkar9 avatar May 22 '24 11:05 prathameshzarkar9

Hi @samruddhikhandale ,

It is important to note that setting LD_LIBRARY_PATH can have unintended consequences, as it can cause the dynamic linker to find and load a different version of a library than the one the program was built with. This can lead to unexpected behavior or crashes. Therefore, it is generally recommended to avoid using LD_LIBRARY_PATH if possible, and to instead install any required libraries in the standard system directories. Additionally, it is generally better to set LD_LIBRARY_PATH on a per-user or per-application basis, rather than modifying the system-wide /etc/environment file. This can be done by adding the export line to the user's shell initialization file, such as ~/.bashrc or ~/.bash_profile.

Best practices for using LD_LIBRARY_PATH effectively and safely:

  • Minimize Scope: Set LD_LIBRARY_PATH only in the environment of the specific application that needs it, not globally.
  • Wrapper Scripts: Use wrapper scripts to set LD_LIBRARY_PATH for applications, ensuring other parts of the system remain unaffected.
  • Temporary Use: Use it temporarily during development, testing, or troubleshooting, and remove it for production deployments.
  • System Configuration: For long-term solutions, prefer modifying system-wide library configurations using ld.so.conf and running ldconfig.
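The "Minimize Scope" and "Wrapper Scripts" points could look roughly like the sketch below, which sets the variable only for one child process and leaves the caller's environment untouched. The helper names are illustrative assumptions, and /opt/conda/lib is the path used in the tests above.

```python
import os
import subprocess
from typing import Optional


def env_with_library_path(lib_dir: str, base: Optional[dict] = None) -> dict:
    """Return a copy of `base` (default: os.environ) with `lib_dir` prepended."""
    env = dict(os.environ if base is None else base)
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = f"{lib_dir}:{existing}" if existing else lib_dir
    return env


def run_scoped(argv: list, lib_dir: str = "/opt/conda/lib") -> subprocess.CompletedProcess:
    """Run one command with the adjusted search path; nothing else is affected."""
    return subprocess.run(argv, env=env_with_library_path(lib_dir), check=False)
```

For example, run_scoped(["pytest", "test_exec_example.py"]) applies the setting to that single invocation instead of to the whole image.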

We should not set this variable globally, as that is not recommended. It should be handled according to each user's requirements, in their own user context.

prathameshzarkar9 avatar May 24 '24 17:05 prathameshzarkar9

Thank you @prathameshzarkar9 for the detailed summary and investigation. Great findings 👏

Let’s close https://github.com/devcontainers/images/pull/1075

Closing this issue as we won’t be able to fix this on our end, and directly setting LD_LIBRARY_PATH in the image is risky as stated in https://github.com/devcontainers/images/issues/989#issuecomment-2130005143.

whoever is using this image, they have to make sure that the environment.yml file should contain all the packages under conda channel itself and not under the pip channel

@bpkroth Can you continue using LD_LIBRARY_PATH in your dev config or ensure you follow ^ to fix this issue?

Let's reopen this issue if needed, thank you!

samruddhikhandale avatar May 24 '24 23:05 samruddhikhandale