import order matters when running as `marimo edit my-notebook.py` vs `python my-notebook.py` on the same environment

Open colobas opened this issue 1 year ago • 9 comments

Describe the bug

First of all, this is an amazing framework and I'm actively working on porting my notebooks to it. Congrats and thank you for this incredible contribution to the community.

In porting one of my notebooks, I kept running into an error importing faiss which didn't happen on Jupyter. I was able to strip it down to a very minimal example which currently fails when I run it as `marimo edit ...` but runs fine when I run `python ...`.

Basically if I import pandas before faiss I get a missing library error:

This cell raised an exception: ImportError('/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /opt/conda/lib/python3.12/site-packages/faiss/_swigfaiss.so)')

I was able to see that when I import faiss after pandas, I reach this point in faiss's init logic, whereas if I run it with `python ...` OR if I import pandas after faiss, this try-except block is successful, which prevents me from reaching the previously linked spot.

If I turn logging on, I can tell that by the time I hit the error, the two try-except blocks in the code linked above have also failed with the same missing-library error:

INFO:faiss.loader:Loading faiss with AVX512 support.
INFO:faiss.loader:Could not load library with AVX512 support due to:
ImportError("/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /opt/conda/lib/python3.12/site-packages/faiss/_swigfaiss_avx512.so)")
INFO:faiss.loader:Loading faiss with AVX2 support.
INFO:faiss.loader:Could not load library with AVX2 support due to:
ImportError("/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /opt/conda/lib/python3.12/site-packages/faiss/_swigfaiss_avx2.so)")
INFO:faiss.loader:Loading faiss.

It's really strange to me that importing pandas before faiss ultimately is what triggers this, but that it doesn't happen outside of marimo edit ... in the exact same environment...

Any guidance is highly appreciated

Environment

{
  "marimo": "0.8.18",
  "OS": "Linux",
  "OS Version": "5.14.0-427.33.1.el9_4.x86_64",
  "Processor": "x86_64",
  "Python Version": "3.12.6",
  "Binaries": {
    "Browser": "--",
    "Node": "--"
  },
  "Dependencies": {
    "click": "8.1.7",
    "importlib-resources": "6.4.5",
    "jedi": "0.19.1",
    "markdown": "3.7",
    "pygments": "2.18.0",
    "pymdown-extensions": "10.9",
    "ruff": "0.6.6",
    "starlette": "0.38.5",
    "tomlkit": "0.13.2",
    "typing-extensions": "4.12.2",
    "uvicorn": "0.30.6",
    "websockets": "12.0"
  },
  "Optional Dependencies": {
    "pandas": "2.2.3",
    "pyarrow": "17.0.0"
  }
}

Code to reproduce

import logging
logging.basicConfig(level="DEBUG")

import pandas as pd
import faiss

colobas avatar Sep 24 '24 00:09 colobas

@colobas thanks for reporting and the detailed issue. this is definitely an odd bug. we will look into this next

mscolnick avatar Sep 24 '24 00:09 mscolnick

Additional context (versions):

$ micromamba list | grep faiss
  faiss-gpu           1.8.0      py3.12_hedc54c9_0_cuda11.4.4  pytorch    
  libfaiss            1.8.0      h5aaf3ed_0_cuda11.4.4         pytorch  
$ uv pip show pandas
Name: pandas
Version: 2.2.3
Location: /opt/conda/lib/python3.12/site-packages
Requires: numpy, python-dateutil, pytz, tzdata
Required-by: anndata, cell-gears, datasets, geoparse, mlflow, sangha, scanpy, scgpt, scib, seaborn, statsmodels

colobas avatar Sep 24 '24 01:09 colobas

@colobas , thanks for reporting. I couldn't reproduce this on macOS, but there may be platform-specific issues. This weekend I can try on linux with faiss-gpu.

In the meantime, in case you haven't already, can you try this in a brand new environment?

I think this StackOverflow thread could have the answer in there somewhere: https://stackoverflow.com/questions/77939924/importing-pandas-and-cplex-in-a-conda-environment-raises-an-importerror-libstdc

Also in the meantime, I wonder if you can get around this by setting LD_LIBRARY_PATH as suggested in the linked-to post

akshayka avatar Sep 24 '24 02:09 akshayka

@akshayka this was a fresh env inside a docker (apptainer actually) container.

And you're right that StackOverflow thread definitely seems to be getting at the issue. And indeed setting LD_LIBRARY_PATH resolves the issue. So it seems that the different behavior of python ... and marimo edit ... boils down to how each of them is influencing RPATH?

colobas avatar Sep 24 '24 14:09 colobas
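
Since the error says `GLIBCXX_3.4.29` is missing from `/lib/x86_64-linux-gnu/libstdc++.so.6`, one way to see why putting `/opt/conda/lib` on `LD_LIBRARY_PATH` helps is to list the GLIBCXX version tags each libstdc++ copy contains. The following is a minimal sketch, roughly equivalent to `strings <lib> | grep GLIBCXX`. The helper names are hypothetical, and the path `/opt/conda/lib/libstdc++.so.6` is an assumption based on the conda prefix in the error message; it was not confirmed in this thread.

```python
import re
from pathlib import Path

# Version tags are stored as plain ASCII strings such as b"GLIBCXX_3.4.29".
GLIBCXX_TAG = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(blob: bytes) -> list[str]:
    """Return the de-duplicated GLIBCXX versions found in raw library bytes,
    sorted numerically (so 3.4.9 comes before 3.4.29)."""
    found = {m.group(1).decode("ascii") for m in GLIBCXX_TAG.finditer(blob)}
    return sorted(found, key=lambda v: tuple(int(p) for p in v.split(".")))


def has_version(path: str, wanted: str) -> bool:
    """True if the library file at `path` contains the tag GLIBCXX_<wanted>."""
    return wanted in glibcxx_versions(Path(path).read_bytes())


if __name__ == "__main__":
    candidates = (
        "/lib/x86_64-linux-gnu/libstdc++.so.6",  # path from the error message
        "/opt/conda/lib/libstdc++.so.6",         # assumed conda copy
    )
    for candidate in candidates:
        if Path(candidate).exists():
            print(candidate, has_version(candidate, "3.4.29"))
```

If the system copy reports `False` and the conda copy reports `True`, that would be consistent with the import succeeding only when the conda copy is the one the loader picks up.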

It's peculiar, I'm not sure. marimo edit runs the notebook in a separate process. If you try marimo run without setting the environment variable, do you get the same issue?

We don't have any logic related to PATH/RPATH/environment variables. And marimo edit itself should be the same as python -m marimo edit.

akshayka avatar Sep 24 '24 14:09 akshayka

Just tried `marimo run`, `python -m marimo run`, and `python -m marimo edit`, and they all fail, whereas plain `python ...` works.

I added code to print the env vars from inside the script and here's what I get for python ... vs marimo .... They all look the same to me, aside from when I manually set LD_LIBRARY_PATH. Notably, when I run it with python ... the value of LD_LIBRARY_PATH is the same as it is when marimo run ... fails

python test-import.py (runs with no errors)

UV_CACHE_DIR=/tmp
MAMBA_USER_ID=57439
ENV_NAME=base
MAMBA_USER=mambauser
SINGULARITY_NAME=container.sif
SINGULARITY_ENVIRONMENT=/.singularity.d/env/91-environment.sh
PWD=/home/gpires
CONDA_PREFIX=/opt/conda
MAMBA_ROOT_PREFIX=/opt/conda
APPTAINER_ENVIRONMENT=/.singularity.d/env/91-environment.sh
APPTAINER_APPNAME=
HOME=/home/gpires
LANG=C.UTF-8
APPTAINER_COMMAND=exec
SINGULARITY_CONTAINER=/home/gpires/net-seq-gpires/compbio-containers/containers/scGPT/container.sif
UV_LINK_MODE=copy
APPTAINER_CONTAINER=/home/gpires/net-seq-gpires/compbio-containers/containers/scGPT/container.sif
MAMBA_USER_GID=57439
MAMBA_EXE=/bin/micromamba
TERM=screen
SHLVL=1
MAMBA_PKGS_DIR=/tmp/mamba-pkgs-cache
APPTAINER_NAME=container.sif
SINGULARITY_BIND=
APPTAINER_BIND=
LD_LIBRARY_PATH=/.singularity.d/libs
PS1=Apptainer> 
LC_ALL=C.UTF-8
PATH=/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
_=/opt/conda/bin/python

marimo run test-import.py (runs into ImportError described above)

UV_CACHE_DIR=/tmp
MAMBA_USER_ID=57439
ENV_NAME=base
MAMBA_USER=mambauser
SINGULARITY_NAME=container.sif
SINGULARITY_ENVIRONMENT=/.singularity.d/env/91-environment.sh
PWD=/home/gpires
CONDA_PREFIX=/opt/conda
MAMBA_ROOT_PREFIX=/opt/conda
APPTAINER_ENVIRONMENT=/.singularity.d/env/91-environment.sh
APPTAINER_APPNAME=
HOME=/home/gpires
LANG=C.UTF-8
APPTAINER_COMMAND=exec
SINGULARITY_CONTAINER=/home/gpires/net-seq-gpires/compbio-containers/containers/scGPT/container.sif
UV_LINK_MODE=copy
APPTAINER_CONTAINER=/home/gpires/net-seq-gpires/compbio-containers/containers/scGPT/container.sif
MAMBA_USER_GID=57439
MAMBA_EXE=/bin/micromamba
TERM=screen
SHLVL=1
MAMBA_PKGS_DIR=/tmp/mamba-pkgs-cache
APPTAINER_NAME=container.sif
SINGULARITY_BIND=
APPTAINER_BIND=
LD_LIBRARY_PATH=/.singularity.d/libs
PS1=Apptainer> 
LC_ALL=C.UTF-8
PATH=/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
_=/opt/conda/bin/marimo

LD_LIBRARY_PATH=/opt/conda/lib:$LD_LIBRARY_PATH marimo edit test-import.py (runs without errors)

LD_LIBRARY_PATH=/opt/conda/lib:/.singularity.d/libs
UV_CACHE_DIR=/tmp
MAMBA_USER_ID=57439
ENV_NAME=base
MAMBA_USER=mambauser
SINGULARITY_NAME=container.sif
SINGULARITY_ENVIRONMENT=/.singularity.d/env/91-environment.sh
PWD=/home/gpires
CONDA_PREFIX=/opt/conda
MAMBA_ROOT_PREFIX=/opt/conda
APPTAINER_ENVIRONMENT=/.singularity.d/env/91-environment.sh
APPTAINER_APPNAME=
HOME=/home/gpires
LANG=C.UTF-8
APPTAINER_COMMAND=exec
SINGULARITY_CONTAINER=/home/gpires/net-seq-gpires/compbio-containers/containers/scGPT/container.sif
UV_LINK_MODE=copy
APPTAINER_CONTAINER=/home/gpires/net-seq-gpires/compbio-containers/containers/scGPT/container.sif
MAMBA_USER_GID=57439
MAMBA_EXE=/bin/micromamba
TERM=screen
SHLVL=1
MAMBA_PKGS_DIR=/tmp/mamba-pkgs-cache
APPTAINER_NAME=container.sif
SINGULARITY_BIND=
APPTAINER_BIND=
PS1=Apptainer> 
LC_ALL=C.UTF-8
PATH=/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
_=/opt/conda/bin/marimo

colobas avatar Sep 24 '24 15:09 colobas
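
Comparing long environment dumps by eye is error-prone, so a small diff can make the comparison above mechanical. This is a sketch only: the helper names and the idea of writing each dump to a JSON file are assumptions, not something used in the thread.

```python
import json
import os


def dump_env(path: str) -> None:
    """Write the current process environment to `path` as JSON.
    Call this from inside the script under each launcher."""
    with open(path, "w") as fh:
        json.dump(dict(os.environ), fh, indent=2, sort_keys=True)


def diff_env(a: dict, b: dict) -> dict:
    """Return {name: (value_in_a, value_in_b)} for every variable whose
    value differs; a variable missing on one side shows up as None."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

Applied to the first two dumps above, the only differing entry would be `_` (`/opt/conda/bin/python` vs `/opt/conda/bin/marimo`), which matches the observation that `LD_LIBRARY_PATH` is identical in both.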

Yea, they're the same, thanks for investigating. So the issue is not the environment variables.

It's very perplexing that it works on Jupyter but not marimo.

To the extent we do anything special, it's here (patching the main module): https://github.com/marimo-team/marimo/blob/main/marimo/_runtime/patches.py#L119-L169

I wonder if at some point it would be appropriate to file an issue with conda, at least to get help

akshayka avatar Sep 24 '24 17:09 akshayka
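
With the environment variables ruled out, another thing that could be compared between the two launchers is which libstdc++ file the dynamic loader has actually mapped into the process. A minimal, Linux-only sketch, assuming `/proc/self/maps` is readable inside the container; the helper name `mapped_libs` is hypothetical and not part of marimo or faiss.

```python
from pathlib import Path


def mapped_libs(maps_text: str, needle: str = "libstdc++") -> list[str]:
    """Return the sorted, unique file paths in /proc/<pid>/maps text
    whose pathname contains `needle`."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and needle in fields[5]:
            paths.add(fields[5].strip())
    return sorted(paths)


if __name__ == "__main__":
    maps = Path("/proc/self/maps")
    if maps.exists():
        # Run this after `import pandas` (and before `import faiss`)
        # to see which copy is already loaded at that point.
        print(mapped_libs(maps.read_text()))
```

Running this in a notebook cell right after `import pandas`, once under `python ...` and once under `marimo edit ...`, would show whether the two processes end up with different libstdc++ copies loaded before faiss is imported.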

could you try setting LD_DEBUG=all before the invocations and see if there is a difference in output? Note that it produces a ton of output.

Also, I could not reproduce the error, but I am new to conda (I used micromamba), so could you spell out the exact steps to reproduce it?

I've tried:

❯ micromamba --version
1.5.8
micromamba create -n 2395env
micromamba activate 2395env
micromamba install pandas marimo pillow faiss-cpu libfaiss pyparsing pytorch -c conda-forge
cat > breaks.py <<EOF
import marimo

__generated_with = "0.8.19"
app = marimo.App(width="medium")


@app.cell
def __():
    import logging
    logging.basicConfig(level="DEBUG")

    import pandas as pd
    import faiss
    return faiss, logging, pd


@app.cell
def __():
    return


if __name__ == "__main__":
    app.run()
EOF
python breaks.py

output

DEBUG:faiss.loader:Environment variable FAISS_OPT_LEVEL is not set, so let's pick the instruction set according to the current CPU
INFO:faiss.loader:Loading faiss with AVX2 support.
INFO:faiss.loader:Could not load library with AVX2 support due to:
ModuleNotFoundError("No module named 'faiss.swigfaiss_avx2'")
INFO:faiss.loader:Loading faiss.
INFO:faiss.loader:Successfully loaded faiss.

I see that I don't have the avx2 shared library installed, so I guess there is a different package for it?

running with LD_DEBUG=all produces a million lines

❯ LD_DEBUG=all python breaks.py 2>&1 | wc -l
1077613

alefminus avatar Sep 26 '24 05:09 alefminus
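
A million lines of `LD_DEBUG=all` output is hard to compare directly, so it may help to keep only the lines that mention libstdc++. A rough sketch, assuming the loader output was first saved to a file, for example with `LD_DEBUG=all python breaks.py 2> ld_debug.log` (the file name, script name, and helper name are hypothetical). Setting `LD_DEBUG=libs` instead of `all` should also cut the volume considerably, since it limits the output to library search paths.

```python
import sys
from typing import Iterable, Iterator


def lines_mentioning(lines: Iterable[str], needle: str = "libstdc++") -> Iterator[str]:
    """Yield every line that contains `needle`, without its trailing newline."""
    for line in lines:
        if needle in line:
            yield line.rstrip("\n")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python filter_ld_debug.py ld_debug.log
    with open(sys.argv[1], errors="replace") as fh:
        for hit in lines_mentioning(fh):
            print(hit)
```

Filtering one log from `python ...` and one from `marimo edit ...` this way and then diffing the two results should make any difference in which libstdc++ path gets tried first easier to spot.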

@alefminus everything looks correct to me. the main difference is I'm using faiss-gpu=1.8.0 and I think the cpu version isn't built with avx2 support by default (see e.g. this)

I'll provide the output with LD_DEBUG=all shortly

colobas avatar Sep 26 '24 14:09 colobas