Installation problem

Open danpovey opened this issue 5 years ago • 23 comments

Guys, how does one debug issues like this?

>>> import k2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<frozen zipimport>", line 259, in load_module
  File "/ceph-dan/.local/lib/python3.8/site-packages/k2-0.0.1.dev20201110-py3.8.egg/k2/__init__.py", line 1, in <module>
  File "<frozen zipimport>", line 259, in load_module
  File "/ceph-dan/.local/lib/python3.8/site-packages/k2-0.0.1.dev20201110-py3.8.egg/k2/autograd.py", line 8, in <module>
ModuleNotFoundError: No module named '_k2'

I built this locally. In the install script I replaced python3 setup.py bdist_wheel with python3 setup.py install --user --prefix= and I don't know whether the $ORIGIN thing has anything to do with bdist_wheel specifically.

danpovey avatar Nov 10 '20 11:11 danpovey

.. also when I created a directory foo and unzipped the .egg file there, this is what I got:

de-74279-k2-dev-0929153814-7686b86bc5-qbfjq:foo: ls
EGG-INFO  _k2.cpython-38-x86_64-linux-gnu.so      k2                  libcontext.so  libgtest_maind.so  libtest_utils.so
LICENSE   _k2host.cpython-38-x86_64-linux-gnu.so  k2-0.0.1.dev20201110-py3.8.egg  libfsa.so  libgtestd.so
de-74279-k2-dev-0929153814-7686b86bc5-qbfjq:foo: 

.. so it has a _k2 library.

danpovey avatar Nov 10 '20 11:11 danpovey

... but that's being imported inside the k2/ directory, while it's one level up from there. So I guess I don't understand how it knows to look there (?)

danpovey avatar Nov 10 '20 11:11 danpovey

/ceph-dan/.local/lib/python3.8/site-packages should contain the file _k2.cpython-38-x86_64-linux-gnu.so, and I think the Python runtime adds /ceph-dan/.local/lib/python3.8/site-packages to the search path for finding libraries.
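A top-level import such as import _k2 is resolved by walking sys.path, no matter which package issues the import, so one way to check the claim above is to look for a matching extension file in each sys.path directory. The following is a minimal sketch; the helper name find_extension_files is illustrative and not part of k2.

```python
import importlib.machinery
import sys
from pathlib import Path


def find_extension_files(module_name, search_paths=None):
    """Return files that could satisfy `import module_name` as a compiled
    extension, e.g. `_k2.cpython-38-x86_64-linux-gnu.so`."""
    if search_paths is None:
        search_paths = sys.path
    matches = []
    for entry in search_paths:
        directory = Path(entry or ".")  # an empty entry means the current directory
        if not directory.is_dir():
            # Zipped eggs and non-existent entries are skipped by this sketch.
            continue
        for suffix in importlib.machinery.EXTENSION_SUFFIXES:
            candidate = directory / (module_name + suffix)
            if candidate.is_file():
                matches.append(candidate)
    return matches
```

Calling find_extension_files("_k2") in the failing interpreter should return an empty list if no directory on sys.path holds the compiled module.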

csukuangfj avatar Nov 10 '20 11:11 csukuangfj

It doesn't contain anything starting with _k2. I notice setup.py doesn't mention anything about _k2 (?) Shouldn't it?

danpovey avatar Nov 10 '20 11:11 danpovey

python3 setup.py bdist_wheel will copy everything inside build/lib to the final whl file. _k2***.so is inside build/lib.

Can you use scripts/build_pip.sh to generate a wheel file and use pip install?

csukuangfj avatar Nov 10 '20 12:11 csukuangfj

I want to understand why it's not picking it up even though the _k2 lib is inside the .egg file that it installed. I'd like to make it installable by just doing python3 setup.py install. Surely that's supposed to be possible? Perhaps we need to modify the setup.py so it "knows about" the _k2 library?
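One possible explanation: the zipimport documentation states that dynamic modules (.pyd, .so) cannot be imported from a ZIP archive, and the traceback shows the package being loaded through zipimport. The sketch below lists compiled extension files stored inside a zipped .egg; the helper name extensions_inside_zip is made up for illustration.

```python
import importlib.machinery
import zipfile


def extensions_inside_zip(archive_path):
    """List compiled extension files stored inside a zip archive such as a .egg.

    zipimport only handles .py and .pyc files, so anything returned here
    would not be importable while the archive stays zipped.
    """
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    with zipfile.ZipFile(archive_path) as archive:
        return [name for name in archive.namelist() if name.endswith(suffixes)]
```

If this returns a _k2 entry for the installed .egg, installing from a wheel (which is unpacked into site-packages) would avoid the problem, which matches what worked later in this thread.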

danpovey avatar Nov 10 '20 12:11 danpovey

I managed to get it to work by building the wheel and installing that. Perhaps we can revisit this later.

danpovey avatar Nov 10 '20 12:11 danpovey

I ran into this problem too (_k2 not found). I found that pip install had installed the py39 build of k2, while my Python version is 3.7. After I specified the py37 build of k2, it worked.
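A mismatch like this can be spotted before installing by comparing the Python tag in the wheel file name with the running interpreter. This is a rough sketch that assumes a CPython wheel with a single Python tag (such as cp37 or cp310); the function names are illustrative.

```python
import sys


def wheel_python_tag(wheel_filename):
    """Extract the Python tag (e.g. 'cp310') from a wheel file name.

    Wheel names end in -<python tag>-<abi tag>-<platform tag>.whl,
    so the Python tag is the third field from the end.
    """
    if not wheel_filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + wheel_filename)
    fields = wheel_filename[: -len(".whl")].split("-")
    if len(fields) < 5:
        raise ValueError("unexpected wheel file name: " + wheel_filename)
    return fields[-3]


def interpreter_tag(version_info=sys.version_info):
    """Return the CPython tag for a version, e.g. 'cp37' for Python 3.7."""
    return "cp{}{}".format(version_info[0], version_info[1])


def wheel_matches_interpreter(wheel_filename, version_info=sys.version_info):
    """True when the wheel's Python tag equals the interpreter's tag."""
    return wheel_python_tag(wheel_filename) == interpreter_tag(version_info)
```

For example, wheel_matches_interpreter("k2-1.23.4.dev20230224+cuda11.7.torch1.13.1-cp310-cp310-linux_x86_64.whl", (3, 7)) is False, because the wheel is tagged cp310.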

songtaoshi avatar Mar 13 '23 09:03 songtaoshi

I met this too, how do you specify the version of k2?

JunZhan2000 avatar Mar 25 '23 13:03 JunZhan2000

@guokr233 Please visit

  • https://k2-fsa.github.io/k2/installation/pre-compiled-cpu-wheels-linux/index.html

or

  • https://k2-fsa.github.io/k2/installation/pre-compiled-cuda-wheels-linux/index.html

to download pre-compiled wheels.

csukuangfj avatar Mar 25 '23 13:03 csukuangfj

Which one should I use? My Python is 3.10.9, PyTorch is 1.13.1, and CUDA is 11.7. I pip-installed k2-1.23.4.dev20230224+cuda11.7.torch1.13.1-cp310-cp310-linux_x86_64.whl from https://k2-fsa.github.io/k2/installation/pre-compiled-cuda-wheels-linux/1.13.1.html, but now there is another error: ImportError: libpython3.10.so.1.0: cannot open shared object file: No such file or directory

JunZhan2000 avatar Mar 25 '23 13:03 JunZhan2000

How did you install Python?

If you installed Python from source, please pass

--enable-shared

to

./configure

For instance, Python 3.10.9 can be installed from source as follows:

cd /tmp/
wget https://www.python.org/ftp/python/3.10.9/Python-3.10.9.tgz
tar xzf Python-3.10.9.tgz
cd Python-3.10.9

sudo ./configure --prefix=/opt/python/3.10.9/ --enable-optimizations --with-lto --with-computed-gotos --with-system-ffi --enable-shared
sudo make -j "$(nproc)"
sudo make altinstall
sudo rm /tmp/Python-3.10.9.tgz

(The above installation example is from https://www.build-python-from-source.com/ )
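To check whether an existing interpreter was built with --enable-shared, the build-time configuration can be queried through sysconfig. This is a minimal sketch; the function name is illustrative, and these configuration variables are typically only populated on POSIX builds.

```python
import sysconfig


def shared_libpython_info():
    """Report build settings related to the shared libpython library."""
    return {
        # 1 when CPython was configured with --enable-shared, 0 otherwise.
        "enable_shared": sysconfig.get_config_var("Py_ENABLE_SHARED"),
        # File name of the Python library produced by the build.
        "library_name": sysconfig.get_config_var("LDLIBRARY"),
        # Directory where the library was installed.
        "library_dir": sysconfig.get_config_var("LIBDIR"),
    }


print(shared_libpython_info())
```

If enable_shared is 0, a libpython3.10.so.1.0 would not be expected to exist for that interpreter.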

csukuangfj avatar Mar 25 '23 15:03 csukuangfj

If you installed Python with conda, please run

find $CONDA_PREFIX -name "libpython*.so*"

and show the output.
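The same search can be done from Python when find is not convenient. The sketch below mirrors the find command above; find_libpython is an illustrative name.

```python
import os
from pathlib import Path


def find_libpython(prefix):
    """Python equivalent of: find <prefix> -name "libpython*.so*"."""
    return sorted(str(path) for path in Path(prefix).rglob("libpython*.so*"))


# Only search when running inside an activated conda environment.
conda_prefix = os.environ.get("CONDA_PREFIX")
if conda_prefix:
    for library in find_libpython(conda_prefix):
        print(library)
```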

csukuangfj avatar Mar 25 '23 15:03 csukuangfj

I first ran "conda create --name k2 python==3.8", then ran "pip install ./torch-2.0.1+cu117-cp38-cp38-linux_x86_64.whl", where the torch wheel was downloaded from https://download.pytorch.org/whl/torch_stable.html. But when I ran "import torch", the following error occurred:

Traceback (most recent call last):
  File "/ceph/luhongxuan/code/k2/icefall/egs/aishell/ASR/./local/compute_fbank_aishell.py", line 31, in <module>
    import torch
  File "/data_asr02/luhongxuan/anaconda3/envs/k2/lib/python3.9/site-packages/torch/__init__.py", line 218, in <module>
    from torch._C import *  # noqa: F403
ImportError: /data_asr02/luhongxuan/anaconda3/envs/k2/lib/python3.9/site-packages/torch/lib/libshm.so: undefined symbol: _ZNK2at22RefcountedMapAllocator4dataEv

lucy9527 avatar Sep 25 '23 11:09 lucy9527

@lucy9527 Please don't use conda install to install k2.

Please follow https://k2-fsa.github.io/k2/installation/index.html to install k2.

For instance, you can use https://k2-fsa.github.io/k2/installation/from_wheels.html

csukuangfj avatar Sep 25 '23 12:09 csukuangfj

@csukuangfj I didn't use "conda install". I installed torch and k2 as instructed in https://k2-fsa.github.io/k2/installation/from_wheels.html

lucy9527 avatar Sep 25 '23 12:09 lucy9527

Have you installed k2 before? Is there only a single version of k2 in your current virtual environment?
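One way to answer this is to enumerate every installed distribution named k2 that the current interpreter can see. The following is a sketch using importlib.metadata (Python 3.8+); installed_versions is an illustrative helper, not a k2 API.

```python
from importlib import metadata


def installed_versions(package_name):
    """Return (version, location) for every installed distribution
    whose name matches package_name, across all sys.path entries."""
    wanted = package_name.lower()
    results = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name", "") or ""
        if name.lower() == wanted:
            # locate_file("") gives the directory that contains the package.
            results.append((dist.version, str(dist.locate_file(""))))
    return results


for version, location in installed_versions("k2"):
    print(version, location)
```

More than one printed line would suggest that several copies of k2 are visible to the same interpreter.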

csukuangfj avatar Sep 25 '23 12:09 csukuangfj

"conda create --name k2 python==3.8",

You said that you created a virtual environment with Python 3.8.

However, the log says you are using python 3.9.

ImportError: /data_asr02/luhongxuan/anaconda3/envs/k2/lib/python3.9/site-packages/torch/lib/libshm.so: undefined symbol: _ZNK2at22RefcountedMapAllocator4dataEv
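The Python version can be read directly off the site-packages path in the error message. A small sketch of that check, with an illustrative function name:

```python
import re


def python_version_in_path(path):
    """Return (major, minor) parsed from a '.../lib/pythonX.Y/...' path, or None."""
    match = re.search(r"/lib/python(\d+)\.(\d+)/", path)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


path = ("/data_asr02/luhongxuan/anaconda3/envs/k2/lib/python3.9/"
        "site-packages/torch/lib/libshm.so")
print(python_version_in_path(path))  # (3, 9)
```

Comparing this result with sys.version_info[:2] of the intended environment shows whether the library being loaded belongs to a different Python version.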

csukuangfj avatar Sep 25 '23 12:09 csukuangfj

I suggest that you create a new virtual environment and install torch and k2 from scratch.

If there are any issues, please post the screenshots of

  • how you create the virtual environment
  • how you install torch
  • how you install k2
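When screenshots are not an option, a short text report of the environment carries much of the same information. This is a sketch only; environment_report is an illustrative name, and it assumes torch and k2 were installed with pip so that their metadata is visible to importlib.metadata.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("torch", "k2")):
    """Collect the interpreter version, its path, and selected package versions."""
    report = {
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


for key, value in environment_report().items():
    print(key + ":", value)
```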

csukuangfj avatar Sep 25 '23 12:09 csukuangfj

@csukuangfj Sorry, I can't upload the picture. The following steps are based on https://k2-fsa.github.io/k2/installation/from_wheels.html

First, run conda create --name k3 python==3.8

Second, download the pre-compiled torch wheel from https://download.pytorch.org/whl/torch_stable.html, then run pip install ./torch-2.0.1+cu117-cp38-cp38-linux_x86_64.whl

Installing collected packages: mpmath, lit, cmake, typing-extensions, sympy, networkx, MarkupSafe, filelock, jinja2, triton, torch
Successfully installed MarkupSafe-2.1.3 cmake-3.27.5 filelock-3.12.4 jinja2-3.1.2 lit-17.0.1 mpmath-1.3.0 networkx-3.1 sympy-1.12 torch-2.0.1+cu117 triton-2.0.0 typing-extensions-4.8.0

Third, run python from the command line:

Python 3.8.0 (default, Nov  6 2019, 21:49:08) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "****/anaconda3/envs/k3/lib/python3.8/site-packages/torch/__init__.py", line 229, in <module>
    from torch._C import *  # noqa: F403
ImportError: ****/anaconda3/envs/k3/lib/python3.8/site-packages/torch/lib/libc10_cuda.so: undefined symbol: _ZN3c104impl8GPUTrace13gpuTraceStateE

conda list

# packages in environment at ****/anaconda3/envs/k3:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main    defaults
_openmp_mutex             5.1                       1_gnu    defaults
ca-certificates           2023.08.22           h06a4308_0    defaults
cmake                     3.27.5                   pypi_0    pypi
filelock                  3.12.4                   pypi_0    pypi
jinja2                    3.1.2                    pypi_0    pypi
libedit                   3.1.20221030         h5eee18b_0    defaults
libffi                    3.2.1             hf484d3e_1007    defaults
libgcc-ng                 11.2.0               h1234567_1    defaults
libgomp                   11.2.0               h1234567_1    defaults
libstdcxx-ng              11.2.0               h1234567_1    defaults
lit                       17.0.1                   pypi_0    pypi
markupsafe                2.1.3                    pypi_0    pypi
mpmath                    1.3.0                    pypi_0    pypi
ncurses                   6.4                  h6a678d5_0    defaults
networkx                  3.1                      pypi_0    pypi
openssl                   1.1.1w               h7f8727e_0    defaults
pip                       23.2.1           py38h06a4308_0    defaults
python                    3.8.0                h0371630_2    defaults
readline                  7.0                  h7b6447c_5    defaults
setuptools                68.0.0           py38h06a4308_0    defaults
sqlite                    3.33.0               h62c20be_0    defaults
sympy                     1.12                     pypi_0    pypi
tk                        8.6.12               h1ccaba5_0    defaults
torch                     2.0.1+cu117              pypi_0    pypi
triton                    2.0.0                    pypi_0    pypi
typing-extensions         4.8.0                    pypi_0    pypi
wheel                     0.38.4           py38h06a4308_0    defaults
xz                        5.4.2                h5eee18b_0    defaults
zlib                      1.2.13               h5eee18b_0    defaults

lucy9527 avatar Sep 26 '23 02:09 lucy9527

@lucy9527 Thanks for posting the detailed information.

I just looked at the error:

ImportError: ****/anaconda3/envs/k3/lib/python3.8/site-packages/torch/lib/libc10_cuda.so: undefined symbol: _ZN3c104impl8GPUTrace13gpuTraceStateE

Please run

nm /your/path/anaconda3/envs/k3/lib/python3.8/site-packages/torch/lib/libc10.so | grep _ZN3c104impl8GPUTrace13gpuTraceStateE

and post the output.

csukuangfj avatar Sep 26 '23 03:09 csukuangfj

@csukuangfj I can't upload the screenshot. The output is as follows:

U _ZN3c104impl8GPUTrace13gpuTraceStateE
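In nm output, the type letter U marks a symbol that the file references but does not define, so this line says that libc10.so itself does not provide the symbol that libc10_cuda.so is looking for. One possible reading is that the two libraries do not come from the same torch build. The sketch below classifies a line of nm output; the function names are illustrative, and the defined-symbol line in the example is hypothetical.

```python
def parse_nm_line(line):
    """Split one line of nm output into (address, symbol_type, name).

    Undefined symbols have no address, so such lines have two fields
    instead of three.
    """
    fields = line.split()
    if len(fields) == 2:
        return None, fields[0], fields[1]
    if len(fields) == 3:
        return fields[0], fields[1], fields[2]
    raise ValueError("unrecognised nm line: " + repr(line))


def is_defined(line):
    """True when the library itself provides the symbol (type is not 'U')."""
    _, symbol_type, _ = parse_nm_line(line)
    return symbol_type != "U"


# Line taken from the output above.
print(is_defined("U _ZN3c104impl8GPUTrace13gpuTraceStateE"))
# Hypothetical line for a library that does define the symbol.
print(is_defined("0000000000012340 B _ZN3c104impl8GPUTrace13gpuTraceStateE"))
```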

lucy9527 avatar Sep 26 '23 11:09 lucy9527

Could you also show the command you are using?

csukuangfj avatar Sep 26 '23 12:09 csukuangfj