pykaldi

Install kaldi script fails with missing folder

Open chrisspen opened this issue 2 years ago • 9 comments

The ./install_kaldi.sh script mentioned in the README fails with the following error:

All done OK.
Configuring KALDI to use MKL.
Checking compiler g++ ...
Checking OpenFst library in /home/chris/pykaldi/tools/kaldi/tools/openfst-1.6.7 ...
Performing OS specific configuration ...
On Linux: Checking for linear algebra header files ...
Configuring MKL library directory: ***configure failed: Could not find the MKL library directory.
Please use the switch --mkl-root and/or --mkl-libdir if you have MKL installed,
or try another math library, e.g. --mathlib=OPENBLAS (Kaldi may be slower). ***

Presumably, I'm missing the undocumented library MKL. Where/how should I install that?

chrisspen, May 28 '22 23:05

I've added an install_mkl.sh script to the tools directory that should install MKL for you. This would be the fastest option for Intel processors. For AMD, OpenBLAS (https://www.openblas.net/) might be better.

I'm a bit surprised that it says "configured to use MKL" even if you don't have it installed - it should figure out by itself what BLAS library you have (MKL, OpenBLAS, ATLAS etc.). Chances are you had none of those installed?
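
To see which of these libraries the system linker already knows about, something like the following sketch could help. It uses Python's ctypes.util.find_library. The library names mkl_rt, openblas and atlas are the usual ones but are assumptions here, and Kaldi's configure script does its own directory search, so a hit or miss in this check does not guarantee the same result from configure.

```python
from ctypes.util import find_library

# Usual shared-library names for each BLAS provider (assumed, not taken
# from Kaldi's configure script).
CANDIDATES = {
    "MKL": "mkl_rt",
    "OpenBLAS": "openblas",
    "ATLAS": "atlas",
}


def detect_blas(finder=find_library):
    """Return {label: resolved library name} for every candidate found.

    `finder` is injectable so the function can be exercised without
    touching the real system.
    """
    found = {}
    for label, lib in CANDIDATES.items():
        resolved = finder(lib)
        if resolved:
            found[label] = resolved
    return found
```

Calling detect_blas() with no arguments queries the real system. An empty result would be consistent with none of the three being installed, although an MKL that lives under a vendor directory outside the linker's search path may not show up either.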

bmilde, May 29 '22 11:05

Yeah, I found that the underlying issue is this package requires Intel's MKL library. I have installed OpenBLAS and Atlas in the past, but if those were still installed, why would it error out?

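I also found this blog (https://deepakbaby.in/post/kaldi-mkl/), which led me to this Intel page on installing their MKL Ubuntu repo (https://www.intel.com/content/www/us/en/develop/documentation/installation-guide-for-intel-oneapi-toolkits-linux/top/installation/install-using-package-managers/apt.html).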

That allowed me to easily install MKL with:

wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update
sudo apt install intel-basekit

After that, the ./install_kaldi.sh finally completed without error.

However, pykaldi is still broken, and running:

 python -c "from kaldi.asr import NnetLatticeFasterRecognizer"

in my Python 3.9 virtualenv gives me the error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/project/.env/lib/python3.9/site-packages/kaldi/__init__.py", line 14, in <module>
    from . import base
  File "/project/.env/lib/python3.9/site-packages/kaldi/base/__init__.py", line 1, in <module>
    from ._kaldi_error import *
ImportError: /project/.env/lib/python3.9/site-packages/kaldi/base/_kaldi_error.so: undefined symbol: _ZN5kaldi25g_abort_on_assert_failureE
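
As a side note, the symbol in the ImportError is a mangled C++ name. Running it through c++filt, or through a small decoder like the sketch below, shows what is missing. The decoder is only an illustration: it handles plain nested names and nothing with templates, arguments or substitutions.

```python
def demangle_nested_name(symbol):
    """Decode a simple Itanium-ABI nested name such as _ZN3foo3barE.

    Each component is stored as <decimal length><identifier>, and the
    list of components is closed by the letter E.
    """
    prefix = "_ZN"
    if not symbol.startswith(prefix):
        raise ValueError("not a nested mangled name: " + symbol)
    pos = len(prefix)
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        start = pos
        # Read the decimal length prefix of the next component.
        while pos < len(symbol) and symbol[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError("unsupported mangling at offset %d" % pos)
        length = int(symbol[start:pos])
        parts.append(symbol[pos:pos + length])
        pos += length
    return "::".join(parts)


print(demangle_nested_name("_ZN5kaldi25g_abort_on_assert_failureE"))
# prints: kaldi::g_abort_on_assert_failure
```

So the missing symbol is a variable in the kaldi namespace, which suggests the extension module itself loaded but the Kaldi shared library that should define that variable was not found, or did not match.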

chrisspen, May 29 '22 12:05

Have you sourced path.sh as described in the README?

source path.sh

Either Kaldi has to be installed in the current working directory where you run Python, or you need to change the paths in path.sh.

Pykaldi is probably not finding your Kaldi installation.


bmilde, May 29 '22 13:05

If I do that, it just gives me a different error:

(.env) chris@localhost:~/project/tools$ source ./path.sh 
(.env) chris@localhost:~/project/tools$ python -c "from kaldi.asr import NnetLatticeFasterRecognizer"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/chris/project/.env/lib/python3.9/site-packages/kaldi/asr.py", line 14, in <module>
    from . import decoder as _dec
  File "/home/chris/project/.env/lib/python3.9/site-packages/kaldi/decoder/__init__.py", line 1, in <module>
    from ._grammar_fst import *
ImportError: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /home/chris/project/.env/lib/python3.9/site-packages/kaldi/decoder/../fstext/_float_weight.so)

chrisspen, May 29 '22 16:05

What OS and version are you using? I guess your glibc is too old. It should work on Ubuntu 20.04, which has glibc 2.31, so your distro must be older than that.
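
A quick way to check this from Python before importing pykaldi is sketched below. It uses platform.libc_ver() from the standard library. The required version 2.29 is simply the one named in the error message above, and the helper names are made up for this example.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.29' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(required, found=None):
    """Return True if the glibc version `found` is at least `required`.

    When `found` is omitted, the version is read from the running
    interpreter; a non-glibc or undetectable libc counts as not satisfying.
    """
    if found is None:
        lib, found = platform.libc_ver()
        if lib != "glibc" or not found:
            return False
    return parse_version(found) >= parse_version(required)


if not glibc_satisfies("2.29"):
    print("glibc looks older than 2.29; the prebuilt wheel may not load")
```

Comparing tuples of integers rather than strings matters here, because "2.9" sorts after "2.29" as text but before it as a version.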

bmilde, May 29 '22 16:05

I'm running Ubuntu 18.04 and Python3.9.

chrisspen, May 29 '22 17:05

I have GLIBC 2.27 installed. Is there any way to get it to work with that?

chrisspen, May 29 '22 17:05

You will need to compile your own pykaldi version then and optionally make your own whl. Or upgrade to 20.04.


bmilde, May 29 '22 18:05

Ok, I upgraded to Ubuntu 20.04, and that fixed the glibc error.

Seems to be working now. The only caveat I'm seeing is it only seems to work from inside the pykaldi/tools directory. If I source the path.sh file from outside the directory or source it inside the directory but then run python from somewhere else, any attempt to access Kaldi fails with:

File "/project/test.py", line 18, in <module>
    from kaldi.asr import NnetLatticeFasterRecognizer
File "/project/.env/lib/python3.9/site-packages/kaldi/__init__.py", line 14, in <module>
    from . import base
File "/project/.env/lib/python3.9/site-packages/kaldi/base/__init__.py", line 1, in <module>
    from ._kaldi_error import *
ImportError: /project/.env/lib/python3.9/site-packages/kaldi/base/_kaldi_error.so: undefined symbol: _ZN5kaldi25g_abort_on_assert_failureE

Is there any workaround for this? Not a huge deal, but it would be nice not to have to cd into that directory before running my application.
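
One possible explanation, which nobody in this thread has confirmed: if path.sh builds its library paths from the directory it is sourced in, or uses relative paths, the entries stop resolving once Python runs somewhere else. The sketch below assumes the relevant variable is LD_LIBRARY_PATH (an assumption; check your own path.sh) and reports entries that are relative or do not exist. The function name is made up for this example.

```python
import os


def audit_library_path(value):
    """Classify each entry of an LD_LIBRARY_PATH-style string.

    Returns a list of (entry, status) pairs where status is 'ok',
    'relative' or 'missing'. Empty entries are skipped here.
    """
    report = []
    for entry in value.split(os.pathsep):
        if not entry:
            continue
        if not os.path.isabs(entry):
            status = "relative"   # resolves differently in each directory
        elif not os.path.isdir(entry):
            status = "missing"    # absolute, but nothing is there
        else:
            status = "ok"
        report.append((entry, status))
    return report


for entry, status in audit_library_path(os.environ.get("LD_LIBRARY_PATH", "")):
    print(status, entry)
```

If any entry comes back as relative or missing after sourcing path.sh from another directory, replacing it with an absolute path to the Kaldi install would be the first thing to try.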

My total install script to get a working Kaldi environment turned out to be:

virtualenv -p python3.9 .env
. .env/bin/activate
pip install numpy
wget http://ltdata1.informatik.uni-hamburg.de/pykaldi/pykaldi-0.2.2-cp39-cp39-linux_x86_64.whl
pip install pykaldi-0.2.2-cp39-cp39-linux_x86_64.whl
git clone https://github.com/pykaldi/pykaldi.git
cd pykaldi/tools/
pip install --upgrade pip
pip install --upgrade setuptools
pip install pyparsing
pip install ninja
./check_dependencies.sh
./install_protobuf.sh
./install_clif.sh
./install_mkl.sh
./install_kaldi.sh
. path.sh
python -c "from kaldi.asr import NnetLatticeFasterRecognizer"

Am I missing a step?
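
One thing worth noting about the wheel in the script above: its filename encodes the interpreter it was built for (cp39), but the plain linux_x86_64 platform tag says nothing about the glibc version, which is why pip accepted it on Ubuntu 18.04 even though it could not load there. A rough sketch of reading those tags follows. It uses the standard wheel naming convention, and the helper names are invented for this example.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version and tag fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = filename[:-len(".whl")].split("-")
    # name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def matches_interpreter(filename, version_info=sys.version_info):
    """Check the CPython tag only; other implementations are not handled."""
    wanted = "cp%d%d" % (version_info[0], version_info[1])
    return wanted in parse_wheel_filename(filename)["python"].split(".")


print(parse_wheel_filename("pykaldi-0.2.2-cp39-cp39-linux_x86_64.whl"))
```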

chrisspen, May 30 '22 23:05

Just an FYI, you don't need:

./check_dependencies.sh
./install_protobuf.sh
./install_clif.sh

if you are installing from a whl package.

bmilde, Sep 14 '23 21:09