audio
Build failure for v0.10.2 in nvidia/cuda:11.3.1-cudnn8-devel-ubuntu20.04
Repro Instructions:
git clone --recursive https://github.com/pytorch/audio.git
cd audio
git checkout v0.10.2
export USE_CUDA=1
export BUILD_SOX=1
CC=gcc-9 CXX=g++-9 python3.8 setup.py bdist_wheel
python3.8 -m pip install dist/*.whl
python3.8 -c 'import torchaudio'
Issue
The following warnings are displayed on import
/audio/torchaudio/_extension.py:11: UserWarning: torchaudio C++ extension is not available.
warnings.warn('torchaudio C++ extension is not available.')
/audio/torchaudio/backend/utils.py:67: UserWarning: No audio backend is available.
warnings.warn('No audio backend is available.')
Logs
Hi @Lokiiiiii
Can you try running python3.8 -c 'import torchaudio' outside of the cloned directory and see if the warning is gone?
The build log seems to be fine. I do not see any issue/failure.
Since the installation command is pip install dist/*.whl, the resulting binary goes into the install location, something like ../site-packages/torchaudio/... However, when you run python3.8 -c 'import torchaudio' inside the cloned repo, the source directory shadows the installed package, so Python imports the source directory, which does not have the built extension.
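A quick way to see which copy wins is to ask Python where it would load the package from, without actually importing it. This is only a sketch: the helper names locate_package and is_under are made up for illustration, and it assumes the check is launched from the directory you want to test (for example via python -c, or with the script saved there) so that directory is first on sys.path.

```python
import importlib.util
import os


def locate_package(name):
    """Return the file a plain `import name` would load, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return os.path.realpath(spec.origin)


def is_under(path, directory):
    """True if `path` lives inside `directory` (after resolving symlinks)."""
    path = os.path.realpath(path)
    directory = os.path.realpath(directory)
    return os.path.commonpath([path, directory]) == directory


if __name__ == "__main__":
    origin = locate_package("torchaudio")
    if origin is None:
        print("torchaudio is not importable from here")
    elif is_under(origin, os.getcwd()):
        # The source checkout is being picked up instead of site-packages.
        print(f"shadowed by the source tree: {origin}")
    else:
        print(f"installed copy: {origin}")
```

If it reports a path under the cloned audio directory, the warnings come from the source tree and not from the wheel you installed.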
Thanks, however I am still seeing an error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/opt/conda/lib/python3.8/site-packages/torchaudio/__init__.py", line 1, in <module>
from torchaudio import _extension # noqa: F401
File "/opt/conda/lib/python3.8/site-packages/torchaudio/_extension.py", line 27, in <module>
_init_extension()
File "/opt/conda/lib/python3.8/site-packages/torchaudio/_extension.py", line 21, in _init_extension
torch.ops.load_library(path)
File "/opt/conda/lib/python3.8/site-packages/torch/_ops.py", line 110, in load_library
ctypes.CDLL(path)
File "/opt/conda/lib/python3.8/ctypes/__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /opt/conda/lib/python3.8/site-packages/torchaudio/lib/libtorchaudio.so: undefined symbol: _ZNK5torch8autograd4Node4nameB5cxx11Ev
I see a couple of issues.
OSError: /opt/conda/lib/python3.8/site-packages/torchaudio/lib/libtorchaudio.so: undefined symbol: _ZNK5torch8autograd4Node4nameB5cxx11Ev
In general, this error happens when the PyTorch binary and the torchaudio binary do not match. torchaudio needs an extension module (the part written in C++ and compiled) that matches the installed PyTorch.
However, the path where the error occurs, torchaudio/lib/libtorchaudio.so, indicates version 0.11. The version you built and installed is 0.10.2, which is supposed to have torchaudio/_torchaudio.so instead. (The build log you pointed to also confirms this.)
So I suggest uninstalling every torchaudio you have in your env (repeat pip uninstall torchaudio and conda uninstall torchaudio until nothing is left), then making sure you have the version of PyTorch you want to use (and only that one version in the env), and then building torchaudio again.
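To check for a leftover install, it can help to look at which extension files are actually sitting in the installed package directory. The sketch below only knows the two layouts mentioned in this thread (_torchaudio.so for 0.10.x and lib/libtorchaudio.so for 0.11); the function name and labels are hypothetical, and the demo runs on a throwaway directory rather than a real install.

```python
import tempfile
from pathlib import Path

# Layouts named in this thread: 0.10.x ships _torchaudio.so at the package
# root, while 0.11 ships lib/libtorchaudio.so.
KNOWN_LAYOUTS = {
    "_torchaudio.so": "0.10.x-style layout",
    "lib/libtorchaudio.so": "0.11-style layout",
}


def find_extension_files(package_dir):
    """Return {relative_path: label} for each known extension file present."""
    root = Path(package_dir)
    return {
        rel: label
        for rel, label in KNOWN_LAYOUTS.items()
        if (root / rel).is_file()
    }


if __name__ == "__main__":
    # Demo on a temporary directory that mimics a leftover 0.11 install.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "lib").mkdir()
        (Path(tmp) / "lib" / "libtorchaudio.so").touch()
        print(find_extension_files(tmp))
```

Pointing it at the real site-packages/torchaudio directory and seeing both layouts, or the 0.11 one after installing a 0.10.2 wheel, would suggest that an older or newer install was never fully removed.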
I just did a complete build from source on a fresh environment. I built torch, torch_xla, torchvision and finally torchaudio all from source. Still getting the same error:
Environment
root@1c84aa84490d:/# pip list | grep torch
torch 1.10.2
torch-xla 1.10.0
torchaudio 0.10.1+6f539cf
torchvision 0.11.0a0+05eae32
Error
root@1c84aa84490d:/# python -c 'import torch, torch_xla, torchvision, torchaudio'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/opt/conda/lib/python3.8/site-packages/torchaudio/__init__.py", line 1, in <module>
from torchaudio import _extension # noqa: F401
File "/opt/conda/lib/python3.8/site-packages/torchaudio/_extension.py", line 27, in <module>
_init_extension()
File "/opt/conda/lib/python3.8/site-packages/torchaudio/_extension.py", line 21, in _init_extension
torch.ops.load_library(path)
File "/opt/conda/lib/python3.8/site-packages/torch/_ops.py", line 110, in load_library
ctypes.CDLL(path)
File "/opt/conda/lib/python3.8/ctypes/__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /opt/conda/lib/python3.8/site-packages/torchaudio/lib/libtorchaudio.so: undefined symbol: _ZNK5torch8autograd4Node4nameB5cxx11Ev
Hmm, all the cases of this kind of error I have seen so far were basically a PyTorch version mismatch. If your environment has only one version of PyTorch, that does not explain it.
Something interesting about the error message is that the symbol has cxx11Ev at the end. It might be an ABI mismatch.
(There is an interesting report about it: https://github.com/linkinpark213/linkinpark213.github.io/issues/12#issuecomment-456251560)
My next hypothesis is that PyTorch and torchaudio were compiled with different ABI settings (different values of _GLIBCXX_USE_CXX11_ABI, or possibly different compilers), although I would expect torchaudio compilation to fail in that case.
If you run the following command, what do you get? It looks for symbols containing torch8autograd4Node4name in the PyTorch C++ library files.
nm `python -c 'import torch;print("/".join(torch.__file__.split("/")[:-1]))'`/lib/libtorch* | grep torch8autograd4Node4name
root@1c84aa84490d:/# nm `python -c 'import torch;print("/".join(torch.__file__.split("/")[:-1]))'`/lib/libtorch* | grep torch8autograd4Node4name
0000000003fbe140 T _ZNK5torch8autograd4Node4nameEv
U _ZNK5torch8autograd4Node4nameEv
Will do a fresh build of torchaudio explicitly setting _GLIBCXX_USE_CXX11_ABI=0
I am building both torch and torchaudio with gcc-9.
Issue persists after
CFLAGS="${CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0" CXXFLAGS="${CXXFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0" CC="gcc-9" CXX="g++-9" BUILD_SOX=1 python setup.py bdist_wheel
Okay, at least we know that torchaudio is expecting a PyTorch library compiled with the cxx11 ABI, while the installed PyTorch exports the symbol without it, which is why it's failing.
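The difference between the two symbols can be spelled out mechanically. In the Itanium C++ name mangling used by GCC, an [abi:cxx11] tag shows up as B5cxx11 in the mangled name. The sketch below compares the symbol from the OSError with the one from the nm output; the helper names are made up for illustration, and the substring check is deliberately naive (it is not a real demangler).

```python
ABI_TAG = "B5cxx11"  # mangled form of the [abi:cxx11] tag


def uses_cxx11_abi(mangled_symbol):
    """Naive check: does the mangled name carry the cxx11 ABI tag?"""
    return ABI_TAG in mangled_symbol


def same_symbol_different_abi(wanted, exported):
    """True if the two names differ only by the presence of the ABI tag."""
    same_base = wanted.replace(ABI_TAG, "") == exported.replace(ABI_TAG, "")
    return same_base and uses_cxx11_abi(wanted) != uses_cxx11_abi(exported)


# Symbol torchaudio asks for (from the OSError above).
wanted = "_ZNK5torch8autograd4Node4nameB5cxx11Ev"
# Symbol libtorch actually exports (from the nm output above).
exported = "_ZNK5torch8autograd4Node4nameEv"

if __name__ == "__main__":
    print("torchaudio wants cxx11 ABI:", uses_cxx11_abi(wanted))
    print("libtorch exports cxx11 ABI:", uses_cxx11_abi(exported))
    print("ABI-only mismatch:", same_symbol_different_abi(wanted, exported))
```

So the function is the same, and only the ABI tag differs between what torchaudio was linked against and what the installed libtorch provides.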
Can you check the build log and see if the value of _GLIBCXX_USE_CXX11_ABI is reflected?
The thing is that torchaudio's build process should detect the configuration that PyTorch was compiled with.
https://github.com/pytorch/audio/blob/6f539cf3edc4224b51798e962ca28519e5479ffb/CMakeLists.txt#L125-L126
So it should not be necessary to set the flag manually. However, this part is not well tested, so this might be a bug.
Build logs @ https://pastebin.com/d4X6WCrn indicate that _GLIBCXX_USE_CXX11_ABI=0 is reflected.
Hmm, if that's the case, I have no idea what is causing the error.
As a reference, I tried compiling v0.10.2 on nvidia/cuda:11.3.1-cudnn8-devel-ubuntu20.04 and it worked fine.
https://gist.github.com/mthrok/de45e817a1b5f9475bcdac0cee464de6
I can reproduce your working build when using a pre-built torch binary. But the build starts failing when I install torch from source.
The first error I hit is
gcc -DHAVE_CONFIG_H -I. -I/audio/build/temp.linux-x86_64-3.8/third_party/sox/src/lame/frontend -I.. -I/audio/build/temp.linux-x86_64-3.8/third_party/sox/src/lame/libmp3lame -I/audio/build/temp.linux-x86_64-3.8/third_party/sox/src/lame/include -I.. -Wall -pipe -I/audio/third_party/sox/../install/include -fvisibility=hidden -D_GLIBCXX_USE_CXX11_ABI=0 -c /audio/build/temp.linux-x86_64-3.8/third_party/sox/src/lame/frontend/console.c
/bin/bash: /root/anaconda3/envs/prod/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/audio/build/temp.linux-x86_64-3.8/third_party/sox/src/lame/frontend/console.c:25:11: fatal error: curses.h: No such file or directory
25 | # include <curses.h>
| ^~~~~~~~~~
compilation terminated.
make[2]: *** [Makefile:396: console.o] Error 1
make[2]: Leaving directory '/audio/build/temp.linux-x86_64-3.8/third_party/sox/src/lame-build/frontend'
make[1]: *** [Makefile:349: all-recursive] Error 1
make[1]: Leaving directory '/audio/build/temp.linux-x86_64-3.8/third_party/sox/src/lame-build'
I can get past that with apt-get install libncurses-dev.
Currently working on
libtool: link: gcc -Wall -pipe -I/audio/third_party/sox/../install/include -fvisibility=hidden -D_GLIBCXX_USE_CXX11_ABI=0 -o lame lame_main.o main.o brhist.o console.o get_audio.o lametime.o parse.o timestatus.o -L/audio/third_party/sox/../install/lib ../libmp3lame/.libs/libmp3lame.a -lncurses -lm
/usr/bin/ld: console.o: in function `get_termcap_string':
console.c:(.text+0x102): undefined reference to `tgetstr'
/usr/bin/ld: console.o: in function `get_termcap_number':
console.c:(.text+0x183): undefined reference to `tgetnum'
/usr/bin/ld: console.o: in function `apply_termcap_settings':
console.c:(.text+0x20a): undefined reference to `tgetent'
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:358: lame] Error 1
make[2]: Leaving directory '/audio/build/temp.linux-x86_64-3.8/third_party/sox/src/lame-build/frontend'
make[1]: *** [Makefile:349: all-recursive] Error 1
make[1]: Leaving directory '/audio/build/temp.linux-x86_64-3.8/third_party/sox/src/lame-build'
make: *** [Makefile:276: all] Error 2
CMake Error at /audio/build/temp.linux-x86_64-3.8/third_party/sox/src/lame-stamp/lame-build-Release.cmake:47 (message):
Stopping after outputting logs.
Repro Instructions
conda create -y --name py38 python=3.8 anaconda
conda activate py38
conda install -y numpy pyyaml mkl-include setuptools cmake cffi typing tqdm coverage tensorboard hypothesis dataclasses
export CFLAGS="${CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
export CXXFLAGS="${CXXFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
export USE_CUDA=1
git clone https://github.com/pytorch/pytorch.git
pushd pytorch && git checkout v1.10.2
git submodule update --init --recursive
sed -i 's/set(CUDA_PROPAGATE_HOST_FLAGS OFF)//g' third_party/gloo/cmake/Cuda.cmake
USE_SYSTEM_NCCL=1 python setup.py install
git clone https://github.com/pytorch/audio.git
pip install ninja
pushd audio && git checkout v0.10.2
BUILD_SOX=1 python setup.py install
Can confirm this only happens when -D_GLIBCXX_USE_CXX11_ABI=0 is set.
Hi @Lokiiiiii
Sorry for the late reply.
export CFLAGS="${CFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
export CXXFLAGS="${CXXFLAGS} -D_GLIBCXX_USE_CXX11_ABI=0"
This does not seem to be the proper way to configure the CXX11 ABI in PyTorch.
I think the proper way is to set the _GLIBCXX_USE_CXX11_ABI environment variable.
https://github.com/pytorch/pytorch/blob/38a758e25178d70362c1a2d900e9f7c27e70af28/tools/setup_helpers/cmake.py#L243-L252
TorchAudio fetches the CXX11 ABI information via the TORCH_CXX_FLAGS variable in CMake, which is propagated from the _GLIBCXX_USE_CXX11_ABI environment variable.
Can you try building PyTorch with the _GLIBCXX_USE_CXX11_ABI environment variable set, instead of manipulating CFLAGS and CXXFLAGS?
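In case it helps, here is a rough sketch of what "use the environment variable instead of CFLAGS/CXXFLAGS" could look like when the build is driven from Python. The make_build_env helper is hypothetical and not part of PyTorch or torchaudio; it just strips the manual -D_GLIBCXX_USE_CXX11_ABI define from the compiler flags and sets the environment variable, assuming the "0"/"1" string values are what the PyTorch build expects.

```python
import re

# Matches a manually added ABI define, plus any whitespace in front of it.
ABI_DEFINE = re.compile(r"\s*-D_GLIBCXX_USE_CXX11_ABI=\d")


def make_build_env(base_env, use_cxx11_abi):
    """Return a copy of base_env that selects the ABI via the env variable only."""
    env = dict(base_env)
    for key in ("CFLAGS", "CXXFLAGS"):
        if key in env:
            # Drop the manual define so it cannot disagree with the variable.
            env[key] = ABI_DEFINE.sub("", env[key]).strip()
    env["_GLIBCXX_USE_CXX11_ABI"] = "1" if use_cxx11_abi else "0"
    return env


# Possible usage (not executed here; the paths are assumptions):
#   import os, subprocess
#   subprocess.run(["python", "setup.py", "install"],
#                  env=make_build_env(os.environ, use_cxx11_abi=False),
#                  cwd="pytorch", check=True)
```

The point of cleaning CFLAGS and CXXFLAGS is that third-party C code built along the way (such as lame) would then no longer receive a C++-only define.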
Can we do something like https://github.com/pytorch/FBGEMM/blob/0e24712210b44a3adf3832f9f9bfb1e486d81f4f/fbgemm_gpu/setup.py#L50 when building the torchaudio binary? FWIW, the PyTorch nightly is built with GLIBCXX_USE_CXX11_ABI=0. See more discussion at https://github.com/pytorch/pytorch/pull/100262#issuecomment-1542140819
These lines were meant to cover that, but I guess they are no longer working. https://github.com/pytorch/audio/blob/4463fbdfbbc29fbc78d5dcd4f61cd9d0a806432c/CMakeLists.txt#L130-L139
Probably. torchtext also does this: https://github.com/pytorch/text/blob/b0ebddc648d279826089db91775375221777a2db/tools/setup_helpers/extension.py#LL25C37-L25C37
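For reference, the general idea of asking the installed PyTorch for its ABI and turning that into a compile flag could be sketched as below. I have not verified that this matches the linked FBGEMM or torchtext code line for line; abi_compile_flag and detect_torch_abi are illustrative names, and the only PyTorch call used is the public torch.compiled_with_cxx11_abi().

```python
def abi_compile_flag(use_cxx11_abi):
    """Build the define that makes an extension match the given ABI setting."""
    return f"-D_GLIBCXX_USE_CXX11_ABI={int(bool(use_cxx11_abi))}"


def detect_torch_abi():
    """Ask the installed PyTorch which ABI it was built with."""
    import torch  # imported lazily so the helper above works without torch

    return torch.compiled_with_cxx11_abi()


if __name__ == "__main__":
    try:
        print(abi_compile_flag(detect_torch_abi()))
    except ImportError:
        print("torch is not installed in this environment")
```

A setup script could then pass the resulting flag to CMake or to the compiler, so the extension follows whatever the installed PyTorch was built with, without relying on the user's environment.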