wav2letter
For lexicon_free recipe: after creating a container, it does not pass the tests
Bug Description
I am trying to reproduce the lexicon-free speech recognition results on the librispeech dataset (clean). I followed the instructions in recipes/models/lexicon_free and created the container using the docker image. After creating the container, I ran the tests, and not all of them pass.
Reproduction Steps
Note: I am mounting my local volume to store the data.
- sudo docker run --runtime=nvidia --rm -itd --ipc=host --volume /home/amitm/lexfree:/root/lexfree --name lexfree wav2letter/wav2letter:lexfree
- sudo docker exec -it lexfree bash
- cd /root/wav2letter/build && make test
Error Message
Running tests...
Test project /root/wav2letter/build
Start 1: W2lCommonTest
1/21 Test 1: W2lCommonTest ....................***Exception: SegFault 2.26 sec
Start 2: DictionaryTest
2/21 Test 2: DictionaryTest ................... Passed 0.04 sec
Start 3: CriterionTest
3/21 Test 3: CriterionTest ....................***Exception: SegFault 1.38 sec
Start 4: Seq2SeqTest
4/21 Test 4: Seq2SeqTest ......................***Exception: SegFault 1.41 sec
Start 5: AttentionTest
5/21 Test 5: AttentionTest ....................***Exception: SegFault 1.78 sec
Start 6: WindowTest
6/21 Test 6: WindowTest .......................***Exception: SegFault 1.36 sec
Start 7: DataTest
7/21 Test 7: DataTest .........................***Failed 1.45 sec
Start 8: ListFileDatasetTest
8/21 Test 8: ListFileDatasetTest .............. Passed 1.31 sec
Start 9: SoundTest
9/21 Test 9: SoundTest ........................ Passed 0.13 sec
Start 10: DecoderTest
10/21 Test 10: DecoderTest ...................... Passed 1.28 sec
Start 11: CeplifterTest
11/21 Test 11: CeplifterTest .................... Passed 0.04 sec
Start 12: DctTest
12/21 Test 12: DctTest .......................... Passed 0.08 sec
Start 13: DerivativesTest
13/21 Test 13: DerivativesTest .................. Passed 0.04 sec
Start 14: DitherTest
14/21 Test 14: DitherTest ....................... Passed 8.05 sec
Start 15: MfccTest
15/21 Test 15: MfccTest ......................... Passed 0.38 sec
Start 16: PreEmphasisTest
16/21 Test 16: PreEmphasisTest .................. Passed 0.04 sec
Start 17: SpeechUtilsTest
17/21 Test 17: SpeechUtilsTest ..................***Failed 1.74 sec
Start 18: TriFilterbankTest
18/21 Test 18: TriFilterbankTest ................ Passed 0.07 sec
Start 19: WindowingTest
19/21 Test 19: WindowingTest .................... Passed 0.04 sec
Start 20: W2lModuleTest
20/21 Test 20: W2lModuleTest ....................***Exception: SegFault 3.11 sec
Start 21: RuntimeTest
21/21 Test 21: RuntimeTest ......................***Failed 10.03 sec
57% tests passed, 9 tests failed out of 21
Total Test time (real) = 36.02 sec
The following tests FAILED:
1 - W2lCommonTest (SEGFAULT)
3 - CriterionTest (SEGFAULT)
4 - Seq2SeqTest (SEGFAULT)
5 - AttentionTest (SEGFAULT)
6 - WindowTest (SEGFAULT)
7 - DataTest (Failed)
17 - SpeechUtilsTest (Failed)
20 - W2lModuleTest (SEGFAULT)
21 - RuntimeTest (Failed)
Errors while running CTest
Makefile:104: recipe for target 'test' failed
make: *** [test] Error 8
Note
Additional Context
Please note that when I did the same for the normal wav2letter installation with docker, as given in the wiki, all the tests passed and I was able to reproduce the results as well:
- sudo docker run --runtime=nvidia --rm -itd --ipc=host --volume /home/amitm/w2l:/root/w2l --name w2l wav2letter/wav2letter:cuda-latest
- sudo docker exec -it w2l bash
- cd /root/wav2letter/build && make test
And all tests passed!
I ran the same commands:
sudo docker run --runtime=nvidia --rm -itd --ipc=host --name lexfree wav2letter/wav2letter:lexfree
sudo docker exec -it lexfree bash
cd /root/wav2letter/build && make test
Everything passes for me, no errors. Could you run each failed test directly and post the errors here (for example, from the build dir: ./src/tests/W2lCommonTest)?
I ran the following command:
cd /root/wav2letter/build/src/tests/ && make W2lCommonTest
and got the following outputs on the screen:
loading initial cache file /root/wav2letter/build/src/tests/googletest/tmp/gtest-cache-Release.cmake
-- Configuring done
-- Generating done
-- Build files have been written to: /root/wav2letter/build/src/tests/googletest/src/gtest
[ 0%] Performing build step for 'gtest'
[ 25%] Built target gtest
[ 50%] Built target gmock
[ 75%] Built target gmock_main
[100%] Built target gtest_main
[ 0%] No install step for 'gtest'
[ 0%] Completed 'gtest'
[ 8%] Built target gtest
[ 8%] Built target warpctc
[ 16%] Performing update step for 'CUB'
[ 16%] No configure step for 'CUB'
[ 16%] No build step for 'CUB'
[ 16%] No install step for 'CUB'
[ 16%] Completed 'CUB'
[ 16%] Built target CUB
[ 16%] Built target w2l-criterion-library-cuda
[ 33%] Built target wav2letter-libraries
[ 66%] Built target wav2letter++
[100%] Built target W2lCommonTest
Another test:
cd /root/wav2letter/build/src/tests/ && make CriterionTest
[ 0%] Performing update step for 'gtest'
[ 0%] Performing configure step for 'gtest'
loading initial cache file /root/wav2letter/build/src/tests/googletest/tmp/gtest-cache-Release.cmake
-- Configuring done
-- Generating done
-- Build files have been written to: /root/wav2letter/build/src/tests/googletest/src/gtest
[ 0%] Performing build step for 'gtest'
[ 25%] Built target gtest
[ 50%] Built target gmock
[ 75%] Built target gmock_main
[100%] Built target gtest_main
[ 0%] No install step for 'gtest'
[ 0%] Completed 'gtest'
[ 8%] Built target gtest
[ 8%] Built target warpctc
[ 16%] Performing update step for 'CUB'
[ 16%] No configure step for 'CUB'
[ 16%] No build step for 'CUB'
[ 16%] No install step for 'CUB'
[ 16%] Completed 'CUB'
[ 16%] Built target CUB
[ 16%] Built target w2l-criterion-library-cuda
[ 33%] Built target wav2letter-libraries
[ 66%] Built target wav2letter++
[ 66%] Linking CXX executable CriterionTest
[100%] Built target CriterionTest
For Seq2SeqTest:
[ 0%] Performing update step for 'gtest'
[ 0%] Performing configure step for 'gtest'
loading initial cache file /root/wav2letter/build/src/tests/googletest/tmp/gtest-cache-Release.cmake
-- Configuring done
-- Generating done
-- Build files have been written to: /root/wav2letter/build/src/tests/googletest/src/gtest
[ 0%] Performing build step for 'gtest'
[ 25%] Built target gtest
[ 50%] Built target gmock
[ 75%] Built target gmock_main
[100%] Built target gtest_main
[ 0%] No install step for 'gtest'
[ 0%] Completed 'gtest'
[ 8%] Built target gtest
[ 8%] Built target warpctc
[ 16%] Performing update step for 'CUB'
[ 16%] No configure step for 'CUB'
[ 16%] No build step for 'CUB'
[ 16%] No install step for 'CUB'
[ 16%] Completed 'CUB'
[ 16%] Built target CUB
[ 16%] Built target w2l-criterion-library-cuda
[ 33%] Built target wav2letter-libraries
[ 66%] Built target wav2letter++
[ 66%] Linking CXX executable Seq2SeqTest
[100%] Built target Seq2SeqTest
For DataTest:
[ 0%] Performing update step for 'gtest'
[ 0%] Performing configure step for 'gtest'
loading initial cache file /root/wav2letter/build/src/tests/googletest/tmp/gtest-cache-Release.cmake
-- Configuring done
-- Generating done
-- Build files have been written to: /root/wav2letter/build/src/tests/googletest/src/gtest
[ 0%] Performing build step for 'gtest'
[ 25%] Built target gtest
[ 50%] Built target gmock
[ 75%] Built target gmock_main
[100%] Built target gtest_main
[ 0%] No install step for 'gtest'
[ 0%] Completed 'gtest'
[ 8%] Built target gtest
[ 8%] Built target warpctc
[ 16%] Performing update step for 'CUB'
[ 16%] No configure step for 'CUB'
[ 16%] No build step for 'CUB'
[ 16%] No install step for 'CUB'
[ 16%] Completed 'CUB'
[ 16%] Built target CUB
[ 16%] Built target w2l-criterion-library-cuda
[ 33%] Built target wav2letter-libraries
[ 66%] Built target wav2letter++
[100%] Built target DataTest
What is the problem?
Don't build the test, run the binary:
cd /root/wav2letter/build
./src/tests/W2lCommonTest
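Running all of the failing binaries one at a time can be scripted. The following is a minimal sketch, not part of wav2letter: the function name `run_failed_tests` and the log directory are hypothetical, and it assumes the test binaries live under `src/tests/` in the build directory, as in the command above.

```shell
# Hypothetical helper: run each named gtest binary from the build tree and
# keep its full output in <log_dir>/<name>.log.
# The body is wrapped in ( ) so its variables stay local to the function.
run_failed_tests() (
  build_dir=$1
  log_dir=$2
  shift 2
  mkdir -p "$log_dir" || exit 1
  for name in "$@"; do
    # A crash or a failed assertion gives a non-zero status; record it and continue.
    "$build_dir/src/tests/$name" >"$log_dir/$name.log" 2>&1
    echo "$name exit=$?"
  done
)

# Example call inside the container (the log path is an assumption):
#   run_failed_tests /root/wav2letter/build /tmp/w2l-logs W2lCommonTest CriterionTest DataTest
```

Each log file then holds the complete gtest output for one binary, which is easier to post than a combined `make test` summary.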
@tlikhomanenko, sorry for the mistake.
I have run the binaries and here is the output:
For ./src/tests/W2lCommonTest
[==========] Running 19 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 19 tests from W2lCommonTest
[ RUN ] W2lCommonTest.StringTrim
[ OK ] W2lCommonTest.StringTrim (0 ms)
[ RUN ] W2lCommonTest.ReplaceAll
[ OK ] W2lCommonTest.ReplaceAll (0 ms)
[ RUN ] W2lCommonTest.StringSplit
[ OK ] W2lCommonTest.StringSplit (0 ms)
[ RUN ] W2lCommonTest.StringJoin
[ OK ] W2lCommonTest.StringJoin (0 ms)
[ RUN ] W2lCommonTest.StringFormat
[ OK ] W2lCommonTest.StringFormat (0 ms)
[ RUN ] W2lCommonTest.PathsConcat
[ OK ] W2lCommonTest.PathsConcat (0 ms)
[ RUN ] W2lCommonTest.RetryWithBackoff
[ OK ] W2lCommonTest.RetryWithBackoff (753 ms)
[ RUN ] W2lCommonTest.PackReplabels
[ OK ] W2lCommonTest.PackReplabels (0 ms)
[ RUN ] W2lCommonTest.Dictionary
[ OK ] W2lCommonTest.Dictionary (0 ms)
[ RUN ] W2lCommonTest.UnpackReplabels
[ OK ] W2lCommonTest.UnpackReplabels (0 ms)
[ RUN ] W2lCommonTest.UnpackReplabelsIgnoresInvalid
[ OK ] W2lCommonTest.UnpackReplabelsIgnoresInvalid (0 ms)
[ RUN ] W2lCommonTest.Uniq
[ OK ] W2lCommonTest.Uniq (0 ms)
[ RUN ] W2lCommonTest.Normalize
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Kernel cuda::buildKernel(int, const string&, const string&, const std::vector<std::__cxx11::basic_string
In function af::array af::stdev(const af::array&, dim_t)
In file src/api/cpp/stdev.cpp:55" thrown in the test body.
[ FAILED ] W2lCommonTest.Normalize (1070 ms)
[ RUN ] W2lCommonTest.Transpose
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Kernel cuda::buildKernel(int, const string&, const string&, const std::vector<std::__cxx11::basic_string
In function af::array af::transpose(const af::array&, bool)
In file src/api/cpp/transpose.cpp:18" thrown in the test body.
[ FAILED ] W2lCommonTest.Transpose (1 ms)
[ RUN ] W2lCommonTest.localNormalize
Segmentation fault (core dumped)
For ./src/tests/CriterionTest
[==========] Running 17 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 17 tests from CriterionTest
[ RUN ] CriterionTest.CTCEmptyTarget
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Kernel cuda::buildKernel(int, const string&, const string&, const std::vector<std::__cxx11::basic_string
In function T* af::array::device() const [with T = void]
In file src/api/cpp/array.cpp:941" thrown in the test body.
[ FAILED ] CriterionTest.CTCEmptyTarget (1096 ms)
[ RUN ] CriterionTest.CTCCost
Segmentation fault (core dumped)
For ./src/tests/Seq2SeqTest
[==========] Running 10 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 10 tests from Seq2SeqTest
[ RUN ] Seq2SeqTest.Seq2Seq
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Kernel cuda::buildKernel(int, const string&, const string&, const std::vector<std::__cxx11::basic_string
In function void af::array::eval() const
In file src/api/cpp/array.cpp:875" thrown in the test body.
[ FAILED ] Seq2SeqTest.Seq2Seq (1131 ms)
[ RUN ] Seq2SeqTest.Seq2SeqViterbi
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Kernel cuda::buildKernel(int, const string&, const string&, const std::vector<std::__cxx11::basic_string
In function af::array af::moddims(const af::array&, unsigned int, const dim_t*)
In file src/api/cpp/data.cpp:187" thrown in the test body.
[ FAILED ] Seq2SeqTest.Seq2SeqViterbi (1 ms)
[ RUN ] Seq2SeqTest.Seq2SeqBeamSearchViterbi
Segmentation fault (core dumped)
For ./src/tests/DataTest
[==========] Running 4 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 3 tests from DataTest
[ RUN ] DataTest.inputFeaturizer
unknown file: Failure
C++ exception with description "ArrayFire Exception (Internal error:998):
In function cuda::Kernel cuda::buildKernel(int, const string&, const string&, const std::vector<std::__cxx11::basic_string
In function T af::max(const af::array&) [with T = double]
In file src/api/cpp/reduce.cpp:129" thrown in the test body.
[ FAILED ] DataTest.inputFeaturizer (1237 ms)
[ RUN ] DataTest.targetFeaturizer
[ OK ] DataTest.targetFeaturizer (2 ms)
[ RUN ] DataTest.W2lListDataset
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0930 06:10:16.147614 1499 W2lListFilesDataset.cpp:137] 3 files found.
I0930 06:10:16.147682 1499 Utils.cpp:102] Filtered 0/3 samples
I0930 06:10:16.147691 1499 W2lListFilesDataset.cpp:62] Total batches (i.e. iters): 3
[ OK ] DataTest.W2lListDataset (3 ms)
[----------] 3 tests from DataTest (1242 ms total)
[----------] 1 test from RoundRobinBatchShufflerTest
[ RUN ] RoundRobinBatchShufflerTest.params
[ OK ] RoundRobinBatchShufflerTest.params (0 ms)
[----------] 1 test from RoundRobinBatchShufflerTest (0 ms total)
[----------] Global test environment tear-down
[==========] 4 tests from 2 test cases ran. (1242 ms total)
[ PASSED ] 3 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] DataTest.inputFeaturizer
1 FAILED TEST
You probably have a problem with the CUDA driver. Which CUDA version is on your machine?
I have CUDA version 10.1. Note that I am able to run the regular wav2letter docker image, and it works perfectly for the given tutorial on librispeech. Why does the lexicon_free image show a problem with the CUDA driver?
Here is the version line from the nvidia-smi output:
NVIDIA-SMI 418.87.00 Driver Version: 418.87.00 CUDA Version: 10.1
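When reporting versions it helps to separate two numbers: the "CUDA Version" field in the nvidia-smi header is the highest CUDA version the host driver supports, while the toolkit an image was built with lives inside the container. The sketch below extracts both. The function names are hypothetical, and it assumes `nvcc` is on the PATH inside the container, which this thread does not confirm.

```shell
# Extract e.g. "10.1" from an nvidia-smi header line such as
# "NVIDIA-SMI 418.87.00 Driver Version: 418.87.00 CUDA Version: 10.1".
driver_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Extract e.g. "10.0" from `nvcc --version` output,
# whose last line looks like "Cuda compilation tools, release 10.0, V10.0.130".
toolkit_cuda_version() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p' | head -n 1
}

# Example calls (assumptions: container named lexfree, nvcc available in it):
#   nvidia-smi | driver_cuda_version
#   sudo docker exec lexfree nvcc --version | toolkit_cuda_version
```

Posting both values makes it clear whether the image's toolkit is older or newer than what the host driver reports.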
I think this is exactly the problem: I built it for 9.2, if I remember correctly. You can try taking the latest docker base image by changing its name here https://github.com/facebookresearch/wav2letter/blob/v0.2/recipes/models/lexicon_free/Dockerfile#L9, then build an image from this modified Dockerfile https://github.com/facebookresearch/wav2letter/blob/v0.2/recipes/models/lexicon_free/Dockerfile (where you change the base image), and then run all experiments in it.
Hi, sorry, I didn't get your point. Both links seem to point to a Dockerfile with the same base image. I have only recently started with docker images and containers, so I don't know much about building images. Could you please elaborate on the steps in a bit more detail? Thanks for the help.
Docker images can be built on top of other images. We first build a base image in which all dependencies are built (like the image here https://github.com/facebookresearch/wav2letter/blob/v0.2/recipes/models/lexicon_free/Dockerfile#L9), and then we build flashlight and wav2letter on top of it.
What you need to do:
- create the file with content from here https://github.com/facebookresearch/wav2letter/blob/v0.2/recipes/models/lexicon_free/Dockerfile
- change the line https://github.com/facebookresearch/wav2letter/blob/v0.2/recipes/models/lexicon_free/Dockerfile#L9 to
FROM wav2letter/wav2letter:cuda-base-471f90f
- then run, in the same directory as this Dockerfile:
sudo docker build -f Dockerfile -t lexfee-new .
- then you can start the container as you did before, changing only the image name:
sudo docker run --runtime=nvidia --rm -itd --ipc=host --volume /home/amitm/lexfree:/root/lexfree --name lexfree lexfee-new
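The base-image change in the steps above can also be applied with a small script instead of a manual edit. This is a sketch only: `set_base_image` is a hypothetical helper, and it assumes the line to change is the first `FROM` line of the downloaded Dockerfile.

```shell
# Hypothetical helper: replace the first FROM line of a Dockerfile with a new base image.
# The body is wrapped in ( ) so its variables stay local to the function.
set_base_image() (
  dockerfile=$1
  image=$2
  tmp=$(mktemp) || exit 1
  # Rewrite only the first FROM line; every other line is copied unchanged.
  awk -v img="$image" '
    !replaced && /^FROM[ \t]/ { print "FROM " img; replaced = 1; next }
    { print }
  ' "$dockerfile" >"$tmp" && cat "$tmp" >"$dockerfile"
  status=$?
  rm -f "$tmp"
  exit "$status"
)

# Example call, using the base image named in the steps above:
#   set_base_image Dockerfile wav2letter/wav2letter:cuda-base-471f90f
```

After the rewrite, the build and run commands from the steps above apply unchanged.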
I followed the steps you suggested and got the following error at the end.
Last few lines of the error message:
CMake Error at CMakeLists.txt:24 (find_package):
Could not find a configuration file for package "ArrayFire" that is compatible with requested version "3.7.1".
The following configuration files were considered but not accepted:
/usr/local/share/ArrayFire/cmake/ArrayFireConfig.cmake, version: 3.6.4
Configuring incomplete, errors occurred!
See also "/root/wav2letter/build/CMakeFiles/CMakeOutput.log".
The entire error log can be found here: https://gist.github.com/Trikaldarshi/fdf2c79380e7b365cc549e0ea3e0562b
Note:
I tried other base images such as cuda-base-10-latest and cuda-base-latest, but nothing worked. All of them show a similar error.
Please also add, here https://github.com/facebookresearch/wav2letter/blob/v0.2/recipes/models/lexicon_free/Dockerfile#L31, a checkout of the correct version of wav2letter (right now you are building a version that requires a newer ArrayFire): git checkout tags/recipes-lexfree-paper
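The CMake error above comes down to a version comparison: the checked-out wav2letter requested ArrayFire 3.7.1, while the base image provides 3.6.4. A minimal sketch of that check follows. `version_ge` is a hypothetical helper, it relies on `sort -V` being available, and it assumes an "at least the requested version" rule; the rule CMake actually applies is defined by ArrayFire's own version file.

```shell
# Hypothetical helper: succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), available in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Values taken from the CMake error message: installed 3.6.4, requested 3.7.1.
if version_ge 3.6.4 3.7.1; then
  echo "installed ArrayFire satisfies the request"
else
  echo "installed ArrayFire is too old for this checkout"
fi
```

Checking out an older wav2letter tag lowers the requested version rather than raising the installed one, which is why the tag checkout is suggested here.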
I added the following command as you suggested: cd /root/wav2letter && git checkout tags/recipes-lexfree-paper
After building from the Dockerfile, I was able to build the image, with some intermediate warnings and errors. But again, when I started a container from the built image and ran the tests, most of them failed. Any more suggestions? What could be the cause of this?
Last few lines of the log:
--2020-10-04 08:57:48-- https://www.ldc.upenn.edu/sites/www.ldc.upenn.edu/files/ctools/sph2pipe_v2.5.tar.gz
Resolving www.ldc.upenn.edu (www.ldc.upenn.edu)... 54.203.195.171
Connecting to www.ldc.upenn.edu (www.ldc.upenn.edu)|54.203.195.171|:443... connected.
WARNING: cannot verify www.ldc.upenn.edu's certificate, issued by 'CN=InCommon RSA Server CA,OU=InCommon,O=Internet2,L=Ann Arbor,ST=MI,C=US':
Issued certificate has expired.
HTTP request sent, awaiting response... 200 OK
Length: 329832 (322K) [application/x-gzip]
Saving to: 'sph2pipe_v2.5.tar.gz'
0K .......... .......... .......... .......... .......... 15% 86.3K 3s
50K .......... .......... .......... .......... .......... 31% 175K 2s
100K .......... .......... .......... .......... .......... 46% 52.1M 1s
150K .......... .......... .......... .......... .......... 62% 77.8M 1s
200K .......... .......... .......... .......... .......... 77% 175K 0s
250K .......... .......... .......... .......... .......... 93% 65.0M 0s
300K .......... .......... .. 100% 71.3M=1.2s
2020-10-04 08:57:52 (279 KB/s) - 'sph2pipe_v2.5.tar.gz' saved [329832/329832]
file_headers.c: In function 'readSphHeader':
file_headers.c:148:5: warning: format '%d' expects argument of type 'int', but argument 5 has type '__off_t {aka long int}' [-Wformat=]
"Warning:%s: sample_count reset to %d to match size (%d bytes)\n",
^
file_headers.c: In function 'copyshort':
file_headers.c:326:2: warning: implicit declaration of function 'swab' [-Wimplicit-function-declaration]
swab((char *) &val, short_order.ch, 2 );
^
file_headers.c: At top level:
file_headers.c:579:1: warning: return type defaults to 'int' [-Wimplicit-int]
ConvertToIeeeExtended(num, bytes)
^
shorten_x.c: In function 'fwrite_type':
shorten_x.c:325:22: warning: implicit declaration of function 'pcm2alaw' [-Wimplicit-function-declaration]
*writebufp++ = pcm2alaw( ulaw2pcm[data0[i]] );
^
shorten_x.c:381:24: warning: implicit declaration of function 'pcm2ulaw' [-Wimplicit-function-declaration]
writebufp++ = pcm2ulaw( data0[i] );
^
shorten_x.c:464:6: warning: implicit declaration of function 'swab' [-Wimplicit-function-declaration]
swab(writebuf, writefub, sizeout * nchanout * nitem);
^
sph2pipe.c: In function 'getUserOpts':
sph2pipe.c:191:18: warning: implicit declaration of function 'getopt' [-Wimplicit-function-declaration]
while (( i = getopt( ac, av, "daupf:c:t:s:h:" )) != EOF )
^
sph2pipe.c: In function 'copySamples':
sph2pipe.c:537:3: warning: implicit declaration of function 'swab' [-Wimplicit-function-declaration]
swab( outbuf, inpbuf, nb ); / it, do byte swapping too */
^
Removing intermediate container 9bb1b85161f1
---> 86b658cf6ccf
Successfully built 86b658cf6ccf
Successfully tagged lexfee-new:latest
Some intermediate errors:
-- Installing: /usr/local/share/flashlight/examples/RnnLm.cpp
-- Installing: /usr/local/share/flashlight/examples/LinearRegression.cpp
-- Installing: /usr/local/share/flashlight/examples/AdaptiveClassification.cpp
-- Installing: /usr/local/share/flashlight/examples/Xor.cpp
-- Installing: /usr/local/share/flashlight/examples/CMakeLists.txt
-- Installing: /usr/local/share/flashlight/examples/README.md
-- Installing: /usr/local/share/flashlight/cmake/flashlightConfig.cmake
-- Boost version: 1.58.0
-- Found the following Boost libraries:
-- program_options
-- system
-- thread
-- unit_test_framework
-- chrono
-- date_time
-- atomic
-- Could NOT find Eigen3 (missing: EIGEN3_INCLUDE_DIR EIGEN3_VERSION_OK) (Required is at least version "2.91.0")
CMake Warning at lm/interpolate/CMakeLists.txt:65 (message):
Not building interpolation. Eigen3 was not found.
These warnings are fine. Do you still get NVRTC Error(5): NVRTC_ERROR_INVALID_OPTION for each of the failed tests? If so, you still have a problem with the CUDA driver.
Ohh, we have CUDA 10.0, so you probably need to rebuild your image for CUDA 10.1 too.
OK, a simpler solution is to also install CUDA 10.0 on your machine and just link to it. Could you try this?
Yes, I still get NVRTC Error(5): NVRTC_ERROR_INVALID_OPTION for each of the failed tests. My machine is used by multiple people, so I don't want to change any settings, versions, or dependencies. Is there an alternative solution? If not, I will try to rebuild my image for CUDA 10.
One more question: why didn't I face any problem with the regular wav2letter installation using docker?
sudo docker run --runtime=nvidia --rm -itd --ipc=host --volume /home/amitm/w2l:/root/w2l --name w2l wav2letter/wav2letter:cuda-latest
sudo docker exec -it w2l bash
cd /root/wav2letter/build && make test
And all tests passed.
This is really weird. Can you then try rebuilding only ArrayFire in the working image to get 3.6.4, and then rebuild fl and w2l with the necessary commits for lexfree?
On my side I will recheck the images, but try this solution; maybe it will be faster and allow you to move on.
@tlikhomanenko, sorry for a silly question: could you please tell me how to rebuild only ArrayFire in the working image to have 3.6.4? After that, I can rebuild fl and w2l with the necessary commits for lexfree, as mentioned in https://github.com/facebookresearch/wav2letter/blob/v0.2/recipes/models/lexicon_free/Dockerfile. I also have to build fairseq and sph2pipe.
see here https://github.com/facebookresearch/flashlight/blob/6ec9dc7e9f57400801794b2e2f02317031883268/Dockerfile-CUDA-Base