speech-aligner
Did not successfully decode file BAC009S0002W0125, len = 629
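What is this problem?
Same question, I ran into this problem too.
I ran into this problem as well.
@HW140701 For this problem, I saw a workaround in montreal-forced-aligner: gradually increase the beam value until it is large enough, and you can then get the TextGrid file.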
I installed the repo successfully, but I get the following error when I use it. Do you know how to solve it?
/bin/speech-aligner --config=egs/cn_phn/conf/align.conf egs/cn_phn/data/wav.scp egs/cn_phn/data/text egs/cn_phn/data/out.ali
ERROR (speech-aligner[5.4.215~4-f2b7]:Input():util/kaldi-io.cc:756) Error opening input stream res/tree
[ Stack-Trace: ]
kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::Input::Input(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool*)
main
__libc_start_main
_start
My setup: Ubuntu 16.04, cmake 3.9.1
@HW140701 For this problem, I saw a workaround in montreal-forced-aligner: gradually increase the beam value until it is large enough, and you can then get the TextGrid file.
This works for me. I tested it as follows:
- merge two sample files with ffmpeg
ffmpeg -i BAC009S0002W0122.wav -i BAC009S0002W0123.wav -filter_complex '[0:0][1:0]concat=n=2:v=0:a=1[out]' -map '[out]' merged.wav
- create a new playlist called merged.lst with content:
merged merged.wav
- also create a merged transcript called merged.txt
- in run.sh, execute the following command:
speech-aligner --config=conf/align.conf merged.lst merged.txt merged.out
(this should fail)
- now edit align.conf, set:
--beam=40 --retry-beam=80
(now it should work)
I also tested another audio file, 49 seconds long. For the alignment to finish, the beam parameter had to be increased to 10240, and it ran much slower.
I guess that's why the input audio must be a playlist. By design, the aligner is intended to process a list of sentences, each in a separate audio file, in which case a beam of 20 or 40 should be enough.
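The "keep increasing the beam until it works" approach can be automated. Below is a minimal Python sketch, not part of speech-aligner. It assumes align.conf is a Kaldi-style config with one `--name=value` option per line, doubles the beam on every attempt, and keeps retry-beam at twice the beam (as in the 40/80 example above). The helper names (`with_beam`, `beam_schedule`, `run_aligner`, `align_with_growing_beam`) are made up for illustration, and how `run_aligner` detects failure (non-zero exit code or the "Did not successfully decode" message) is an assumption that may need adjusting.

```python
import os
import re
import subprocess
import tempfile


def with_beam(conf_text, beam, retry_beam):
    """Return conf_text with --beam / --retry-beam replaced by the given values.

    Assumption: one --name=value option per line (Kaldi-style config file).
    """
    lines = [ln for ln in conf_text.splitlines()
             if not re.match(r"\s*--(retry-)?beam=", ln)]
    lines += [f"--beam={beam}", f"--retry-beam={retry_beam}"]
    return "\n".join(lines) + "\n"


def beam_schedule(start=10, limit=10240):
    """Yield (beam, retry_beam) pairs, doubling the beam until it exceeds limit."""
    beam = start
    while beam <= limit:
        yield beam, beam * 2
        beam *= 2


def run_aligner(conf_text, playlist, transcript, out_path,
                aligner_bin="speech-aligner"):
    """Run the aligner once with conf_text; return True if it looks successful.

    Assumption: failure shows up as a non-zero exit code or as the
    "Did not successfully decode" message in the tool's output.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as f:
        f.write(conf_text)
    try:
        proc = subprocess.run(
            [aligner_bin, f"--config={f.name}", playlist, transcript, out_path],
            capture_output=True, text=True)
    finally:
        os.unlink(f.name)
    output = proc.stdout + proc.stderr
    return proc.returncode == 0 and "Did not successfully decode" not in output


def align_with_growing_beam(conf_text, try_align, start=10, limit=10240):
    """Call try_align(config_text) with growing beams; return the first beam
    for which it returns True, or None if no beam up to limit worked."""
    for beam, retry_beam in beam_schedule(start, limit):
        if try_align(with_beam(conf_text, beam, retry_beam)):
            return beam
    return None
```

For the merged example, something like `align_with_growing_beam(open("conf/align.conf").read(), lambda c: run_aligner(c, "merged.lst", "merged.txt", "merged.out"))` would return the first beam that worked, or `None`. Keep in mind that large beams run much slower, so a lower `limit` may be more practical.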
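If the recordings are already split into one file per sentence, the playlist does not have to be typed by hand. Here is a small Python sketch, assuming each playlist line is `<id> <wav path>` as in merged.lst above. Using the file name without its extension as the id is my own choice, and `build_playlist` is a hypothetical helper, not something shipped with the aligner. The ids need to match the ones used in the transcript file.

```python
from pathlib import Path


def build_playlist(wav_dir):
    """Return playlist lines '<id> <wav path>' for every .wav file in wav_dir.

    Assumption: the id is the file name without the .wav extension.
    """
    lines = []
    for wav in sorted(Path(wav_dir).glob("*.wav")):
        lines.append(f"{wav.stem} {wav}")
    return lines
```

Writing the result with `Path("sentences.lst").write_text("\n".join(build_playlist("wavs")) + "\n")` would give a list that can be passed in place of merged.lst (the file and directory names here are just examples).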