
Wrong result decoding with test.c on Windows

Open opentld opened this issue 2 years ago • 11 comments

Platform: Windows 10, VS2019. When running test.c, the following errors occur:

LOG (VoskAPI:Model::ReadDataFiles():src\model.cc:213) Decoding params beam=13 max-active=7000 lattice-beam=6
LOG (VoskAPI:Model::ReadDataFiles():src\model.cc:216) Silence phones 1:2:3:4:5:11:12:13:14:15
Wrong parameter 6 in LAPACKE_dsyev_work
Wrong parameter 1 in LAPACKE_dsygv
Wrong parameter 1 in LAPACKE_dsygv
...
Wrong parameter 1 in LAPACKE_dsptri_work
Wrong parameter 1 in LAPACKE_dsptri_work
Wrong parameter 9 in LAPACKE_dsprfs_work
...
LOG (VoskAPI:kaldi::nnet3::Nnet::RemoveOrphanNodes():nnet3\nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (VoskAPI:kaldi::nnet3::Nnet::RemoveOrphanComponents():nnet3\nnet-nnet.cc:847) Removing 0 orphan components.
LOG (VoskAPI:kaldi::nnet3::CompileLooped():nnet3\nnet-compile-looped.cc:345) Spent 0.0998472 seconds in looped compilation.
LOG (VoskAPI:Model::ReadDataFiles():src\model.cc:248) Loading i-vector extractor from model/ivector/final.ie
LOG (VoskAPI:kaldi::IvectorExtractor::ComputeDerivedVars():ivector\ivector-extractor.cc:183) Computing derived variables for iVector extractor
Wrong parameter 1 in LAPACKE_dsyequb_work
...
ERROR (VoskAPI:kaldi::TpMatrix::Cholesky():matrix\tp-matrix.cc:110) Cholesky decomposition failed. Maybe matrix is not positive definite.

What do these mean? @proger @camillem @dremendes @Sharcoux @hviana

opentld avatar May 22 '22 03:05 opentld

Did you build libvosk yourself with MKL, or are you using our prebuilt binary?

nshmyrev avatar May 22 '22 07:05 nshmyrev

Did you build libvosk yourself with MKL, or are you using our prebuilt binary?

I built libvosk with Kaldi and OpenBLAS...

opentld avatar May 22 '22 09:05 opentld

Did you build libvosk yourself with MKL, or are you using our prebuilt binary?

With the official libvosk.dll release on Chinese speech, the result looks wrong:

LOG (VoskAPI:ReadDataFiles():model.cc:213) Decoding params beam=13 max-active=7000 lattice-beam=6
LOG (VoskAPI:ReadDataFiles():model.cc:216) Silence phones 1:2:3:4:5:6:7:8:9:10:11:12:13:14:15
LOG (VoskAPI:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 0 orphan nodes.
LOG (VoskAPI:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 0 orphan components.
LOG (VoskAPI:CompileLooped():nnet-compile-looped.cc:345) Spent 0.0994599 seconds in looped compilation.
LOG (VoskAPI:ReadDataFiles():model.cc:248) Loading i-vector extractor from model/ivector/final.ie
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (VoskAPI:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (VoskAPI:ReadDataFiles():model.cc:278) Loading HCLG from model/graph/HCLG.fst
LOG (VoskAPI:ReadDataFiles():model.cc:293) Loading words from model/graph/words.txt
LOG (VoskAPI:ReadDataFiles():model.cc:302) Loading winfo model/graph/phones/word_boundary.int
LOG (VoskAPI:ReadDataFiles():model.cc:309) Loading subtract G.fst model from model/rescore/G.fst
LOG (VoskAPI:ReadDataFiles():model.cc:311) Loading CARPA model from model/rescore/G.carpa
vosk_recognizer_final_result: { "text" : "缁?鏄?闃虫槬 娣?浜?澶у潡 鏂囩珷 鐨?搴曡壊 鍥涙湀 鐨?婊?鏇存槸 缁?鐨?椴滄椿 绉€濯?璇楁剰 鐩庣劧" }

opentld avatar May 22 '22 09:05 opentld

Hi

The input data is probably wrong. Please share the audio file you are using so we can reproduce the problem.

nshmyrev avatar May 24 '22 12:05 nshmyrev

https://user-images.githubusercontent.com/21096515/170046489-3413d0b6-5f8e-4c6a-9444-8c6af42e6385.mp4

The wav format is not supported for attachments, so I renamed the file extension from wav to mp4. You can change the extension back to wav.

Thank you very much !

@nshmyrev

opentld avatar May 24 '22 13:05 opentld

I get "绿 是 阳春 烟 酒 大块 文章 的 底色 四月 的 凌乱 更是 绿 的 鲜活 秀媚 诗意 盎然" for your file, which is probably close. Maybe it is a charset issue. Try saving the result to a file and opening it with Notepad. The encoding should be UTF-8.
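One way to run that check is to write the result bytes to a file untouched and look at them in hex. The sketch below is only an illustration: `save_bytes`, `load_bytes` and `hex_dump` are hypothetical helper names, not part of the Vosk API, and the result is assumed to already be held in a `std::string`. For reference, the first expected character 绿 (U+7EFF) is the three bytes `e7 bb bf` in UTF-8. Those bytes shown in a legacy Chinese code page would be consistent with the "缁?" seen in the garbled output above.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Write the bytes exactly as received. Binary mode keeps the Windows runtime
// from translating line endings, so the file holds the raw UTF-8 data.
bool save_bytes(const std::string& path, const std::string& bytes) {
    std::ofstream out(path, std::ios::binary);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}

// Read a file back unchanged, to confirm what was actually written.
std::string load_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Render bytes as space-separated lowercase hex for comparison by eye.
std::string hex_dump(const std::string& bytes) {
    std::string out;
    char buf[4];
    for (unsigned char c : bytes) {
        int n = std::snprintf(buf, sizeof buf, "%02x ", c);
        assert(n == 3);  // two hex digits plus one space
        out += buf;
    }
    if (!out.empty()) out.pop_back();  // drop the trailing space
    return out;
}
```

If the hex dump of the saved text starts with `e7 bb bf`, the library output is fine and only the display encoding is off.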

nshmyrev avatar May 24 '22 13:05 nshmyrev

I use this function to convert the UTF-8 result to a string in the local encoding, and it works!

#include <codecvt>
#include <locale>
#include <string>

// As posted, this targets MSVC (VS2019 in this thread); "CHS" is a Windows locale name.
std::string UTF8ToString(const std::string& utf8Data)
{
    // utf-8 => wstring
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    std::wstring wString = conv.from_bytes(utf8Data);

    // wstring => string in the narrow encoding of the "CHS" locale
    std::wstring_convert<std::codecvt<wchar_t, char, std::mbstate_t>>
        convert(new std::codecvt<wchar_t, char, std::mbstate_t>("CHS"));
    std::string str = convert.to_bytes(wString);

    return str;
}
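Since `std::wstring_convert` and `std::codecvt_utf8` are deprecated as of C++17, the first step (UTF-8 to code points) can also be done by hand. The following is a minimal sketch of that step only, under the assumption that plain Unicode code points are an acceptable intermediate form. `decode_utf8` is a hypothetical name, and the second step (code points to a Chinese code page) still needs an OS or library conversion that is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode UTF-8 into Unicode code points. Returns false on malformed input:
// invalid lead byte, truncated sequence, bad continuation byte, overlong
// form, surrogate, or a value above U+10FFFF.
bool decode_utf8(const std::string& in, std::u32string& out) {
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp;
        std::size_t extra;   // continuation bytes expected after the lead byte
        char32_t lowest;     // smallest code point legal for this length
        if (b0 < 0x80)                { cp = b0;        extra = 0; lowest = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; lowest = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; lowest = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; extra = 3; lowest = 0x10000; }
        else return false;   // stray continuation byte or invalid lead byte

        if (in.size() - i - 1 < extra) return false;   // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) return false;      // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < lowest || cp > 0x10FFFF) return false;
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;
        out.push_back(cp);
        i += extra + 1;
    }
    return true;
}
```

A `false` return on the recognizer output would suggest the bytes were damaged before display. A `true` return points to a display or code page problem instead.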

One more question: it seems that the official libvosk.dll was not built with CUDA. Would you please provide a GPU release? Thanks a lot!

@nshmyrev

opentld avatar May 24 '22 14:05 opentld

We do not support GPU on Windows; it is meant more for Linux servers that need to process hundreds of streams in parallel. You'd be better off using it with the prebuilt Docker image.

nshmyrev avatar May 24 '22 14:05 nshmyrev

We do not support GPU on Windows; it is meant more for Linux servers that need to process hundreds of streams in parallel. You'd be better off using it with the prebuilt Docker image.

In my experience, compiling Vosk with Kaldi and CUDA is too difficult... :(

opentld avatar May 24 '22 15:05 opentld

We might consider building GPU packages some time in the future but no promises, sorry.

nshmyrev avatar May 24 '22 15:05 nshmyrev

We might consider building GPU packages some time in the future but no promises, sorry.

You have done so much for me, I really appreciate it! Thank you!

opentld avatar May 24 '22 15:05 opentld