vosk-api icon indicating copy to clipboard operation
vosk-api copied to clipboard

Failed graph compilation

Open hrddsky opened this issue 1 year ago • 5 comments

Hello, im newbie at VOSK. Trying to compile small model with extra.dic and extra.txt Process starts fine but after few checking just stops, here is a log:

bash compile-graph.sh

utils/prepare_lang.sh data/dict [unk] data/lang_local data/lang Checking data/dict/silence_phones.txt ... --> reading data/dict/silence_phones.txt --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/dict/silence_phones.txt is OK

Checking data/dict/optional_silence.txt ... --> reading data/dict/optional_silence.txt --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/dict/optional_silence.txt is OK

Checking data/dict/nonsilence_phones.txt ... --> reading data/dict/nonsilence_phones.txt --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/dict/nonsilence_phones.txt is OK

Checking disjoint: silence_phones.txt, nonsilence_phones.txt --> disjoint property is OK.

Checking data/dict/lexicon.txt --> reading data/dict/lexicon.txt --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/dict/lexicon.txt is OK

Checking data/dict/extra_questions.txt ... --> data/dict/extra_questions.txt is empty (this is OK) --> SUCCESS [validating dictionary directory data/dict]

**Creating data/dict/lexiconp.txt from data/dict/lexicon.txt /opt/kaldi/egs/wsj/s5/../../../src/fstbin/fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int prepare_lang.sh: validating output directory utils/validate_lang.pl data/lang Checking existence of separator file separator file data/lang/subword_separator.txt is empty or does not exist, deal in word case. Checking data/lang/phones.txt ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/lang/phones.txt is OK

Checking words.txt: #0 ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> data/lang/words.txt is OK

Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ... --> silence.txt and nonsilence.txt are disjoint --> silence.txt and disambig.txt are disjoint --> disambig.txt and nonsilence.txt are disjoint --> disjoint property is OK

Checking sumation: silence.txt, nonsilence.txt, disambig.txt ... --> found no unexplainable phones in phones.txt

Checking data/lang/phones/context_indep.{txt, int, csl} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 10 entry/entries in data/lang/phones/context_indep.txt --> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt --> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt --> data/lang/phones/context_indep.{txt, int, csl} are OK

Checking data/lang/phones/nonsilence.{txt, int, csl} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 192 entry/entries in data/lang/phones/nonsilence.txt --> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt --> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt --> data/lang/phones/nonsilence.{txt, int, csl} are OK

Checking data/lang/phones/silence.{txt, int, csl} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 10 entry/entries in data/lang/phones/silence.txt --> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt --> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt --> data/lang/phones/silence.{txt, int, csl} are OK

Checking data/lang/phones/optional_silence.{txt, int, csl} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 1 entry/entries in data/lang/phones/optional_silence.txt --> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt --> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt --> data/lang/phones/optional_silence.{txt, int, csl} are OK

Checking data/lang/phones/disambig.{txt, int, csl} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 5 entry/entries in data/lang/phones/disambig.txt --> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt --> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt --> data/lang/phones/disambig.{txt, int, csl} are OK

Checking data/lang/phones/roots.{txt, int} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 50 entry/entries in data/lang/phones/roots.txt --> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt --> data/lang/phones/roots.{txt, int} are OK

Checking data/lang/phones/sets.{txt, int} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 50 entry/entries in data/lang/phones/sets.txt --> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt --> data/lang/phones/sets.{txt, int} are OK

Checking data/lang/phones/extra_questions.{txt, int} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 9 entry/entries in data/lang/phones/extra_questions.txt --> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt --> data/lang/phones/extra_questions.{txt, int} are OK

Checking data/lang/phones/word_boundary.{txt, int} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 202 entry/entries in data/lang/phones/word_boundary.txt --> data/lang/phones/word_boundary.int corresponds to data/lang/phones/word_boundary.txt --> data/lang/phones/word_boundary.{txt, int} are OK

Checking optional_silence.txt ... --> reading data/lang/phones/optional_silence.txt --> data/lang/phones/optional_silence.txt is OK

Checking disambiguation symbols: #0 and #1 --> data/lang/phones/disambig.txt has "#0" and "#1" --> data/lang/phones/disambig.txt is OK

Checking topo ...

Checking word_boundary.txt: silence.txt, nonsilence.txt, disambig.txt ... --> data/lang/phones/word_boundary.txt doesn't include disambiguation symbols --> data/lang/phones/word_boundary.txt is the union of nonsilence.txt and silence.txt --> data/lang/phones/word_boundary.txt is OK

Checking word-level disambiguation symbols... --> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh) Checking word_boundary.int and disambig.int --> generating a 96 word/subword sequence --> resulting phone sequence from L.fst corresponds to the word sequence --> L.fst is OK --> generating a 13 word/subword sequence --> resulting phone sequence from L_disambig.fst corresponds to the word sequence --> L_disambig.fst is OK

Checking data/lang/oov.{txt, int} ... --> text seems to be UTF-8 or ASCII, checking whitespaces --> text contains only allowed whitespaces --> 1 entry/entries in data/lang/oov.txt --> data/lang/oov.int corresponds to data/lang/oov.txt --> data/lang/oov.{txt, int} are OK

--> data/lang/L.fst is olabel sorted --> data/lang/L_disambig.fst is olabel sorted --> SUCCESS [validating lang directory data/lang] utils/mkgraph_lookahead.sh : compiling grammar data/ru-mix.lm.gz utils/mkgraph_lookahead.sh : expected exp/tdnn/final.mdl to exist

What am I doing wrong? Thanks!

hrddsky avatar May 30 '23 14:05 hrddsky

You need to provide more information - command you are running and full output, not just the end. Also OS type and version

nshmyrev avatar May 30 '23 14:05 nshmyrev

Hello Nickolay, Im doing it at official kaldi docker container "kaldiasr/kaldi", MacOS 13.2.1 with M1 with turned on Rosetta emulation of amd64. My way:

  1. making all dependencies and recievieng OK at check_dependencies.sh
  2. install srilm, irstlm, opengrm, phonetisaurus
  3. make src/configure
  4. place uncompiled model small rus 0.22 to egs/wsj/s5
  5. add tools/env.sh to egs/wsj/s5/env.sh
  6. add extra.dic and extra.txt to db folder
  7. turn on compile-graph.sh
  8. waiting
  9. after that string"utils/mkgraph_lookahead.sh : expected exp/tdnn/final.mdl to exist" process just stopping

hrddsky avatar May 30 '23 14:05 hrddsky

add extra.dic and extra.txt to db folder

They should be already there, I'm not sure how you add them again. Most likely you are doing something wrong.

nshmyrev avatar May 30 '23 16:05 nshmyrev

Just replace it with filled ones with my data, trying it on Windows 10 PC at the same container - same result

hrddsky avatar May 31 '23 08:05 hrddsky

Ok and why final.mdl file is missing then, it should be there

nshmyrev avatar Jun 01 '23 08:06 nshmyrev