julius
How do I get a silent phoneme in the second pass?
I'm running speech recognition with a grammar (automaton), but the silent phoneme (sp) never shows up in the final result. It appears in the first pass, but not in the second pass.
I used the following grammar, vocabulary, audio file, and command:
# hoge.grammar
S : NS_B FRUIT_N NOISE PLEASE NS_E
FRUIT_N : FRUIT
PLEASE : DESU
# hoge.voca
% FRUIT
蜜柑 m i k a N
リンゴ r i N g o
ぶどう b u d o:
% DESU
です d e s u
% NS_B
<s> silB
% NS_E
</s> silE
% NOISE
<sp> sp
# wave file
curl https://raw.githubusercontent.com/Hiroshiba/julius/26c00a15f8391f3d7ef804203903223d2021a65f/input-16000.wav > input-16000.wav
# command
mkdfa.pl hoge
echo input-16000.wav | \
julius \
-input file \
-h segmentation-kit/models/hmmdefs_monof_mix16_gid.binhmm \
-gram hoge \
-n 10
and got this result:
...
### Recognition: 1st pass (LR beam)
pass1_best: <s> リンゴ <sp> です </s>
pass1_best_wordseq: 2 0 4 1 3
pass1_best_phonemeseq: silB | r i N g o | sp | d e s u | silE
pass1_best_score: -5970.041504
### Recognition: 2nd pass (RL heuristic best-first)
WARNING: 00 _default: hypothesis stack exhausted, terminate search now
STAT: 00 _default: 2 sentences have been found
STAT: 00 _default: 8 generated, 8 pushed, 10 nodes popped in 268
sentence1: <s> リンゴ です </s>
wseq1: 2 0 1 3
phseq1: silB | r i N g o | d e s u | silE
cmscore1: 1.000 0.973 1.000 1.000
score1: -5970.035156
...
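To make the difference between the two passes explicit, here is a minimal Python sketch that compares the two word-id lines from the log above. The helper names `word_ids` and `dropped` are hypothetical, and the sketch only reads the two lines copied from my output; it is not part of Julius.

```python
# Two lines copied verbatim from the Julius output above.
LOG = """\
pass1_best_wordseq: 2 0 4 1 3
wseq1: 2 0 1 3
"""

def word_ids(log, key):
    """Return the ids on the first line that starts with `key:` (hypothetical helper)."""
    for line in log.splitlines():
        if line.startswith(key + ":"):
            return [int(tok) for tok in line.split(":", 1)[1].split()]
    raise KeyError(key)

def dropped(first, second):
    """Ids of `first` that are skipped when matching `second` in order."""
    missing, j = [], 0
    for w in first:
        if j < len(second) and second[j] == w:
            j += 1          # same id in both passes, advance in `second`
        else:
            missing.append(w)  # id only present in the first pass
    return missing

pass1 = word_ids(LOG, "pass1_best_wordseq")
pass2 = word_ids(LOG, "wseq1")
print(dropped(pass1, pass2))  # [4]
```

The only id missing from the second pass is 4, which is the id of `<sp>` in hoge.dict.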
These are the generated dfa file and dict file:
# hoge.dfa
0 3 1 0 0
1 1 2 0 0
2 4 3 0 0
3 0 4 0 0
4 2 5 0 0
5 -1 -1 1 0
# hoge.dict
0 [蜜柑] m i k a N
0 [リンゴ] r i N g o
0 [ぶどう] b u d o:
1 [です] d e s u
2 [<s>] silB
3 [</s>] silE
4 [<sp>] sp
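As a sanity check on the grammar itself, the following Python sketch walks hoge.dfa and prints the category sequence. It assumes each line is `state category next_state accept_flag ...` and that the category ids match the first column of hoge.dict; both are my reading of the files above, and the function name `walk` is hypothetical. It also assumes one outgoing transition per state, which holds for this small grammar.

```python
# Contents of hoge.dfa as listed above.
DFA = """\
0 3 1 0 0
1 1 2 0 0
2 4 3 0 0
3 0 4 0 0
4 2 5 0 0
5 -1 -1 1 0
"""

# Category ids taken from hoge.dict, names taken from hoge.voca.
CATEGORY = {0: "FRUIT", 1: "DESU", 2: "NS_B", 3: "NS_E", 4: "NOISE"}

def walk(dfa_text, start=0):
    """Follow the single transition out of each state and collect category ids."""
    trans = {}
    for line in dfa_text.splitlines():
        state, cat, nxt = (int(x) for x in line.split()[:3])
        if cat >= 0:  # a line with category -1 has no outgoing transition
            trans[state] = (cat, nxt)
    path, state = [], start
    while state in trans:
        cat, state = trans[state]
        path.append(cat)
    return path

path = walk(DFA)
print([CATEGORY[c] for c in path])
# ['NS_E', 'DESU', 'NOISE', 'FRUIT', 'NS_B']
```

Under these assumptions the file lists the sentence from its end to its start, and NOISE (id 4) lies on the only path through the grammar, so the grammar itself does not make `<sp>` optional.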
I expected that a positive -transp value would make sp more likely to appear, but sp still did not appear.
How do I get a silent phoneme in the second pass?