Is it possible to configure Julius so that only phoneme recognition happens, without tokenizing into words?

Open j-j-kam opened this issue 3 years ago • 4 comments

I'm working on an application and need to simply recognize a stream of sounds, even if they are not tokenized into words. I expect to get a list of all recognized phonemes. Is it possible with Julius?

j-j-kam avatar Jun 20 '21 14:06 j-j-kam

I think you will need the output from Julius running in server mode, that is, using the -module option. See this help page, in particular the -module and -outcode (phone sequence) options.
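
A minimal sketch of how a client might pull the phone sequence out of one module-mode message. It assumes the message carries the phones in a PHONE="..." attribute on each WHYPO element; check the output of your own Julius build before relying on that. The sample message and the helper name phones_from_message are made up for illustration, and the socket handling is left out so the parsing step can be tried on its own.

```python
import re

# Assumption: each recognized word arrives as a WHYPO element whose PHONE
# attribute holds a space-separated phone sequence. A regex is used instead
# of an XML parser so that stray characters in other attributes do not matter.
PHONE_ATTR = re.compile(r'\bPHONE="([^"]*)"')


def phones_from_message(message: str) -> list[str]:
    """Return all phones found in the WHYPO lines of one message, in order."""
    phones = []
    for line in message.splitlines():
        if "<WHYPO" not in line:
            continue
        match = PHONE_ATTR.search(line)
        if match:
            phones.extend(match.group(1).split())
    return phones


# Hypothetical message, written by hand for this example.
SAMPLE = """<RECOGOUT>
  <SHYPO RANK="1" SCORE="-2000.0">
    <WHYPO WORD="" PHONE="sil" CM="1.000"/>
    <WHYPO WORD="T" PHONE="T" CM="0.800"/>
    <WHYPO WORD="A" PHONE="A" CM="0.700"/>
    <WHYPO WORD="" PHONE="sil" CM="1.000"/>
  </SHYPO>
</RECOGOUT>
.
"""

print(phones_from_message(SAMPLE))
```

In a real client the text would come from the socket Julius opens in module mode rather than from a string constant.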

colbec avatar Jun 20 '21 16:06 colbec

Yes, you can do it; I did it with HTK and Julius. Picture a word dictionary which contains one phoneme per word, like the vowels.

Estalhun avatar Jun 20 '21 18:06 Estalhun

@Estalhun sounds promising. thanks. can you expand just a little bit?

j-j-kam avatar Jun 20 '21 19:06 j-j-kam

Something like this:

Sample "sentences" for training (Hungarian):

<s> T A N UU S II T V AA NY O K N A K T A R G O N<4> C A K E Z E L OEE !BREATH T A R T A L O M T AA R A KK A L !BREATH T A K SZ I S O F OEE R T E H N O L OO G I J<4> A T E H N O L OO G I A J<4> I !BREATH T E L E P H E JJ E L </s>
<s> T E R V E Z OEE A SZ T A L T OO L !BREATH T U L A J D O N UU T AA FF E L UE GY E L E T T EE R II T EE S M E N<4> T E S !BREATH T II Z M I LL I OO J<4> I T !BREATH T OE B R OEE L </s>
<s> T OE R T EE N EE S E J<4> I N E K !BREATH T UU L M E N OEE E N !BREATH V A D<2> NY U G A T O N !BREATH V A D AA SZ A TT OO L !BREATH V E H E SS E N E K !BREATH V E R S E NY<1> B E N !BREATH V E V OEE K I G !BREATH V I LL A M O S M UEE V E K V I L AA G M EE R E T UEE !BREATH V AA L A SZ I D E J E !BREATH V AA L A SZ I D OEE T </s>
<s> !BREATH V AA LL A L K O Z OO J<4> I V EE TT EE K H U SZ O N E GGY E D I K !BREATH Z AA SZ L OO S H A J OO J A K EE N<4> T AA T J AA R J A AA TT OE R EE S EE L E T UU T !BREATH EE R D E KK OE R B E OE K O SZ I SZ T EE M A OE K O SZ I SZ T EE M AA B A N OE N<3> M A G AA N A K !BREATH OE N<4> T OE R V EE NY UEE !BREATH </s>
<s> UE TY F EE L SZ O L G AA L A T O S !BREATH UE GY I N<4> T EE Z EE SS E L UE T E MM E L </s>

Dictionary:
!ANIMALS	[!ANIMALS]		!ANIMALS
!BREATH		[!BREATH]		!BREATH
!COUGH		[!COUGH]		!COUGH
!HNOISE		[!HNOISE]		!HNOISE
!MUSIC		[!MUSIC]		!MUSIC
!NOISE		[!NOISE]		!NOISE
!OOO		[!OOO]			!OOO
!PHONE		[!PHONE]		!PHONE
</s>		[]		sil
<s>		[]		sil
SENT-END        []      sil
SENT-START      []    sil
A		[A]		A
AA		[AA]		AA
E		[E]		E
EE		[EE]		EE
I		[I]		I
II		[II]		II
O		[O]		O
OO		[OO]		OO
OE		[OE]		OE
OEE		[OEE]		OEE
U		[U]		U
UU		[UU]		UU
UE		[UE]		UE
UEE		[UEE]		UEE
B		[B]		B
BB		[BB]		BB
P		[P]		P
PP		[PP]		PP
D		[D]		D
DD		[DD]		DD
D<2>		[D<2>]		D<2>
D<1>		[D<1>]		D<1>
T		[T]		T
TT		[TT]		TT
GY<1>		[GY<1>]		GY<1>
GY		[GY]		GY
GGY		[GGY]		GGY
T<2>		[T<2>]		T<2>
TY		[TY]		TY
TTY		[TTY]		TTY
G		[G]		G
GG		[GG]		GG
T<4>		[T<4>]		T<4>
K		[K]		K
KK		[KK]		KK
V		[V]		V
VV		[VV]		VV
F		[F]		F
FF		[FF]		FF
Z		[Z]		Z
ZZ		[ZZ]		ZZ
SZ		[SZ]		SZ
SSZ		[SSZ]		SSZ
SZS		[SZS]		SZS
ZS		[ZS]		ZS
ZZS		[ZZS]		ZZS
S		[S]		S
SS		[SS]		SS
H		[H]		H
HH		[HH]		HH
H<1>		[H<1>]		H<1>
J<1>		[J<1>]		J<1>
J<3>		[J<3>]		J<3>
J<4>		[J<4>]		J<4>
DZ		[DZ]		DZ
DDZ		[DDZ]		DDZ
C		[C]		C
CC		[CC]		CC
C<1>		[C<1>]		C<1>
DZS		[DZS]		DZS
DDZS		[DDZS]		DDZS
T<3>		[T<3>]		T<3>
CS		[CS]		CS
CCS		[CCS]		CCS
L		[L]		L
LL		[LL]		LL
L<1>		[L<1>]		L<1>
J		[J]		J
JJ		[JJ]		JJ
R		[R]		R
RR		[RR]		RR
R<3>		[R<3>]		R<3>
RR<3>		[RR<3>]		RR<3>
M		[M]		M
MM		[MM]		MM
M<4>		[M<4>]		M<4>
M<5>		[M<5>]		M<5>
N<3>		[N<3>]		N<3>
N		[N]		N
NN		[NN]		NN
N<4>		[N<4>]		N<4>
NY<1>		[NY<1>]		NY<1>
NY<3>		[NY<3>]		NY<3>
N<5>		[N<5>]		N<5>
NY		[NY]		NY
NNY		[NNY]		NNY
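
A dictionary of this shape can be generated from a phone list instead of being typed by hand. The sketch below follows the three-column layout shown above (entry name, bracketed output symbol, pronunciation), with the sentence markers mapped to sil and an empty output symbol. The function name make_phone_dictionary and its parameters are made up for illustration, and the short phone list is only a sample.

```python
def make_phone_dictionary(phones, silence_words=("</s>", "<s>"), silence_model="sil"):
    """Build dictionary text with one entry per phone.

    Each phone becomes a "word" whose output symbol and pronunciation are
    the phone itself; sentence markers map to the silence model and print
    nothing (empty brackets).
    """
    lines = []
    for word in silence_words:
        lines.append(f"{word}\t\t[]\t\t{silence_model}")
    for phone in phones:
        lines.append(f"{phone}\t\t[{phone}]\t\t{phone}")
    return "\n".join(lines) + "\n"


# Sample usage with a few of the phones listed above.
print(make_phone_dictionary(["A", "AA", "E", "EE", "J<4>"]), end="")
```
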

But phoneme-based (monophone) recognition, without triphones and a language model (LM), gives by far the worst results.

Estalhun avatar Jun 21 '21 11:06 Estalhun