ubisoft-laforge-daft-exprt

Words in .lab and .markers files do not match for Hindi

Open shrishtis7 opened this issue 2 years ago • 0 comments

I have defined the Hindi letters (as Unicode characters) and the phone set in the symbols.py file. While the marker files are being updated, I get the warning "Correspondance issue between words in the .lab sentence and those in .markers file". I suspect this warning is the reason the later steps of feature extraction fail with errors. Some characters, the dependent vowels of Hindi, are not correctly recognised in the .lab file. I have attached my symbols.py file. I need this to work for Hindi and I am not sure what I am missing. What changes are needed so that I can get past this error and move on to the training step? symbols.txt Error_pre_process extract_features.txt
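To narrow down which characters are affected, it may help to list every character of a sentence that is absent from the symbol set. The sketch below is not part of Daft-Exprt: `missing_symbols` and the example symbol set are hypothetical, and only the standard `unicodedata` module is used. It shows that dependent vowels are separate code points in the combining-mark categories (`Mn` or `Mc`), so they need their own entries in the symbol list.

```python
import unicodedata


def missing_symbols(text, symbols):
    """Map each character of `text` that is absent from `symbols`
    to its (Unicode name, general category)."""
    missing = {}
    # NFC keeps the comparison stable if the same letter was typed in
    # precomposed or decomposed form (e.g. letters carrying a nukta).
    for ch in unicodedata.normalize("NFC", text):
        if ch.isspace() or ch in symbols:
            continue
        missing[ch] = (unicodedata.name(ch, "UNKNOWN"), unicodedata.category(ch))
    return missing


# Hypothetical symbol set that contains only the base consonant MA (U+092E).
example_symbols = {"\u092e"}

# The word "में" is three code points: MA + VOWEL SIGN E + ANUSVARA.
word = "\u092e\u0947\u0902"

for ch, (name, category) in missing_symbols(word, example_symbols).items():
    print(f"U+{ord(ch):04X} {name} ({category})")
```

Running the same check over every .lab sentence with the real symbol list would show whether the dependent vowels are missing from the symbols or are being dropped somewhere else.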

Below are samples of the .lab and .markers files I obtained, respectively.
lab file [rec-13]: उन्नीस,सौ,अड़तीस,में,नाज़ी,जर्मनी,ने,ऑस्ट्रिया,पर,कब्ज़ा,कर,लिया,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
marker file [rec-13]: marker.txt
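One way to locate where the two files diverge is to compare their word sequences directly. The following is a minimal sketch under stated assumptions: `lab_words` and `first_mismatch` are hypothetical helpers, not Daft-Exprt functions, the .lab line is assumed to be comma-separated as in the sample above, and the marker words are passed in as a plain list because the .markers layout is not shown here.

```python
import unicodedata
from itertools import zip_longest


def lab_words(lab_line, sep=","):
    """Split a .lab line into words, dropping the empty tokens
    produced by runs of trailing separators."""
    line = unicodedata.normalize("NFC", lab_line)
    return [w.strip() for w in line.split(sep) if w.strip()]


def first_mismatch(lab, marker):
    """Return (index, lab_word, marker_word) for the first position
    where the two sequences differ, or None if they are identical."""
    for i, (a, b) in enumerate(zip_longest(lab, marker)):
        if a != b:
            return i, a, b
    return None


# Hypothetical data: the second marker word has lost its vowel sign.
lab = lab_words("\u092e\u0947\u0902,\u092a\u0930,,,")
marker = ["\u092e\u0947\u0902", "\u092a"]
print(first_mismatch(lab, marker))
```

If the first mismatch always falls on a word containing a dependent vowel, that points to the tokenisation or symbol handling rather than to the alignment itself.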

shrishtis7, Jul 21 '23 09:07