
Using Ossian with Phone-level alignment

Open RobinAlgayres opened this issue 7 years ago • 1 comments

Hi, I am struggling to use Ossian with phone-level alignment instead of state-level alignment. Switching from one to the other is easy in Merlin, but in Ossian it is not so obvious. Is there a recipe I could use for that?

RobinAlgayres, Oct 29 '18 14:10

Hi, I have a new question that could be related to yours. In the default configuration files, Ossian (via its Merlin interface) uses phone-level alignment to train the duration model and state-level alignment to train the acoustic model. When training acoustic and duration models directly in Merlin, on the other hand, the default is state-level alignment in both cases. Is there any advantage to using phone-level alignment for training the duration model? In https://github.com/CSTR-Edinburgh/merlin/issues/18, Srikanth Ronanki said "If you have state-alignments, you are clearly differentiating the events such as bursts in plosives, friction of breath in fricatives. Where as phone-alignments don't have such hard boundaries, and therefore it has to learn automatically from the data and model. Using sequence models such as RNNs/LSTMs may reduce the performance gap between models trained with state alignments and phone alignments."

tpolonijo, Apr 23 '20 12:04
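
To make the difference between the two alignment types concrete, here is a minimal Python sketch that collapses a state-level label file into phone-level segments by merging the consecutive states of each phone. It is not Ossian's or Merlin's own recipe. It assumes HTK-style label lines of the form `<start> <end> <label>[<state>]`, with times in 100 ns units and the state index appended in square brackets; the function name `states_to_phones` and the short labels in the example are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Assumption: a state-level line ends with the state index in brackets, e.g. "[2]".
STATE_SUFFIX = re.compile(r"\[(\d+)\]$")

Segment = Tuple[int, int, str]  # (start, end, label), times in 100 ns units


def states_to_phones(lines: List[str]) -> List[Segment]:
    """Merge consecutive state-level segments into one segment per phone."""
    phones: List[Segment] = []
    prev_state: Optional[int] = None
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        match = STATE_SUFFIX.search(label)
        if match is None:
            # No state suffix: treat the line as already phone-level.
            phones.append((start, end, label))
            prev_state = None
            continue
        state = int(match.group(1))
        phone = label[:match.start()]
        # A state continues the current phone if the label matches, the
        # segments are contiguous, and the state index keeps increasing.
        continues_phone = (
            bool(phones)
            and prev_state is not None
            and phones[-1][2] == phone
            and phones[-1][1] == start
            and state > prev_state
        )
        if continues_phone:
            phones[-1] = (phones[-1][0], end, phone)
        else:
            phones.append((start, end, phone))
        prev_state = state
    return phones


if __name__ == "__main__":
    example = [
        "0 50000 sil[2]",
        "50000 100000 sil[3]",
        "100000 250000 sil[4]",
        "250000 300000 sil[5]",
        "300000 400000 sil[6]",
        "400000 450000 a[2]",
        "450000 600000 a[3]",
        "600000 700000 a[4]",
        "700000 800000 a[5]",
        "800000 900000 a[6]",
    ]
    for start, end, label in states_to_phones(example):
        print(start, end, label)
```

The merged output keeps only the phone boundaries, which is the information a phone-level duration model is trained on; the per-state boundaries that the quoted comment describes (for example, the burst in a plosive) are discarded.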