Question about the data preparation
Hi, great work!
I find the Data Preparation part of Appendix B a little confusing. It first says that DNSMOS and forced alignment are used to filter out some data, but the last sentence says that you did not filter out any segments based on these filters.
I'm wondering whether you actually use DNSMOS and forced alignment as filters.
Best regards.
@xiami2019 Thanks for the question. We use DNSMOS and forced alignment to filter data at the file level, so a file is either kept or dropped as a whole. Within each kept file, we do not filter out any segments, in order to preserve the context information.
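To make the file-level versus segment-level distinction concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the `Segment` and `AudioFile` types, the use of a per-file mean DNSMOS score and an aligned-segment ratio, and both threshold values are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    text: str
    dnsmos: float   # hypothetical per-segment DNSMOS score
    aligned: bool   # hypothetical flag: forced alignment succeeded


@dataclass
class AudioFile:
    name: str
    segments: List[Segment]


def keep_file(f: AudioFile,
              min_mean_dnsmos: float = 3.0,      # assumed threshold
              min_aligned_ratio: float = 0.9     # assumed threshold
              ) -> bool:
    """Decide whether to keep a whole file, based on file-level statistics."""
    if not f.segments:
        return False
    n = len(f.segments)
    mean_dnsmos = sum(s.dnsmos for s in f.segments) / n
    aligned_ratio = sum(s.aligned for s in f.segments) / n
    return mean_dnsmos >= min_mean_dnsmos and aligned_ratio >= min_aligned_ratio


def filter_files(files: List[AudioFile]) -> List[AudioFile]:
    """Drop whole files only; segments inside kept files are left untouched."""
    return [f for f in files if keep_file(f)]
```

In this sketch, a kept file still contains its lower-scoring segments, which is what keeps the surrounding context intact.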
May I ask which specific model or toolkit was used for the forced alignment?
@rulerman We built a forced alignment model using the Kaldi nnet3 scripts, trained on around 10k hours of data.