Srikanth Ronanki comments

Results: 7 comments by Srikanth Ronanki

Merlin Voice Conversion file_id_list.scp file

As an alternative, try replacing `dtw_aligner_festvox.py` with `dtw_aligner.py` at [line number 60](https://github.com/CSTR-Edinburgh/merlin/blob/master/egs/voice_conversion/s1/03_align_src_with_target.sh#L60).

Merlin Voice Conversion file_id_list.scp file

The current setup doesn't use lf0 stats for transformation of pitch. Therefore, you can ignore this error and proceed further. However, if your features are not extracted properly, you may...

Merlin Voice Conversion file_id_list.scp file

Yeah, thank you for the reminder. Anyway, I'll remove this code, as we're not using the stats for the final F0 transformation.

acoustic_comp: the frame number of data stream lf0 is not consistent with others

Make sure the number of frames in each of lf0, bap and mgc is the same. Use "x2x" from SPTK to find out the number of frames: ./tools/bin/SPTK-3.9/x2x +fa lf0/cmu_us_arctic_slt_text_01001.lf0 |...
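The same frame-count check can be sketched in Python, assuming the feature files are headerless binary files of 4-byte floats (which is what `x2x +fa` reads). The function names and the per-stream dimensions passed in are assumptions for illustration; use the dimensions from your own configuration.

```python
import os


def count_frames(path, dim, bytes_per_value=4):
    """Return the number of frames in a headerless binary feature file.

    `dim` is the number of values per frame (an assumption supplied by the
    caller, e.g. 1 for lf0); each value is assumed to be a 4-byte float.
    """
    size = os.path.getsize(path)
    frame_bytes = dim * bytes_per_value
    if size % frame_bytes != 0:
        raise ValueError(
            "%s: size %d is not a multiple of %d bytes per frame"
            % (path, size, frame_bytes)
        )
    return size // frame_bytes


def check_streams(streams):
    """Count frames per stream and report whether all counts agree.

    `streams` maps a stream name to a (path, dim) pair.
    Returns (counts, consistent).
    """
    counts = {
        name: count_frames(path, dim) for name, (path, dim) in streams.items()
    }
    consistent = len(set(counts.values())) <= 1
    return counts, consistent
```

If `consistent` comes back `False`, the `counts` dictionary shows which stream disagrees with the others.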

acoustic_comp: the frame number of data stream lf0 is not consistent with others

This is the script you should use: https://github.com/CSTR-Edinburgh/merlin/blob/master/misc/scripts/vocoder/world/extract_features_for_merlin.sh Also set the sampling frequency to either 16000 Hz or 48000 Hz to match the data you are using, as the default value is 16000 Hz: https://github.com/CSTR-Edinburgh/merlin/blob/master/misc/scripts/vocoder/world/extract_features_for_merlin.sh#L31...
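To find out which sampling frequency your recordings actually use before editing the script, a minimal sketch with Python's standard `wave` module may help. The function names are hypothetical, and it assumes plain PCM WAV input.

```python
import wave


def wav_sample_rate(path):
    """Return the sampling frequency (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def pick_sampling_frequency(path, supported=(16000, 48000)):
    """Return the file's sampling frequency if it is one of the two values
    mentioned above; otherwise raise so the mismatch is noticed early."""
    rate = wav_sample_rate(path)
    if rate not in supported:
        raise ValueError(
            "%s is sampled at %d Hz; expected one of %s"
            % (path, rate, supported)
        )
    return rate
```

Running `pick_sampling_frequency` on one file from the dataset gives the value to set in the extraction script.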

Error during step 2 of "02_prepare_labels.sh"

If you are having trouble with HTK, try this alternative:

- change `Labels=state_align` to `Labels=phone_align` in `conf/global_settings.cfg`
- run both steps of `02_prepare_labels.sh`
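The settings change can also be scripted. The sketch below is only an illustration: `use_phone_align` is a hypothetical helper, and it assumes the setting sits on its own line in the form `Labels=state_align`. Editing the file by hand works just as well.

```python
import re


def use_phone_align(config_text):
    """Return config_text with a `Labels=state_align` line switched to
    `Labels=phone_align`. All other lines are left unchanged."""
    pattern = r"^([ \t]*Labels[ \t]*=[ \t]*)state_align[ \t]*$"
    return re.sub(pattern, r"\1phone_align", config_text, flags=re.MULTILINE)


def update_settings_file(path="conf/global_settings.cfg"):
    """Rewrite the settings file in place (path taken from the comment above)."""
    with open(path, "r") as handle:
        original = handle.read()
    with open(path, "w") as handle:
        handle.write(use_phone_align(original))
```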

Variable length input sequence and output sequence

Is it possible to have `input_shape=(None, input_dimension)` and `output_shape=(None, output_dimension)`, so that we can provide variable-length input and get output of the desired length? Is a fixed-length output compulsory?