Adding magphase to Merlin configuration.py, output dims?

Open dreamk73 opened this issue 8 years ago • 7 comments

In the script that extracts features for magphase, it says typically it extracts 60 mag, 45 real, and 45 imag features. I am using 48kHz audio, just like in the script. So are those numbers correct then? I wonder if there are delta or delta-delta features extracted as well? What should I put in configuration.py as the output dimension for these features?

dreamk73 avatar Oct 27 '17 09:10 dreamk73

Hi, you need to add the deltas and delta-deltas for each feature. So:

lf0: 1, dlf0: 3
mag: 60, dmag: 180
real: 45, dreal: 135
imag: 45, dimag: 135
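For reference, the arithmetic behind those numbers can be sketched as follows. This is only an illustration, assuming the usual three windows (static, delta, delta-delta) per stream; the names `STATIC_DIMS` and `with_deltas` are made up for this example and do not come from Merlin or MagPhase.

```python
# Static dimensions quoted in this thread for 48 kHz audio.
STATIC_DIMS = {"lf0": 1, "mag": 60, "real": 45, "imag": 45}

# Assumption: each stream is stacked as static + delta + delta-delta.
N_WINDOWS = 3


def with_deltas(static_dim, n_windows=N_WINDOWS):
    """Return the stream dimension once deltas and delta-deltas are appended."""
    return static_dim * n_windows


# Dimension of each stream including its dynamic features.
delta_dims = {name: with_deltas(dim) for name, dim in STATIC_DIMS.items()}

# Sum over the four streams listed above (no other streams are counted).
total_dim = sum(delta_dims.values())

if __name__ == "__main__":
    for name, dim in STATIC_DIMS.items():
        print("{0}: {1}, d{0}: {2}".format(name, dim, delta_dims[name]))
    print("sum over these streams: {}".format(total_dim))
```

The sum only covers the four streams named here; any additional stream your setup uses would need to be added separately.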

By the way, the complete MagPhase-Merlin integration is coming soon. For now, you can follow the instructions in the MagPhase repo, and that should work.

felipeespic avatar Oct 29 '17 01:10 felipeespic

Thanks. I appreciate you integrating it with Merlin. But I would like to work on trying it out now, if possible. I have worked on integrating other vocoders before, and it is doable.

Thanks to your info, I am now able to train acoustic models. But when I try to generate the waveforms with the script provided in demos/demo_run_for_merlin, I get this error:

ValueError: Dimension provided not compatible with file size.

I use 48000 Hz data, and the dimensions of all the features are set as you said. Could it be the framelength feature? Or something else?

dreamk73 avatar Oct 30 '17 10:10 dreamk73

Hi, which script and line is throwing the error?

felipeespic avatar Nov 02 '17 17:11 felipeespic

When trying to synthesize by running 2_batch_wave_generation.py. Here is the full trace (it was interleaved with the progress output, which is listed after it):

Traceback (most recent call last):
  File "2_batch_wave_generation.py", line 70, in <module>
    lu.run_multithreaded(synthesis, in_feats_dir, l_file_tokns, out_syn_dir, nbins_mel, nbins_phase, mvf, fs, fft_len, b_postfilter)
  File "/home/esther/merlin/tools/bin/magphase/src/libutils.py", line 61, in run_multithreaded
    results = pool.map(func_wrapper, l_iterable_args)
  File "/usr/local/anaconda/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/local/anaconda/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
ValueError: Dimension provided not compatible with file size.

Generating wavefile: sn008_sent152................................
Generating wavefile: sn008_sent156................................
Generating wavefile: sn008_sent154................................
Generating wavefile: sn008_sent158................................
Generating wavefile: sn008_sent160................................
Generating wavefile: sn008_sent162................................
Generating wavefile: sn008_sent164................................
Generating wavefile: sn008_sent166................................
Generating wavefile: sn008_sent168................................
Generating wavefile: sn008_sent170................................

dreamk73 avatar Nov 03 '17 08:11 dreamk73

It seems that the files that you generated (.mag, .real, or .imag) have the wrong dimension. If you do not know how to check that, you can send me some samples so I can check them.
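One rough way to check this yourself is to test whether the file size is a whole multiple of the expected dimension. The sketch below assumes the feature files are headerless binary files of 32-bit floats (4 bytes per value); the function name `frames_in_file` is hypothetical and is not part of MagPhase or Merlin.

```python
import os


def frames_in_file(path, dim, bytes_per_value=4):
    """Return the number of frames in a raw feature file, or None on mismatch.

    Assumes a headerless file of fixed-size values (float32 by default).
    None means the file size is not a whole multiple of `dim` values, which
    is the kind of mismatch that "Dimension provided not compatible with
    file size" points to.
    """
    size_bytes = os.path.getsize(path)
    n_values, leftover_bytes = divmod(size_bytes, bytes_per_value)
    if leftover_bytes:
        return None  # not even a whole number of values
    n_frames, leftover_values = divmod(n_values, dim)
    if leftover_values:
        return None  # values do not split evenly into frames of size `dim`
    return n_frames
```

For example, calling `frames_in_file(path_to_mag_file, 60)` on a generated .mag file should return a frame count if the file really holds 60 values per frame. A result of `None`, or frame counts that differ between the .mag, .real, and .imag files of the same sentence, would suggest the wrong dimension was written. Note that a file can pass this check by coincidence, so it is a sanity check rather than proof.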

felipeespic avatar Nov 03 '17 14:11 felipeespic

Hi, when I run the code with my own data (child voice data), I always get the warning "magphase.py:352: RuntimeWarning: invalid value encountered in divide". Is there anything I did wrong?

KnowBetterHelps avatar Nov 13 '17 09:11 KnowBetterHelps

Hi @hyuezhi,

Just to let you know, if you want to post a new issue or question on GitHub, you need to do it by creating a "New Issue" (button in the top right of the page), not by commenting on an issue that is unrelated to yours :) I created the "new issue" for you.

felipeespic avatar Nov 14 '17 23:11 felipeespic