segan_pytorch: Why doesn't my result with your SEGAN+ generator weights sound like a human voice?

Open LittleFlyingSheep opened this issue 5 years ago • 4 comments

Hello, I used your generator to clean test_noisy. However, the result is so bad that it doesn't sound like a human voice, and the duration of each output file differs from the test data. Could you please help me fix it?

Nov 21 '19 05:11 LittleFlyingSheep

I'm hitting the same problem, and the code has too many bugs.

Aug 19 '20 02:08 yellowyi9527

I'm hitting the same problem, and the code has too many bugs.

Well, I solved this problem. Check the code that turns the numpy data into a wav file. The sample rate of the test data is 48 kHz, not 16 kHz, but the output is written as 16 kHz. Best wishes.
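
A quick way to confirm the mismatch is to read the rate from the wav header before running clean.py. This is only a minimal sketch using the standard-library `wave` module; the helper names `wav_sample_rate` and `duration_ratio` are made up for illustration and are not part of segan_pytorch.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a PCM wav file's header."""
    with wave.open(path, "rb") as f:
        return f.getframerate()


def duration_ratio(true_rate, written_rate):
    """Playback-length factor when samples recorded at true_rate
    are saved with a header that claims written_rate."""
    return true_rate / written_rate


if __name__ == "__main__":
    # 48 kHz samples written with a 16 kHz header play 3x longer
    # (and lower in pitch), which matches the symptoms in this issue.
    print(duration_ratio(48000, 16000))
```

If `wav_sample_rate` reports 48000 for your test files, the output length will not match the input unless the same rate is used when writing.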

Aug 19 '20 02:08 LittleFlyingSheep

Thanks for your help with the problem that has troubled me for several days. Thank you very much. @LittleFlyingSheep

Aug 21 '20 06:08 yellowyi9527

About the sample rate: modify https://github.com/santi-pdp/segan_pytorch/blob/master/clean.py#L60-L78 so that the output is written with the input file's rate instead of a hard-coded 16000:

        if not opts.h5:
            tbname = os.path.basename(twav)
            # keep the sample rate read from the input file
            rate, wav = wavfile.read(twav)
            wav = normalize_wave_minmax(wav)
        else:
            tbname = 'tfile_{}.wav'.format(t_i)
            wav = twav
            twav = tbname
            # h5 input carries no header, so fall back to 16 kHz
            rate = 16000
        wav = pre_emphasize(wav, args.preemph)
        pwav = torch.FloatTensor(wav).view(1,1,-1)
        if opts.cuda:
            pwav = pwav.cuda()
        g_wav, g_c = segan.generate(pwav)
        out_path = os.path.join(opts.synthesis_path,
                                tbname)
        # write with the input's rate rather than a fixed 16000
        if opts.soundfile:
            sf.write(out_path, g_wav, rate)
        else:
            wavfile.write(out_path, rate, g_wav)

Or use ffmpeg to convert your wav file to 16 kHz, mono, 16-bit PCM before cleaning it:

ffmpeg -i [MP3/Wav file] -acodec pcm_s16le -ac 1 -ar 16000 [Output wav file]

Reference: https://stackoverflow.com/a/19073622
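
If you would rather resample inside Python than call ffmpeg, something along these lines should be roughly equivalent for the rate conversion. It is a sketch, not code from the repo: `resample_to` is a hypothetical helper, it assumes a mono 1-D array, and it relies on `scipy.signal.resample_poly`.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def resample_to(wav, orig_rate, target_rate=16000):
    """Resample a 1-D signal from orig_rate to target_rate (both in Hz)."""
    if orig_rate == target_rate:
        return wav
    g = gcd(orig_rate, target_rate)
    # e.g. 48000 -> 16000 gives up=1, down=3
    return resample_poly(wav, target_rate // g, orig_rate // g)


if __name__ == "__main__":
    # one second of a 440 Hz tone at 48 kHz
    t = np.arange(48000) / 48000.0
    x = np.sin(2 * np.pi * 440 * t)
    y = resample_to(x, 48000)
    print(len(x), len(y))
```

The result is a float array, so it would go in right after `wavfile.read` and before `normalize_wave_minmax`; the output should then be written at 16000.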

Jun 15 '21 09:06 unanan
