
[Improvement] Increased sample rate to 44100 and added the ability to process entire files.

Open CyberLykan opened this issue 2 years ago • 2 comments

I managed to improve DeepAFx-ST. Here's what I did.

Download the zip from https://github.com/adobe-research/DeepAFx-ST and extract it.

Open Notepad++, press CTRL+SHIFT+F (Find in Files), set "Find what" to 24000 and "Replace with" to 44100, set the directory to the extracted folder, and click Replace in Files.
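For anyone without Notepad++, the same bulk replacement can be sketched in Python. This is only a rough equivalent under assumptions: the function name replace_in_files and the list of file suffixes are hypothetical choices, not part of the repository, and a blind text replacement will also touch any longer number that happens to contain 24000.

```python
from pathlib import Path


def replace_in_files(root, old="24000", new="44100",
                     suffixes=(".py", ".yaml", ".yml", ".sh")):
    """Replace `old` with `new` in every matching text file under `root`.

    The suffix filter is an assumption; widen it if the value also
    appears in other file types. Returns the list of changed files.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip anything that is not plain UTF-8 text
        if old in text:
            path.write_text(text.replace(old, new), encoding="utf-8")
            changed.append(path)
    return changed
```

Calling replace_in_files on the extracted folder returns the files it rewrote, which makes it easy to review what changed before adding the checkpoints.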

At this point you can safely add the checkpoints and examples.

Edit scripts/process.py:

Replace x_44100 = torch.tensor(resampy.resample(x.view(-1).numpy(), x_sr, 44100))
with x_44100 = torch.tensor(resampy.resample(x.reshape(-1).numpy(), x_sr, 44100))

Under x_44100 = x_44100.view(1, -1)
insert x_44100 = x_44100[0:1, : x_44100.shape[-1] // 2]

Under x_44100 = x
insert x_44100 = x_44100[0:1, : x_44100.shape[-1]]

Replace r_44100 = torch.tensor(resampy.resample(r.view(-1).numpy(), r_sr, 44100))
with r_44100 = torch.tensor(resampy.resample(r.reshape(-1).numpy(), r_sr, 44100))

Under r_44100 = r_44100.view(1, -1)
insert r_44100 = r_44100[0:1, : r_44100.shape[-1] // 2]

Under r_44100 = r
insert r_44100 = r_44100[0:1, : r_44100.shape[-1]]

Remove x_44100 = x_44100[0:1, : 44100 * 5]

Remove r_44100 = r_44100[0:1, : 44100 * 5]

Replace filename = os.path.basename(args.input).replace(".wav", "")
with filename = os.path.splitext(os.path.basename(args.input))[0]

Remove reference = os.path.basename(args.reference).replace(".wav", "")

Replace out_filepath = os.path.join(dirname, f"{filename}_out_ref={reference}.wav")
with out_filepath = os.path.join(dirname, f"{filename}_DeepAFx-ST.wav")

Remove in_filepath = os.path.join(dirname, f"{filename}_in.wav")

Remove torchaudio.save(in_filepath, x_44100.cpu().view(1, -1), 44100)
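A small sketch of why os.path.splitext is the safer choice for the filename: .replace(".wav", "") only strips a lowercase .wav, while splitext removes whatever extension is present. The helper name output_path is hypothetical, and taking dirname from the input path is an assumption about how the script builds its output location.

```python
import os


def output_path(input_path, suffix="_DeepAFx-ST"):
    """Build the output path next to the input file (assumed location)."""
    dirname = os.path.dirname(input_path)
    # splitext drops any extension (.wav, .WAV, .flac, ...), not just ".wav"
    stem = os.path.splitext(os.path.basename(input_path))[0]
    return os.path.join(dirname, f"{stem}{suffix}.wav")


# The old approach leaves other extensions in the name:
old_style = os.path.basename("song.flac").replace(".wav", "")  # "song.flac"
new_style = os.path.splitext(os.path.basename("song.flac"))[0]  # "song"
```

With this, an input named song.flac produces song_DeepAFx-ST.wav rather than song.flac_DeepAFx-ST.wav.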

You should be good to go!

This approach may have broken some things not related to processing.

CyberLykan avatar Aug 01 '22 01:08 CyberLykan

#24 #22

CyberLykan avatar Aug 01 '22 01:08 CyberLykan

How can I get the parameters of EQ and Compressor?

selyu504 avatar Aug 12 '22 08:08 selyu504

How can I get the parameters of EQ and Compressor?

Please use the other issue you made for discussion. Your comment does not fit here.

CyberLykan avatar Aug 16 '22 04:08 CyberLykan

If your results are getting cut in half or doubled, try removing or adding the // 2 in both slicing lines (the x_44100 and r_44100 lines inserted under .view(1, -1)).
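A likely reason, sketched with NumPy as a stand-in for the torch tensors (reshape(-1) flattens in the same row-major order in both): flattening a [channels, samples] array puts channel 0 first and channel 1 after it, so keeping the first half recovers channel 0 for a stereo file but truncates a mono file. The function names below are hypothetical, the resampling step is left out, and selecting channel 0 by index is an alternative I am assuming would work, not something tested against the repository.

```python
import numpy as np


def first_channel_by_halving(audio):
    """Mimic the edit: flatten [channels, samples], then keep the first half."""
    flat = audio.reshape(-1)   # row-major: channel 0 samples, then channel 1
    row = flat.reshape(1, -1)  # back to a single row, like .view(1, -1)
    return row[0:1, : row.shape[-1] // 2]


def first_channel_by_index(audio):
    """Alternative: pick channel 0 directly, for any channel count."""
    return audio[0:1, :]


stereo = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])  # 2 channels, 4 samples
mono = np.array([[1, 2, 3, 4]])                  # 1 channel, 4 samples

halved_stereo = first_channel_by_halving(stereo)  # channel 0, full length
halved_mono = first_channel_by_halving(mono)      # only the first 2 samples
```

So // 2 fits stereo inputs and should be removed for mono inputs; without it, a stereo file comes out with both channels played back to back, which matches the "doubled" result.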

CyberLykan avatar Sep 01 '22 03:09 CyberLykan

Seems like there are still a lot of issues with this approach. :/

CyberLykan avatar Jan 09 '23 01:01 CyberLykan

The LibriTTS dataset is only at 24 kHz, so you would need to find a new dataset to re-train at 44.1 kHz.

kelseyjd avatar Jun 26 '23 22:06 kelseyjd