UNet-VocalSeparation-Chainer

Train on stereo data

Open lminer opened this issue 6 years ago • 2 comments

It would be nice to have the ability to train on stereo data. In terms of network structure, is it as simple as changing the number of input channels in this line to 2 and the number of output channels in this line to 2? Obviously, the training data patches would have to have 2 channels as well.
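To make the question concrete, here is a rough sketch of what changing the input channel count does to the first convolution's weight tensor. The filter count (16) and kernel size (4) are assumptions for illustration, not values read from this repository, and the helper names are hypothetical.

```python
def conv_weight_shape(in_channels, out_channels, ksize):
    """Weight shape of a 2-D convolution: (out, in, kernel_h, kernel_w)."""
    return (out_channels, in_channels, ksize, ksize)


def num_params(shape, bias=True):
    """Count the weights in a convolution, plus one bias per output channel."""
    out_ch, in_ch, kh, kw = shape
    return out_ch * in_ch * kh * kw + (out_ch if bias else 0)


# Assumed (hypothetical) sizes for the first encoder layer.
FIRST_FILTERS = 16
KSIZE = 4

mono_first = conv_weight_shape(1, FIRST_FILTERS, KSIZE)
stereo_first = conv_weight_shape(2, FIRST_FILTERS, KSIZE)

print("mono  :", mono_first, num_params(mono_first))
print("stereo:", stereo_first, num_params(stereo_first))
```

Under these assumptions only the first and last layers change shape; every layer in between keeps its size, since it only sees feature maps. The input patches would then need a channel axis of length 2 as well.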

lminer, Oct 15 '18 21:10

Hi, sorry for the late reply. Sure, I think it is possible to make it handle stereo audio by simply doubling this model. I am also considering dropping the downsampling step to obtain higher-resolution results, and implementing that on the demonstration website :)
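One possible reading of "doubling this model" is to run a mono model once per channel and stack the results. The sketch below assumes that reading; `separate_stereo` and `dummy_model` are hypothetical names, and the dummy stands in for a trained network that maps a magnitude spectrogram to a soft mask of the same shape.

```python
import numpy as np


def separate_stereo(mag, mono_model):
    """Apply a mono mask-predicting model to each channel of a stereo spectrogram.

    mag: array of shape (2, freq, time) holding magnitude spectrograms.
    mono_model: callable mapping a (freq, time) array to a mask of the same shape.
    Returns the masked magnitudes with shape (2, freq, time).
    """
    if mag.ndim != 3 or mag.shape[0] != 2:
        raise ValueError("expected input of shape (2, freq, time)")
    # Predict one mask per channel, then stack them back along the channel axis.
    masks = np.stack([mono_model(mag[ch]) for ch in range(2)])
    return masks * mag


def dummy_model(x):
    """Placeholder for a trained network: a constant mask of 0.5."""
    return np.full_like(x, 0.5)


stereo_mag = np.ones((2, 4, 3))
vocals = separate_stereo(stereo_mag, dummy_model)
print(vocals.shape)
```

This variant needs no change to the network itself, but it treats the two channels independently, so it cannot use any inter-channel information the way a true 2-channel model could.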

Xiao-Ming, Oct 21 '18 11:10

Nice! I have found that it's more difficult to train when the model is doubled (at least in tf.keras): the error is much higher. But I haven't experimented much yet with optimizers, learning rates, or other hyperparameters.

lminer, Oct 21 '18 21:10