vq-vae-2-pytorch

Cannot reconstruct when using mel spectrograms as data

Open • LiNaihan opened this issue 5 years ago • 1 comment

I used the code and settings as they are. The only change is that I use conv1d to process mel spectrograms, which can be treated as 1-D data with 80 channels. However, the reconstruction is quite poor and training converges to a large loss. Do you have any guess at the reason, or any suggestions for debugging? Thanks a lot!

LiNaihan, Nov 22 '19 13:11

Sorry for the late reply. I haven't tried this model in the audio domain, but I suspect that data normalization and preprocessing are crucial for log mel spectrograms, as this model doesn't have any particular normalization layers.

rosinality, Dec 10 '19 01:12
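
For anyone debugging the same problem, here is a minimal sketch of one common way to normalize log mel spectrograms before feeding them to the encoder: per-mel-bin standardization using statistics computed over the training set. This is not taken from the repository. The function names (`fit_channel_stats`, `normalize`, `denormalize`) are hypothetical, the random arrays are only a stand-in for real data, and whether per-bin or global statistics work better here is an open assumption.

```python
import numpy as np

def fit_channel_stats(log_mels):
    """Compute per-mel-bin mean and std over every frame of the training set.

    log_mels: list of arrays shaped (n_mels, n_frames); n_frames may differ.
    Returns two arrays shaped (n_mels, 1) that broadcast over the time axis.
    """
    frames = np.concatenate(log_mels, axis=1)  # (n_mels, total_frames)
    mean = frames.mean(axis=1, keepdims=True)
    std = frames.std(axis=1, keepdims=True)
    return mean, std

def normalize(log_mel, mean, std, eps=1e-5):
    """Standardize each mel bin to roughly zero mean and unit variance."""
    return (log_mel - mean) / (std + eps)

def denormalize(x, mean, std, eps=1e-5):
    """Invert normalize(), e.g. before passing reconstructions to a vocoder."""
    return x * (std + eps) + mean

# Synthetic stand-in for a training set of 80-bin log mel spectrograms
# (an assumption for illustration only).
rng = np.random.default_rng(0)
train = [rng.normal(loc=-5.0, scale=2.0, size=(80, t)) for t in (120, 200, 95)]

mean, std = fit_channel_stats(train)
normalized = [normalize(m, mean, std) for m in train]
```

The statistics should be computed once on the training set and reused for validation data and for inverting the reconstructions. Each normalized array keeps the `(n_mels, n_frames)` layout, so it can be batched as `(batch, 80, n_frames)` for conv1d.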