fix_length_mode="trim" can lead to confusing loss/metrics

Open jonashaag opened this issue 4 years ago • 3 comments

When using fix_length_mode="trim" in DCUNet, the signal is right-trimmed to the next possible size, and the output is then zero-padded on the right to restore the original size. A loss or metric that does not know the actual input size seen by the model can therefore be way off, since the tail of the estimate is zeros where the expected signal is not. (I don't think it hurts training though.)
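To make the effect concrete, here is a minimal pure-Python sketch, not Asteroid code. It assumes, purely for illustration, that valid sizes are multiples of a hypothetical multiple argument (DCUNet's real size constraints are different), and that the model is perfect on the part it actually sees. The names trim_and_pad and mse are made up for this example.

```python
def trim_and_pad(signal, multiple):
    """Hypothetical stand-in for the trim-then-zero-pad behaviour.

    Assumes valid lengths are multiples of `multiple` (illustration only)
    and that the "model" reproduces the trimmed input perfectly.
    """
    valid_len = len(signal) - len(signal) % multiple
    estimate = signal[:valid_len]                  # right-trim
    padding = [0.0] * (len(signal) - valid_len)    # zero-pad back to the original size
    return estimate + padding, valid_len


def mse(a, b):
    """Plain mean squared error over two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


target = [1.0] * 10
estimate, valid_len = trim_and_pad(target, multiple=4)

# The estimate is perfect on the first `valid_len` samples, yet the
# full-length error is non-zero because the zeroed tail is compared
# against the non-zero target.
full_mse = mse(estimate, target)
print(valid_len, full_mse)
```

With 10 samples and a multiple of 4, only 8 samples survive the trim, so 2 of the 10 target samples are compared against zeros even though the model made no error on what it processed.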

What to do? Maybe just add a note to the docs?

jonashaag, Jan 03 '21 11:01

A note in the docs seems good.

Also maybe exposing a function to compute the length would make sense so that the metric can be computed on the non-zero part of the signal?
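A rough sketch of what that could look like from the user's side, in pure Python. Both valid_length and metric_on_valid_part are hypothetical names invented here, not existing Asteroid functions, and the "multiple of N" rule is only an assumed stand-in for the model's real size constraint.

```python
def valid_length(n_samples, multiple):
    """Hypothetical helper: number of samples that survive the trim.

    Assumes valid lengths are multiples of `multiple` (illustration only).
    """
    return n_samples - n_samples % multiple


def metric_on_valid_part(metric, estimate, target, n_valid):
    """Evaluate `metric` only on the first `n_valid` samples."""
    return metric(estimate[:n_valid], target[:n_valid])


def mae(a, b):
    """Plain mean absolute error over two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


target = [0.5, -0.5, 0.25, -0.25, 0.5, -0.5, 0.25]
n_valid = valid_length(len(target), multiple=3)

# Simulate a perfect model whose output was trimmed, then zero-padded.
estimate = target[:n_valid] + [0.0] * (len(target) - n_valid)

naive = mae(estimate, target)                               # penalised by the zeroed tail
fair = metric_on_valid_part(mae, estimate, target, n_valid)  # ignores the padded tail
print(n_valid, naive, fair)
```

The point of exposing the length is that the caller can slice both estimate and target before computing the metric, so the zero-padded tail no longer counts as an error.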

mpariente, Jan 04 '21 09:01

I wrote a patch that adds the doc hint and the function. I realised the function may be generally useful, even for non-STFT models, to return the reconstructed size (without the pad_x_to_y reconstruction done in BaseEncoderMaskerDecoder.forward()). Do you think we should move it to BaseEncoderMaskerDecoder, and also move the docs hint there?

jonashaag, Jan 07 '21 21:01

Can you open a PR with this patch please? And we'll move the discussion there?

mpariente, Jan 08 '21 08:01