audio-diffusion-pytorch
Add support to clip predicted samples to the desired range.
In diffusion it is common to clip samples to a desired range such as [-1, 1]. I believe previous versions of this package supported this, but the current implementation does not. It would be useful to support clipping predicted samples to a desired range.
VSampler
def forward(..., clip_denoised: bool = False, dynamic_threshold: float = 0.0) -> Tensor:
    ...
    x_pred = alphas[i] * x_noisy - betas[i] * v_pred
    # Add clipping support here
    if clip_denoised:
        x_pred = clip(x_pred, dynamic_threshold=dynamic_threshold)
    ...
I am happy to open a PR if this is acceptable.
Hey Kinyugo! Looks good to me. The only thing is that dynamic thresholding is usually applied inside the sampling loop, not only at the end, so a simple x_pred.clamp(-1, 1) is probably enough. I didn't carry dynamic thresholding over to v-diffusion since I'm not sure it would play well inside the sampling loop, as we're not only predicting the ground truth as with normal diffusion or k-diffusion.
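For illustration, the simpler per-step clamp could look something like this sketch. It is not the library's actual code: NumPy stands in for torch (np.clip plays the role of Tensor.clamp), and the function name, alpha/beta values, and inputs are made up:

```python
import numpy as np

def v_sampling_step(x_noisy, v_pred, alpha, beta, clip_denoised=True):
    # Predict the denoised sample from the velocity parameterization
    x_pred = alpha * x_noisy - beta * v_pred
    # Simple per-step clamp to the expected data range [-1, 1],
    # applied inside the sampling loop rather than only at the end
    if clip_denoised:
        x_pred = np.clip(x_pred, -1.0, 1.0)
    return x_pred

# Hypothetical values, purely for illustration
x_noisy = np.array([0.5, -1.8, 2.2])
v_pred = np.array([0.1, 0.2, -0.3])
out = v_sampling_step(x_noisy, v_pred, alpha=1.0, beta=0.5)
# Values outside [-1, 1] are clamped; in-range values pass through unchanged
```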
Hello Flavio. It makes sense not to have dynamic thresholding. Have you experimented with the effect of clipping on final sample quality?