SpecAugment

Time Warp implementation

Open nhduong1203 opened this issue 1 year ago • 0 comments

Dear author, I have a question about the Time Warp implementation:

def time_warp(spec, W=5):
    spec = spec.view(1, spec.shape[0], spec.shape[1])
    num_rows = spec.shape[1]
    spec_len = spec.shape[2]

    y = num_rows // 2
    horizontal_line_at_ctr = spec[0][y]
    assert len(horizontal_line_at_ctr) == spec_len

    point_to_warp = horizontal_line_at_ctr[random.randrange(W, spec_len - W)]
    assert isinstance(point_to_warp, torch.Tensor)

    # Uniform distribution from (0,W) with chance to be up to W negative
    dist_to_warp = random.randrange(-W, W)
    src_pts, dest_pts = torch.tensor([[[y, point_to_warp]]]), torch.tensor([[[y, point_to_warp + dist_to_warp]]])
    warped_spectro, dense_flows = SparseImageWarp.sparse_image_warp(spec, src_pts, dest_pts)
    return warped_spectro.squeeze(3)

From the code above, we have:

point_to_warp = spec[0][num_rows // 2][random.randrange(W, spec_len - W)]

meaning that point_to_warp is the value of the spectrogram at the point to warp, not that point's position along the time axis.

But after that, we also have:

dest_pts = torch.tensor([[[y, point_to_warp + dist_to_warp]]])

Here we are adding a distance to a spectrogram value, which does not make sense to me. In my opinion, the code should use random.randrange(W, spec_len - W) itself (the position of the point to warp) rather than point_to_warp, the spectrogram value at that position.
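To make the suggestion concrete, here is a minimal sketch of how the source and destination points could be chosen from a position instead of a value. The helper name choose_warp_points and its rng parameter are hypothetical and are not part of the repository. The sketch only covers the point selection, it does not call SparseImageWarp.sparse_image_warp, and it assumes the second coordinate of each point is meant to be a time index.

```python
import random


def choose_warp_points(num_rows, spec_len, W=5, rng=random):
    """Pick source and destination coordinates for a time warp.

    Hypothetical helper: returns ((y, x_src), (y, x_dest)), where both
    x values are time indices rather than spectrogram values.
    """
    if spec_len <= 2 * W:
        raise ValueError("spec_len must be greater than 2 * W")

    # Row through the vertical centre of the spectrogram, as in time_warp.
    y = num_rows // 2

    # Position (time index) of the point to warp, NOT the value stored there.
    x_src = rng.randrange(W, spec_len - W)

    # Integer shift in [-W, W); randrange excludes the upper bound.
    dist_to_warp = rng.randrange(-W, W)

    return (y, x_src), (y, x_src + dist_to_warp)


if __name__ == "__main__":
    src, dest = choose_warp_points(num_rows=80, spec_len=100, W=5)
    print("source point:", src)
    print("destination point:", dest)
```

Inside time_warp, the two tuples would presumably be wrapped the same way as in the current code, e.g. torch.tensor([[[y, x_src]]]) and torch.tensor([[[y, x_src + dist_to_warp]]]). Whether sparse_image_warp needs a particular dtype or coordinate order for these tensors is something I have not checked.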

Sorry if I have misunderstood anything. Thank you.

nhduong1203, Sep 09 '24 18:09