torchaudio-augmentations

Shapes are still a bit confusing

Open keunwoochoi opened this issue 3 years ago • 3 comments

In ComposeMany.__call__(), is x also a 2-dim tensor of shape (ch, time)? And I'm not sure what the expected behavior of this function is, especially the shape of the output.

keunwoochoi, Jun 29 '21 18:06

Yes, it's also (ch, time). The output of self.transform(x) gets an extra leading dimension (via unsqueeze) and is appended to a list, so that the subsequent torch.cat() concatenates the entries into a 3-dimensional tensor of shape (batch, channels, time): https://github.com/Spijkervet/torchaudio-augmentations/blob/master/torchaudio_augmentations/compose.py#L41
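A minimal sketch of the shape flow described above, not the library's actual code: the helper `compose_many`, the stand-in `add_noise` transform, and the `num_augmented_samples` argument are names assumed here for illustration.

```python
import torch


def compose_many(transform, x, num_augmented_samples):
    """Apply `transform` to `x` repeatedly and stack the results (hypothetical helper)."""
    samples = []
    for _ in range(num_augmented_samples):
        # transform(x) is (ch, time); unsqueeze adds a leading dim -> (1, ch, time)
        samples.append(transform(x).unsqueeze(dim=0))
    # Concatenating along dim 0 gives (num_augmented_samples, ch, time)
    return torch.cat(samples, dim=0)


def add_noise(t):
    # Stand-in augmentation (an assumption): keeps the (ch, time) shape unchanged
    return t + 0.01 * torch.randn_like(t)


x = torch.randn(1, 16000)  # mono audio, shape (ch, time)
out = compose_many(add_noise, x, num_augmented_samples=4)
print(out.shape)  # torch.Size([4, 1, 16000])
```

In this sketch the "batch" dimension of the output is simply the number of augmented copies of the single input.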

Spijkervet, Jun 29 '21 23:06

This behavior is also tested in https://github.com/Spijkervet/torchaudio-augmentations/blob/master/tests/test_compose.py#L29

Spijkervet, Jun 29 '21 23:06

I see. How about adding it to the docstring? That would make it more easily accessible to users.
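For illustration, such a docstring might look roughly like the following. This is only a sketch: `ComposeManySketch` is a hypothetical stand-in class, its constructor arguments are assumptions, and it uses `torch.stack`, which gives the same result as unsqueezing each output and calling `torch.cat`.

```python
import torch


class ComposeManySketch:
    """Apply a transform several times to one input and stack the results.

    Hypothetical stand-in used to illustrate a shape-documenting docstring.

    Args:
        transform: callable mapping a (ch, time) tensor to a (ch, time) tensor.
        num_augmented_samples: number of augmented copies to produce.

    Shape:
        - Input: (ch, time)
        - Output: (num_augmented_samples, ch, time)
    """

    def __init__(self, transform, num_augmented_samples):
        self.transform = transform
        self.num_augmented_samples = num_augmented_samples

    def __call__(self, x):
        # Stack the augmented copies along a new leading dimension
        samples = [self.transform(x) for _ in range(self.num_augmented_samples)]
        return torch.stack(samples, dim=0)
```

With the shapes spelled out under a "Shape" heading, the input and output layout shows up in `help(ComposeManySketch)` without reading the source.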

keunwoochoi, Jun 30 '21 03:06