audio
audio copied to clipboard
Inconsistent behavior of mu-law encode/decode
The docstrings of mu_law_encoding and mu_law_decoding say the following.
https://github.com/pytorch/audio/blob/a8fbbdac1de389cf7799b0abd0684f9b77c2f675/torchaudio/functional/functional.py#L491-L492
And at the beginning of the operation, it performs conversion for non-real-float Tensor into real float.
https://github.com/pytorch/audio/blob/a8fbbdac1de389cf7799b0abd0684f9b77c2f675/torchaudio/functional/functional.py#L502-L503
Note: is_floating_point() return True for half, float and double types, and returns False for integer types and complex types, such as cfloat and cdouble.
If an integer Tensor is passed, then it will be converted to float, but values are retained, so converting to floating point does not help. We should not perform an implicit conversion that still contradicts with the assumption described in docstring and does not align with the expected algorithm.
Suggestion
1. Change the docstring
- This algorithm assumes the signal has been scaled to between -1 and 1 and
returns a signal encoded with values from 0 to quantization_channels - 1.
+ This algorithm expects the signal has been scaled to between -1 and 1 and
returns a signal encoded with values from 0 to quantization_channels - 1.
2. Reject if the input is not real floating type
- if not x.is_floating_point():
- x = x.to(torch.float)
+ assert x.is_floating_point(), "The input Tensor must be of real float type."
Of course, with one release cycle of warning.
if not x.is_floating_point():
+ warnings.warn("The input Tensor must be of floating type. This will be an error in the XXX release.")
x = x.to(torch.float)
I am happy to look at this issue
@mthrok just a reminder that this needs to be deprecated now that we have the RC. I am pinging you to see if it's on your radar - I saw that the other one I work on is: https://github.com/pytorch/audio/issues/1883