tensorflow-wavenet
Why is the signal quantized with the formula (signal + 1) / 2 * mu + 0.5?
In 'tensorflow-wavenet/wavenet/ops.py', the last line of the function 'mu_law_encode' returns (signal + 1) / 2 * mu + 0.5. Why is this formula used to quantize the signal?
The signal is first companded with the mu-law (https://en.wikipedia.org/wiki/Μ-law_algorithm) and will be in the range -1 to 1. The last line maps that range to integers between 0 and mu, i.e. 0 to 255 when quantization_channels is 256.
def mu_law_encode(audio, quantization_channels):
    '''Quantizes waveform amplitudes.'''
    with tf.name_scope('encode'):
        mu = tf.to_float(quantization_channels - 1)
        # Perform mu-law companding transformation (ITU-T, 1988).
        # Minimum operation is here to deal with rare large amplitudes caused
        # by resampling.
        safe_audio_abs = tf.minimum(tf.abs(audio), 1.0)
        magnitude = tf.log1p(mu * safe_audio_abs) / tf.log1p(mu)
        signal = tf.sign(audio) * magnitude
        # Quantize signal to the specified number of levels.
        return tf.to_int32((signal + 1) / 2 * mu + 0.5)
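To check the output range by hand, here is a minimal NumPy sketch of the same computation. It is not the repository code: the name mu_law_encode_np and the default of 256 channels are assumptions made for illustration, and it assumes the integer cast truncates toward zero as it does in NumPy.

```python
import numpy as np


def mu_law_encode_np(audio, quantization_channels=256):
    """NumPy sketch of the encoder above (hypothetical helper, not from the repo)."""
    mu = float(quantization_channels - 1)
    # Clip rare amplitudes above 1.0, as the original does.
    safe_audio_abs = np.minimum(np.abs(audio), 1.0)
    # Mu-law companding: result lies in [-1, 1].
    magnitude = np.log1p(mu * safe_audio_abs) / np.log1p(mu)
    signal = np.sign(audio) * magnitude
    # Shift to [0, 1], scale to [0, mu], add 0.5, then truncate to an integer.
    return ((signal + 1) / 2 * mu + 0.5).astype(np.int32)


audio = np.array([-1.0, 0.0, 1.0])
codes = mu_law_encode_np(audio)
print(codes)  # [  0 128 255]
```

The extreme inputs -1 and 1 land on the lowest and highest of the 256 levels, and silence (0.0) maps to 128.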
@ljuvela Thank you!
I was confused by the same statement at first.
As I understand it now:
(signal + 1) / 2 maps the range [-1, 1] to [0, 1], multiplying by mu scales it to [0, mu], and tf.to_int32(x + 0.5) mimics the round operation, because the cast truncates and x is never negative here.
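A small NumPy sketch of that rounding trick, assuming the cast truncates toward zero (as NumPy's astype does). The sample values are made up for illustration.

```python
import numpy as np

# Non-negative inputs, as produced by (signal + 1) / 2 * mu.
x = np.array([0.0, 0.4, 0.5, 1.49, 254.5, 255.0])

# Add 0.5, then truncate: the pattern used in mu_law_encode.
truncated = (x + 0.5).astype(np.int32)

# Round-half-up written explicitly with floor, for comparison.
half_up = np.floor(x + 0.5).astype(np.int32)

print(truncated)  # [  0   0   1   1 255 255]

# For a negative input the trick breaks: truncation goes toward zero.
negative = (np.array([-0.7]) + 0.5).astype(np.int32)
print(negative)  # [0], whereas rounding -0.7 would give -1
```

The two results agree only because the input is non-negative. Ties also round up here, unlike np.round, which rounds halves to the nearest even integer.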