tensorflow-wavenet

Why quantize the signal with the formula (signal + 1) / 2 * mu + 0.5?

Open littleTwelve opened this issue 7 years ago • 3 comments

In 'tensorflow-wavenet/wavenet/ops.py', the last line of the function 'mu_law_encode' returns tf.to_int32((signal + 1) / 2 * mu + 0.5). Why is this formula used to quantize the signal?

littleTwelve avatar Dec 11 '17 05:12 littleTwelve

The signal is first companded with the mu-law (https://en.wikipedia.org/wiki/Μ-law_algorithm), which leaves it in the range -1 to 1. The last line maps that range to integers between 0 and mu = quantization_channels - 1, i.e. 0 to 255 for 256 quantization channels.

def mu_law_encode(audio, quantization_channels):
    '''Quantizes waveform amplitudes.'''
    with tf.name_scope('encode'):
        mu = tf.to_float(quantization_channels - 1)
        # Perform mu-law companding transformation (ITU-T, 1988).
        # Minimum operation is here to deal with rare large amplitudes caused
        # by resampling.
        safe_audio_abs = tf.minimum(tf.abs(audio), 1.0)
        magnitude = tf.log1p(mu * safe_audio_abs) / tf.log1p(mu)
        signal = tf.sign(audio) * magnitude
        # Quantize signal to the specified number of levels.
        return tf.to_int32((signal + 1) / 2 * mu + 0.5)
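
To check the output range numerically, here is a minimal NumPy sketch of the same computation. The name mu_law_encode_np and the sample inputs are made up for illustration and are not part of the repository; astype(np.int32) stands in for tf.to_int32, since both truncate toward zero.

```python
import numpy as np

def mu_law_encode_np(audio, quantization_channels=256):
    """Hypothetical NumPy mirror of mu_law_encode, for inspection only."""
    mu = float(quantization_channels - 1)
    # Clip rare amplitudes above 1.0, as the TensorFlow version does.
    safe_audio_abs = np.minimum(np.abs(audio), 1.0)
    magnitude = np.log1p(mu * safe_audio_abs) / np.log1p(mu)
    signal = np.sign(audio) * magnitude  # companded signal in [-1, 1]
    # Shift to [0, 1], scale to [0, mu], add 0.5, then truncate.
    return ((signal + 1) / 2 * mu + 0.5).astype(np.int32)

audio = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
codes = mu_law_encode_np(audio)
print(codes)  # endpoints map to 0 and 255, silence (0.0) maps to 128
```

With 256 channels the codes therefore cover 0 to 255, which is the index range a 256-way softmax output expects.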

ljuvela avatar Dec 11 '17 06:12 ljuvela

@ljuvela Thank you!

littleTwelve avatar Dec 11 '17 07:12 littleTwelve

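I was confused by the same statement at first.

As I understand it now: (signal + 1) / 2 maps the range [-1, 1] to [0, 1], multiplying by mu scales that to [0, mu], and tf.to_int32(x + 0.5) mimics rounding to the nearest integer, because the cast truncates and x is never negative here.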
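
A small NumPy sketch of the rounding trick, assuming astype(np.int32) behaves like tf.to_int32 (truncation toward zero). The sample values are arbitrary.

```python
import numpy as np

# Non-negative values, as produced by (signal + 1) / 2 * mu.
x = np.array([0.0, 0.4, 0.5, 1.49, 1.5, 254.6, 255.0])

# Add 0.5 and truncate: the trick used in mu_law_encode.
truncated = (x + 0.5).astype(np.int32)

# Explicit round-half-up for comparison.
half_up = np.floor(x + 0.5).astype(np.int32)

print(truncated)
print(half_up)
```

The two results agree for every non-negative input. They would differ for negative inputs (truncation moves toward zero, floor moves down), which is why the shift to a non-negative range comes first. Note also that np.round rounds halves to the nearest even integer, so it gives 0 for 0.5, while the truncation trick gives 1.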

loveychen avatar Jul 05 '18 06:07 loveychen