
Full-integer quantization and kapre layers

eppane opened this issue 2 years ago · 3 comments

I am training a model that includes the mel-spectrogram block from get_melspectrogram_layer() right after the input layer. Training goes well, and I am able to swap the mel-spec layers for their TFLite counterparts (STFTTflite, MagnitudeTflite) afterwards. I have also checked that the model performs as well as before.

The model also performs as expected when converted to .tflite using dynamic range quantization. However, with full-integer quantization, the model loses its accuracy (see https://www.tensorflow.org/lite/performance/post_training_quantization#integer_only).

I suppose the mel-spectrogram starts to differ significantly because, in full-integer quantization, the input values are projected onto a new range (int8). Is there any way to make this work with full-integer quantization?

I guess I need to separate the mel-spec layer from the model as a preprocessing step in order to succeed with full-integer quantization, i.e., apply the input quantization to the output values of the mel-spec layer. But then I would have to deploy two models to the edge device, where the input goes first into the mel-spec block and then into the rest of the model (?).
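
For illustration, such a split could be sketched roughly as follows (the layer indices here are assumptions about this particular architecture, not tested code):

import tensorflow as tf

# Assumed: model.layers[1] is the mel-spec block sitting right after the input
melspec_block = tf.keras.Model(model.input, model.layers[1].output)
classifier = tf.keras.Model(model.layers[2].input, model.output)

# Each part would then be converted to .tflite separately; on the device,
# the mel-spec block's output is fed into the quantized classifier.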

I am using TensorFlow 2.7.0 and kapre 0.3.7.

Here is my code for testing the .tflite model:

import numpy as np
import tensorflow as tf

# Interpreter setup (the model path is assumed)
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
input_index = input_details["index"]
output_index = output_details["index"]

preds = []
# Test and evaluate the TFLite-converted model on unseen test data
for sample in X_test_full_scaled:
    X = sample

    # Quantize the float input into the model's int8 input domain
    if input_details["dtype"] == np.int8:
        input_scale, input_zero_point = input_details["quantization"]
        X = np.round(sample / input_scale) + input_zero_point  # round before the int8 cast truncates

    X = X.reshape((1, 8000, 1)).astype(input_details["dtype"])

    interpreter.set_tensor(input_index, X)
    interpreter.invoke()
    pred = interpreter.get_tensor(output_index)

    # Dequantize the int8 output back to real-valued scores
    output_scale, output_zero_point = output_details["quantization"]
    if output_details["dtype"] == np.int8:
        pred = (pred.astype(np.float32) - output_zero_point) * output_scale

    pred = np.argmax(pred, axis=1)[0]
    preds.append(pred)

preds = np.array(preds)

eppane · Feb 15 '22 17:02

Hi, first of all, I don’t know. I’ll guess a bit.

A direct/automatic application of full-integer quantization can be dangerous, since the dynamic range of the melspectrogram magnitude (before decibel scaling) is extremely skewed. In other words, the distribution is exponential, while full-integer quantization would be (I think) rather linear.
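
As a toy illustration of that skew (the lognormal draw below is just a stand-in for magnitude values, not real spectrogram data):

import numpy as np

# Heavy-tailed stand-in for melspectrogram magnitudes (sigma chosen arbitrarily)
mag = np.random.lognormal(mean=0.0, sigma=3.0, size=100_000).astype(np.float32)

# Linear (affine) int8 quantization over the full observed range
scale = mag.max() / 255.0
zero_point = -128
q = np.clip(np.round(mag / scale) + zero_point, -128, 127).astype(np.int8)

# Nearly all values land in the lowest bin: the linear grid spends its
# resolution on the rare large magnitudes, wiping out the small ones.
print("share of values in the bottom bin:", np.mean(q == q.min()))

After decibel scaling, the same values would occupy a much more uniform range, which is presumably why quantization behaves better there.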

keunwoochoi · Feb 22 '22 15:02


Thank you for the tip!

I decided to proceed with separating the tflite-compatible mel-spec block from the rest of the model. When applying full-integer quantization, I use the mel-spec block in the representative_dataset() function as a data preprocessing step. This seems to work well.
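
Roughly, the idea is the following (a sketch; melspec_model, classifier_model, and X_rep are placeholder names):

import tensorflow as tf

def representative_dataset():
    for sample in X_rep:                        # raw audio, shape (8000, 1)
        mel = melspec_model(sample[None, ...])  # mel-spec block as preprocessing
        yield [tf.cast(mel, tf.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(classifier_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_model = converter.convert()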

I noticed that the size of the tflite-compatible mel-spec block actually increases quite a bit when converting from .hdf5 to .tflite: saved as .hdf5 it is about 12 kilobytes, but converted to .tflite it is about 84 kilobytes. Is this behaviour expected?

Is it possible with kapre to calculate spectrograms iteratively, row by row, collecting FFT results from slices of audio at a time, instead of needing the whole audio signal to calculate the STFT? I think this could be an interesting feature, since on TinyML devices the buffers can't hold much data at once, especially as sampling rates get higher.
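
Conceptually (plain numpy, not kapre API), the idea would be something like:

import numpy as np

win_length, hop_length = 512, 256   # example sizes
window = np.hanning(win_length).astype(np.float32)
buf = np.zeros(win_length, dtype=np.float32)

def push_chunk(buf, chunk):
    """Slide hop_length new samples into the buffer and return one STFT row."""
    buf = np.concatenate([buf[len(chunk):], chunk])
    return buf, np.abs(np.fft.rfft(buf * window))

# Feed hop_length-sample chunks as they arrive (audio is a placeholder array
# whose length is assumed to be a multiple of hop_length):
rows = []
for chunk in np.split(audio, len(audio) // hop_length):
    buf, row = push_chunk(buf, chunk)
    rows.append(row)
spectrogram = np.stack(rows)  # built row by row, holding only win_length samples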

eppane · Mar 18 '22 13:03

@eppane Do you have a toy example of this? Are you performing dynamic range quantization on the melspec block and int8 quantization on the rest of the model?

Path-A · Jun 06 '22 14:06