MobileNetV3 Quantised to int8 weights.
Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I checked to make sure that this feature has not been requested already.
1. The entire URL of the file you are using
https://github.com/tensorflow/models/blob/8bbed0227174cb6730e91cf8e922262ed2139ed3/research/slim/nets/mobilenet/README.md
2. Describe the feature you request
The README provides MobileNet V3 models quantised with uint8 weights. uint8 is deprecated in tflite-micro in favour of int8 ops, so it would be great if an int8-quantised model were provided as well.
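For context, the tensor types of a published .tflite file can be inspected with the TFLite interpreter to confirm that it uses uint8. A minimal sketch, assuming the model from the README has been downloaded locally (the file name here is my assumption):

```python
# Minimal sketch: inspect the tensor dtypes of a downloaded .tflite model.
# The file name below is an assumption, not taken verbatim from the README.
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='v3-small_224_1.0_uint8.tflite')
interpreter.allocate_tensors()

# The input/output dtypes show whether the model was quantised to uint8 or int8.
print(interpreter.get_input_details()[0]['dtype'])
print(interpreter.get_output_details()[0]['dtype'])
```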
3. Additional context
I tried quantising the existing MobileNetV3-Small model with int8 weights myself. The conversion succeeds, but the accuracy of the resulting model makes it unusable.
The script I used is shown below:
```python
import os

import numpy as np
import tensorflow as tf
from tensorflow.keras.utils import img_to_array, load_img


def representative_data_gen():
    """Generate calibration data with better coverage."""
    num_samples = 1000  # More calibration samples
    if not os.path.exists('train_images') or len(os.listdir('train_images')) < num_samples:
        # Helper defined elsewhere in the script (not shown here).
        train_images = download_sample_images(num_samples)
    for img_file in sorted(os.listdir('train_images'))[:num_samples]:
        img_path = os.path.join('train_images', img_file)
        img = load_img(img_path, target_size=(224, 224))
        img_array = img_to_array(img)

        # Generate multiple versions of each image
        variants = []

        # Original image, scaled to [-1, 1] since the model is built with
        # include_preprocessing=False
        img_array = img_array.astype(np.float32)
        img_normalized = img_array / 127.5 - 1
        variants.append(img_normalized)

        for variant in variants:
            yield [np.expand_dims(variant, axis=0).astype(np.float32)]


# Load model
model = tf.keras.applications.MobileNetV3Small(
    input_shape=(224, 224, 3),
    include_top=True,
    weights='imagenet',
    include_preprocessing=False,
)

# Export model in SavedModel format
print("Exporting model to SavedModel format...")
model.export('mobilenet_v3_small_saved_model')

# Convert from SavedModel format
print("Converting from SavedModel format...")
converter = tf.lite.TFLiteConverter.from_saved_model('mobilenet_v3_small_saved_model')

# Basic quantization settings
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.representative_dataset = representative_data_gen

# Keep float32 input/output so that it is easier to test against the non-quantised model
converter.inference_input_type = tf.float32
converter.inference_output_type = tf.float32

print("Converting model to TFLite...")
tflite_model_quant = converter.convert()

output_path = 'mobilenet_v3_small_quantized.tflite'
with open(output_path, 'wb') as f:
    f.write(tflite_model_quant)
```
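For reference, a minimal sketch of how the quantised output can be compared against the float Keras model; the tf.lite.Interpreter calls are standard, while the variable and helper names are only illustrative and not part of the script above:

```python
# Minimal sketch (names are illustrative): run the quantised model with the
# TFLite interpreter and compare its top-1 prediction with the float Keras
# model on the same preprocessed batch.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='mobilenet_v3_small_quantized.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]


def tflite_predict(image_batch):
    """Run the quantised TFLite model on one preprocessed (1, 224, 224, 3) batch."""
    interpreter.set_tensor(input_details['index'], image_batch.astype(np.float32))
    interpreter.invoke()
    return interpreter.get_tensor(output_details['index'])


# 'image_batch' is assumed to be preprocessed exactly as in
# representative_data_gen(): scaled to [-1, 1], shape (1, 224, 224, 3).
# float_top1 = np.argmax(model.predict(image_batch), axis=-1)
# quant_top1 = np.argmax(tflite_predict(image_batch), axis=-1)
# Counting mismatches between float_top1 and quant_top1 over a validation set
# is where the accuracy drop shows up.
```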
4. Are you willing to contribute it? (Yes or No)
No. I am not well versed with the intricacies of quantisation.
I am looking for help that will make the quantisation work for MobileNetV3-Small.
I can contribute to this if it is still needed; please assign it to me if so, @laxmareddyp or @vikramdattu.