keras-cv StableDiffusion Tensorflow to TF Lite

Hi @LukeWood,

For fun, I tried converting the Stable Diffusion model from TensorFlow to TF Lite, so that I can run it on a Coral/Edge TPU.

I tried two approaches:

I- Saved model approach
II- Go through h5

I will try to document them as thoroughly as possible (sorry in advance for the long traces).

For both approaches, I installed:

!pip install git+https://github.com/divamgupta/stable-diffusion-tensorflow --upgrade 
!pip install tensorflow tensorflow_addons ftfy --upgrade

With !pip install --upgrade keras-cv, I was not able to save the model in either approach.

I- Saved model approach:

I saved the model to a directory:

from stable_diffusion_tf.stable_diffusion import StableDiffusion
model = StableDiffusion(
    img_height=512,
    img_width=512,
)
model.diffusion_model.save('/saved_model')

Let's try to load it and convert it:

import tensorflow as tf
model2 = tf.keras.models.load_model('/saved_model')
converter = tf.lite.TFLiteConverter.from_keras_model(model2)
tflite_model = converter.convert()
# Write the converted flatbuffer and close the file handle explicitly.
with open("converted_model.tflite", "wb") as f:
    f.write(tflite_model)

The following error is thrown:

WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
WARNING:absl:Found untraced functions such as dense_328_layer_call_fn, dense_328_layer_call_and_return_conditional_losses, dense_329_layer_call_fn, dense_329_layer_call_and_return_conditional_losses, group_normalization_173_layer_call_fn while saving (showing 5 of 1200). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /tmp/tmpicdc9dmk/assets
INFO:tensorflow:Assets written to: /tmp/tmpicdc9dmk/assets
---------------------------------------------------------------------------
ConverterError                            Traceback (most recent call last)
<ipython-input-15-52d23a3e5390> in <module>
      2 model2 = tf.keras.models.load_model('mydata/ivo/pythalpha/saved_model')
      3 converter = tf.lite.TFLiteConverter.from_keras_model(model2)
----> 4 tflite_model = converter.convert()
      5 open("converted_model.tflite", "wb").write(tflite_model)

/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/lite/python/lite.py in wrapper(self, *args, **kwargs)
    931   def wrapper(self, *args, **kwargs):
    932     # pylint: disable=protected-access
--> 933     return self._convert_and_export_metrics(convert_func, *args, **kwargs)
    934     # pylint: enable=protected-access
    935 

/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/lite/python/lite.py in _convert_and_export_metrics(self, convert_func, *args, **kwargs)
    909     self._save_conversion_params_metric()
    910     start_time = time.process_time()
--> 911     result = convert_func(self, *args, **kwargs)
    912     elapsed_time_ms = (time.process_time() - start_time) * 1000
    913     if result:

/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/lite/python/lite.py in convert(self)
   1340         Invalid quantization parameters.
   1341     """
-> 1342     saved_model_convert_result = self._convert_as_saved_model()
   1343     if saved_model_convert_result:
   1344       return saved_model_convert_result

/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/lite/python/lite.py in _convert_as_saved_model(self)
   1322           self._convert_keras_to_saved_model(temp_dir))
   1323       if self.saved_model_dir:
-> 1324         return super(TFLiteKerasModelConverterV2,
   1325                      self).convert(graph_def, input_tensors, output_tensors)
   1326     finally:

/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/lite/python/lite.py in convert(self, graph_def, input_tensors, output_tensors)
   1133 
   1134     # Converts model.
-> 1135     result = _convert_graphdef(
   1136         input_data=graph_def,
   1137         input_tensors=input_tensors,

/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py in wrapper(*args, **kwargs)
    210         else:
    211           report_error_message(str(converter_error))
--> 212         raise converter_error from None  # Re-throws the exception.
    213       except Exception as error:
    214         report_error_message(str(error))

/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py in wrapper(*args, **kwargs)
    203     def wrapper(*args, **kwargs):
    204       try:
--> 205         return func(*args, **kwargs)
    206       except ConverterError as converter_error:
    207         if converter_error.errors:

/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/lite/python/convert.py in convert_graphdef(input_data, input_tensors, output_tensors, **kwargs)
    791       model_flags.output_arrays.append(util.get_tensor_name(output_tensor))
    792 
--> 793   data = convert(
    794       model_flags.SerializeToString(),
    795       conversion_flags.SerializeToString(),

/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/lite/python/convert.py in convert(model_flags_str, conversion_flags_str, input_data_str, debug_info_str, enable_mlir_converter)
    308       for error_data in _metrics_wrapper.retrieve_collected_errors():
    309         converter_error.append_error(error_data)
--> 310       raise converter_error
    311 
    312   return _run_deprecated_conversion_binary(model_flags_str,

ConverterError: /opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py:1267:0: error: 'tf.Conv2D' op is neither a custom op nor a flex op
<unknown>:0: note: loc(fused["StatefulPartitionedCall:", "StatefulPartitionedCall"]): called from
/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py:1267:0: note: Error code: ERROR_NEEDS_FLEX_OPS
<unknown>:0: error: failed while converting: 'main': 
Some ops are not supported by the native TFLite runtime, you can enable TF kernels fallback using TF Select. See instructions: https://www.tensorflow.org/lite/guide/ops_select 
TF Select ops: Conv2D
Details:
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<1x1x1280x1280xf32>) -> (tensor<?x?x?x1280xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<1x1x320x320xf32>) -> (tensor<?x?x?x320xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<1x1x640x640xf32>) -> (tensor<?x?x?x640xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x1280x1280xf32>) -> (tensor<?x?x?x1280xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x1280x640xf32>) -> (tensor<?x?x?x640xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x1920x1280xf32>) -> (tensor<?x?x?x1280xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x1920x640xf32>) -> (tensor<?x?x?x640xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x2560x1280xf32>) -> (tensor<?x?x?x1280xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x320x320xf32>) -> (tensor<?x?x?x320xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x320x4xf32>) -> (tensor<?x?x?x4xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x320x640xf32>) -> (tensor<?x?x?x640xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x640x1280xf32>) -> (tensor<?x?x?x1280xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x640x320xf32>) -> (tensor<?x?x?x320xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x640x640xf32>) -> (tensor<?x?x?x640xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x960x320xf32>) -> (tensor<?x?x?x320xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}
	tf.Conv2D(tensor<?x?x?x?xf32>, tensor<3x3x960x640xf32>) -> (tensor<?x?x?x640xf32>) : {data_format = "NHWC", device = "", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "VALID", strides = [1, 1, 1, 1], use_cudnn_on_gpu = true}

II- Go through h5

Save the model in h5 format:

from stable_diffusion_tf.stable_diffusion import StableDiffusion
model = StableDiffusion(
    img_height=512,
    img_width=512,
)
model.diffusion_model.save('./stable_diffusion.h5', save_format='h5')

Let's try to load it:

import tensorflow as tf

model2 = tf.keras.models.load_model('stable_diffusion.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model2)
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)

It seems that TF 2.11.0 does not load this h5 file; load_model fails on the custom UNetModel layer:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-72d9a214a713> in <module>
      1 import tensorflow as tf
      2 
----> 3 model2 = tf.keras.models.load_model('stable_diffusion.h5')
      4 converter = tf.lite.TFLiteConverter.from_keras_model(model2)
      5 tflite_model = converter.convert()

1 frames
/usr/local/lib/python3.7/dist-packages/keras/saving/legacy/serialization.py in class_and_config_for_serialized_keras_object(config, module_objects, custom_objects, printable_module_name)
    384     if cls is None:
    385         raise ValueError(
--> 386             f"Unknown {printable_module_name}: '{class_name}'. "
    387             "Please ensure you are using a `keras.utils.custom_object_scope` "
    388             "and that this object is included in the scope. See "

ValueError: Unknown layer: 'UNetModel'. Please ensure you are using a `keras.utils.custom_object_scope` and that this object is included in the scope. See https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object for details.

Therefore, I uninstalled TF 2.11.0, installed TF 2.1.0, and tried to load the saved h5 file again:

import tensorflow as tf

model2 = tf.keras.models.load_model('stable_diffusion.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model2)
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)

The load_model throws the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-72d9a214a713> in <module>
      1 import tensorflow as tf
      2 
----> 3 model2 = tf.keras.models.load_model('stable_diffusion.h5')
      4 converter = tf.lite.TFLiteConverter.from_keras_model(model2)
      5 tflite_model = converter.convert()

1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow_core/python/keras/saving/hdf5_format.py in load_model_from_hdf5(filepath, custom_objects, compile)
    164     if model_config is None:
    165       raise ValueError('No model found in config file.')
--> 166     model_config = json.loads(model_config.decode('utf-8'))
    167     model = model_config_lib.model_from_config(model_config,
    168                                                custom_objects=custom_objects)

AttributeError: 'str' object has no attribute 'decode'
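
This AttributeError usually points to a version mismatch rather than a broken file: TF 2.1 predates h5py 3.x, which returns HDF5 string attributes as str where older releases returned bytes, so the .decode('utf-8') call in hdf5_format.py fails. Pinning h5py below 3.0 next to such an old TF is the commonly reported workaround. The sketch below is not TensorFlow code; parse_model_config is a hypothetical helper that only illustrates the bytes-versus-str guard.

```python
import json


def parse_model_config(raw):
    """Parse a Keras model_config attribute that may be bytes or str."""
    # Older h5py releases return bytes here, h5py 3.x returns str.
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8")
    return json.loads(raw)


config_json = '{"class_name": "Model", "config": {}}'
print(parse_model_config(config_json.encode("utf-8")))  # bytes input
print(parse_model_config(config_json))                  # str input
```

Both calls return the same dictionary, which is the behaviour the old loader assumed it would always get from bytes.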

Nov 21 '22 04:11 charbull

Thanks @charbull for the report! Will take a look at this. TFLite conversion would be an awesome addition.

Nov 21 '22 04:11 LukeWood

Leaving another conversion issue here, related to the 2 GB maximum size that protobuf imposes on TF Lite conversion: https://github.com/divamgupta/stable-diffusion-tensorflow/issues/58#issuecomment-1321195831

Nov 21 '22 04:11 charbull

Looking forward to having TF Lite and Stable Diffusion on the Coral :)

Nov 21 '22 04:11 charbull

@charbull

I was able to convert text_encoder, diffusion_model, and encoder to tflite. There are some issues.

You have to specify the batch size somewhere; otherwise you get the error messages you showed. Do it either at model initialization or on the concrete_function you get from your saved_model.

ConverterError: /opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py:1267:0: error: 'tf.Conv2D' op is neither a custom op nor a flex op
<unknown>:0: note: loc(fused["StatefulPartitionedCall:", "StatefulPartitionedCall"]): called from
/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py:1267:0: note: Error code: ERROR_NEEDS_FLEX_OPS
/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py:1267:0: error: 'tf.Conv2D' op is neither a custom op nor a flex op
<unknown>:0: note: loc(fused["StatefulPartitionedCall:", "StatefulPartitionedCall"]): called from
/opt/conda/envs/rapids/lib/python3.8/site-packages/tensorflow/python/saved_model/save.py:1267:0: note: Error code: ERROR_NEEDS_FLEX_OPS

TensorFlow overrides len(), so you may have to change _create_broadcast_shape in stable_diffusion/__internal__/layers/group_normalization.py from

def _create_broadcast_shape(self, input_shape):
    broadcast_shape = [1] * len(input_shape)
    broadcast_shape[self.axis] = input_shape[self.axis] // self.groups
    broadcast_shape.insert(self.axis, self.groups)
    return broadcast_shape

to

def _create_broadcast_shape(self, input_shape):
    broadcast_shape = [1] * input_shape.shape.rank
    broadcast_shape[self.axis] = input_shape[self.axis] // self.groups
    broadcast_shape.insert(self.axis, self.groups)
    return broadcast_shape
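
To see what this method computes, here is a standalone sketch with the rank passed in as a plain integer. The function name create_broadcast_shape and the example values (a rank-4 NHWC input with 320 channels split into 32 groups) are assumptions for illustration, not taken from the library.

```python
def create_broadcast_shape(rank, channels, axis, groups):
    """Return the shape used to broadcast per-group scale and offset."""
    broadcast_shape = [1] * rank                  # start with all ones
    broadcast_shape[axis] = channels // groups    # channels per group
    broadcast_shape.insert(axis, groups)          # add the group dimension
    return broadcast_shape


print(create_broadcast_shape(rank=4, channels=320, axis=3, groups=32))
# [1, 1, 1, 32, 10]
```

Only the first line depends on how the rank is obtained, which is the part the change above touches.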

If you go through the concrete_function path, you might run into the 2 GiB file size limitation in protobuf. If you go with from_keras_model or from_saved_model, you may run into the 2 GiB limitation of FlatBuffer when converting the diffusion model. The only compatible tflite solution I could find is to convert to fp16 (so the file size is < 2 GiB).
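
A rough back-of-the-envelope check shows why fp16 helps. It assumes about 860 million parameters for the diffusion model (an approximate, commonly quoted figure that does not come from this thread) and ignores graph metadata, so treat the numbers as estimates only.

```python
GIB = 2**30
LIMIT_GIB = 2                 # the 2 GiB limit discussed above
N_PARAMS = 860_000_000        # assumed, approximate parameter count


def weight_size_gib(n_params, bytes_per_param):
    """Size of the raw weights in GiB for a given storage width."""
    return n_params * bytes_per_param / GIB


for name, width in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    size = weight_size_gib(N_PARAMS, width)
    verdict = "fits" if size < LIMIT_GIB else "too large"
    print(f"{name}: {size:.2f} GiB -> {verdict}")
```

Under that assumption the fp32 weights alone come to roughly 3.2 GiB, while fp16 lands near 1.6 GiB, which is in line with the file size reported later in this thread.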

Nov 21 '22 09:11 freedomtan

Thank you @freedomtan I am trying those now, created this PR: https://github.com/keras-team/keras-cv/pull/1035 based on your suggestion for the length.

Nov 21 '22 12:11 charbull

Hi @freedomtan how do you set the batch size ? the arguments are:

  img_height: Height of the images to generate, in pixel. Note that only
            multiples of 128 are supported; the value provided will be rounded
            to the nearest valid value. Default: 512.
        img_width: Width of the images to generate, in pixel. Note that only
            multiples of 128 are supported; the value provided will be rounded
            to the nearest valid value. Default: 512.
        jit_compile: Whether to compile the underlying models to XLA.
            This can lead to a significant speedup on some systems. Default: False.

Nov 21 '22 15:11 charbull

@charbull batch_size isn't configured at a StableDiffusion-wide level in our implementation, but rather passed as a parameter to specific SD flows (e.g. the text_to_image method).

Nov 21 '22 20:11 ianstenbit

@ianstenbit oh I see ! thank you for the clarification. I am not there yet. trying to convert to TFlite first :)

Nov 21 '22 20:11 charbull

I wonder why the 2GB limitation will be applied here, since weights is not part of the proto?

Nov 21 '22 20:11 tanzhenyu

Hi @freedomtan how do you set the batch size ? the arguments are:

  img_height: Height of the images to generate, in pixel. Note that only
            multiples of 128 are supported; the value provided will be rounded
            to the nearest valid value. Default: 512.
        img_width: Width of the images to generate, in pixel. Note that only
            multiples of 128 are supported; the value provided will be rounded
            to the nearest valid value. Default: 512.
        jit_compile: Whether to compile the underlying models to XLA.
            This can lead to a significant speedup on some systems. Default: False.

Yup, you have to modify the source code; either adding an argument or directly modifying the Input layers works :-)

Nov 22 '22 01:11 freedomtan

I wonder why the 2GB limitation will be applied here, since weights is not part of the proto?

Yup, that's not trivial. In TensorFlow 2.10 and before (I didn't check if it is changed in 2.11 or later), it seems when you try to convert a model to tflite from concrete function, it is converted to frozen graphdef first. Thus, there is protobuf 2 GiB limitation.

Nov 22 '22 01:11 freedomtan

I think in master is https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/lite/flatbuffer_export.cc#L2152-L2156

Nov 22 '22 01:11 bhack

I wonder why the 2GB limitation will be applied here, since weights is not part of the proto?

Yup, that's not trivial. In TensorFlow 2.10 and before (I didn't check if it is changed in 2.11 or later), it seems when you try to convert a model to tflite from concrete function, it is converted to frozen graphdef first. Thus, there is protobuf 2 GiB limitation.

I see, that makes sense. Though in the future, I'd really hope this part of the logic gets done at TF Lite runtime instead of at model saving.

Nov 22 '22 01:11 tanzhenyu

I think in master is https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/lite/flatbuffer_export.cc#L2152-L2156

you mean the 2GB limit is still true across 2.10 and master, correct?

Nov 22 '22 01:11 tanzhenyu

you mean the 2GB limit is still true across 2.10 and master, correct?

Yes that link is pointing to master. Then if we see the conversion workflow... https://www.tensorflow.org/lite/models/convert?hl=en#model_conversion

Nov 22 '22 01:11 bhack

I think in master is https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/lite/flatbuffer_export.cc#L2152-L2156

you mean the 2GB limit is still true across 2.10 and master, correct?

Nope, there are two 2 GiB limitations, one is protobuf, the other is flatbuffer, which is file format of TFLite. As far as I know, there is no way to get around the flatbuffer 2 GiB limit without breaking compatibility.

Nov 22 '22 01:11 freedomtan

you mean the 2GB limit is still true across 2.10 and master, correct?

Yes that link is pointing to master. Then if we see the conversion workflow... https://www.tensorflow.org/lite/models/convert?hl=en#model_conversion

Right, I think this is suboptimal, freezing can be done at runtime, I will try to figure out how to push this forward with TFLite separately.

Nov 22 '22 01:11 tanzhenyu

But I think that probably we could still solve using something like https://github.com/tensorflow/hub/blob/master/examples/text_embeddings/export.py#L204-L219. What do you think?

Nov 22 '22 01:11 bhack

As I saw another case pointing to that workaround (https://github.com/tensorflow/tensorflow/issues/47326#issuecomment-788298828)

Nov 22 '22 01:11 bhack

But I think that probably we could still solve using something like https://github.com/tensorflow/hub/blob/master/examples/text_embeddings/export.py#L204-L219. What do you think?

Hmm... that's basically rewriting the graph to use placeholders instead. Is this even doable in TF2?

Nov 22 '22 01:11 tanzhenyu

@abattery What do you think?

Nov 22 '22 02:11 bhack

I was able to make some progress based on advice from @costiash here: https://github.com/divamgupta/stable-diffusion-tensorflow/issues/58#issuecomment-1321547438

import tensorflow as tf

model2 = tf.keras.models.load_model('/content/saved_model')
converter = tf.lite.TFLiteConverter.from_keras_model(model2)
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.TFLITE_BUILTINS,  # enable TensorFlow Lite ops.
    tf.lite.OpsSet.SELECT_TF_OPS,  # enable TensorFlow ops.
]
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()
open("converted_model.tflite", "wb").write(tflite_model)

It does generate a float16 TFLite file of about 1.6 GB.

However, since I am running tflite on the Edge TPU, it seems I need full integer quantization with int8: https://www.tensorflow.org/lite/performance/post_training_quantization#integer_only

The https://www.tensorflow.org/api_docs/python/tf/lite/RepresentativeDataset is needed for full quantization, according to the reference:

Usually, this is a small subset of a few hundred samples randomly chosen, in no particular order, from the training or evaluation dataset.

Are there samples I can use from this project, or do I need to generate some images from TF Stable Diffusion and use them?

Nov 22 '22 04:11 charbull

@charbull, my 2 cents: DO NOT use tf.lite.OpsSet.SELECT_TF_OPS, which is to allow TF ops. Since your goal is to run on EdgeTPU, TF ops are unlikely to work on it.

With a recent TF master + keras_cv (0.3.4 + group norm patch), converting from the Keras model works like a charm.

import keras_cv
from tensorflow import keras
import tensorflow as tf

model = keras_cv.models.StableDiffusion(img_width=512, img_height=512)

converter = tf.lite.TFLiteConverter.from_keras_model(model.diffusion_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()

with open('/tmp/diffusion_model_fp16.tflite', 'wb') as f:
  f.write(tflite_model)

The TF master I tested was built from https://github.com/tensorflow/tensorflow/commit/680a9b2a9ae91e3386c1ba6be0de077d7e4b1773

Nov 22 '22 05:11 freedomtan

@freedomtan thank you ! we are getting closer.

It turns out I need to run the following for the edge TPU quantization : https://www.tensorflow.org/lite/performance/post_training_quantization#integer_only

import tensorflow as tf

model = tf.keras.models.load_model('./saved_model')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
converter.representative_dataset = representative_data_gen  # pass the generator function itself, not its result
tflite_model_int8 = converter.convert()
open("converted_model-int8.tflite", "wb").write(tflite_model_int8)

I am trying to figure out how to build representative_data_gen. Does it need to come from the training data? According to the documentation:

Usually, this is a small subset of a few hundred samples randomly chosen, in no particular order, 
from the training or evaluation dataset.

Are there samples I can use from this project, or do I need to generate some images from TF Stable Diffusion? I am not quite sure.

Cheers,

Nov 22 '22 12:11 charbull

I don't remember what checkpoint we have used: https://huggingface.co/CompVis/stable-diffusion

Probably for the calibration you can pass prompt samples from "laion-improved-aesthetics" or "laion-aesthetics v2 5+":

https://laion.ai/blog/laion-aesthetics/

Nov 22 '22 13:11 bhack

@bhack thank you ! will give it a try :)

Nov 22 '22 13:11 charbull

Hi,

I tried the following so far with @ianstenbit:

I. I generated images from prompts and put them in a csv file so that I can prepare the representative_dataset

import pandas as pd
import tensorflow as tf


def load_img(path_to_img):
    # Returns the raw (still encoded) file contents as a string tensor.
    return tf.io.read_file(path_to_img)


def prepare_images():
    df = pd.read_csv('./represent_data/representative.csv')
    images = []
    for index, row in df.iterrows():
        print(row['output'])
        img = load_img(row['output'])
        images.append(img)
    return images


def prepare_prompts():
    df = pd.read_csv('./represent_data/representative.csv')
    prompts = []
    for index, row in df.iterrows():
        print(row['input'])
        prompts.append(str(row['input']))
    return prompts


def representative_data_gen():
    prompts = prepare_prompts()
    for prompt in prompts:
        yield [prompt, model.text_to_image(prompt)]

Then the conversion:

import tensorflow as tf

model = tf.keras.models.load_model('./saved_model')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
# converter.experimental_new_quantizer = True
converter.representative_dataset = representative_data_gen()
tflite_model_int8 = converter.convert()
open("converted_model-int8.tflite", "wb").write(tflite_model_int8)

Getting the following error, not sure what is the issue:

WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
WARNING:absl:Found untraced functions such as conv2d_84_layer_call_fn, conv2d_84_layer_call_and_return_conditional_losses, _jit_compiled_convolution_op, restored_function_body, restored_function_body while saving (showing 5 of 1200). These functions will not be directly callable after loading.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-dc50ce399a16> in <module>
      8 # converter.experimental_new_quantizer = True
      9 converter.representative_dataset = representative_data_gen()
---> 10 tflite_model_int8 = converter.convert()
     11 open("converted_model-int8.tflite", "wb").write(tflite_model_int8)

8 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/lite/python/lite.py in _validate_inference_input_output_types(self, quant_mode)
    963     elif (self.inference_input_type not in default_types or
    964           self.inference_output_type not in default_types):
--> 965       raise ValueError("The inference_input_type and inference_output_type "
    966                        "must be tf.float32.")
    967 

ValueError: The inference_input_type and inference_output_type must be tf.float32.

II. I tried to limit the conversion to the text encoder alone:

converter = tf.lite.TFLiteConverter.from_keras_model(model.text_encoder)

which also produced the same error:

ValueError: The inference_input_type and inference_output_type must be tf.float32.

Any ideas what is going wrong?

Thank you

Nov 23 '22 00:11 charbull

As discussed, I think what you stored in ./saved_model is just the latent diffusion model, not the whole model.

In order to convert the full StableDiffusion model, we'll need to either:

- Implement call on StableDiffusion (I think this is probably not the right approach because it confines us to one use case), or
- Convert the individual component models to TFLite and compose them into the StableDiffusion object post-conversion (perhaps with some custom adaptation for the TFLite versions of the component models -- I'm not sure)
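
The second option could start from a loop like the one below. This is only a sketch: convert_components is a hypothetical helper, it takes whatever Keras sub-models the caller passes in, and the fp16 settings are copied from the snippet earlier in this thread. It does not address how the converted pieces are wired back together.

```python
import os


def convert_components(components, out_dir, use_fp16=True):
    """Convert each Keras sub-model to its own .tflite file.

    components: dict mapping a name to a Keras model.
    Returns a dict mapping each name to the path that was written.
    """
    import tensorflow as tf  # imported lazily; only needed at call time

    os.makedirs(out_dir, exist_ok=True)
    paths = {}
    for name, keras_model in components.items():
        converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
        if use_fp16:
            # Same fp16 settings as the snippet earlier in the thread.
            converter.optimizations = [tf.lite.Optimize.DEFAULT]
            converter.target_spec.supported_types = [tf.float16]
        path = os.path.join(out_dir, f"{name}.tflite")
        with open(path, "wb") as f:
            f.write(converter.convert())
        paths[name] = path
    return paths


# Usage, assuming `model` is a StableDiffusion instance as above:
# convert_components(
#     {"text_encoder": model.text_encoder,
#      "diffusion_model": model.diffusion_model},
#     "/tmp/sd_tflite",
# )
```

Keeping one file per component also keeps each file's size separate, which matters given the 2 GiB limits discussed above.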

Nov 23 '22 00:11 ianstenbit

@charbull What you saved in saved_model is an fp32 model. If you check the input tensors of the diffusion_model with saved_model_cli, you can see that the expected data type of the 3 input tensors is fp32 (DT_FLOAT). And as @ianstenbit noted, what you saved is the diffusion/denoise model instead of the whole pipeline. The input data for the diffusion model are supposed to come from the text_encoder and the random number generator.

$ saved_model_cli show --all --dir /tmp/sd/diffusion_model
...
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['input_1'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 77, 768)
        name: serving_default_input_1:0
    inputs['input_2'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 320)
        name: serving_default_input_2:0
    inputs['input_3'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 64, 64, 4)
        name: serving_default_input_3:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['padded_conv2d_83'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 64, 64, 4)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict
...
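
Separately from the dtype of the saved model, the ValueError above is what the converter tends to raise when int8 input/output types are requested without post-training quantization being enabled, and the failing snippet never sets converter.optimizations. The full-integer recipe in the TensorFlow Lite post-training quantization guide sets it, and passes the generator function itself rather than the result of calling it. A sketch of those settings, where configure_full_int8 is a hypothetical helper name:

```python
def configure_full_int8(converter, representative_data_gen):
    """Apply full-integer post-training quantization settings to a converter."""
    import tensorflow as tf  # imported lazily; only needed at call time

    # Without an optimization flag the converter stays in float mode and
    # rejects int8 input/output types.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # Pass the generator function itself; the converter calls it later.
    converter.representative_dataset = representative_data_gen
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8   # or tf.uint8
    converter.inference_output_type = tf.int8  # or tf.uint8
    return converter
```

For the diffusion model, each sample yielded by the generator would need three float32 arrays shaped like the inputs listed above, for example (1, 77, 768), (1, 320) and (1, 64, 64, 4), rather than a prompt string and a generated image.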

Nov 23 '22 00:11 freedomtan

@freedomtan @ianstenbit I see.

What would be the best approach to get to tflite int8 in this case? I haven't tried decomposing and reassembling pieces of a model before.

Would the Edge TPU "know" how to handle this? There is also the TFLite interpreter.

I'm not sure what the best way forward is :)

If you can highlight the steps and the library/tools, I am happy to give it a shot.

Cheers

Nov 23 '22 01:11 charbull