
Very inaccurate results with keras-ocr tflite model

Open yashcfg opened this issue 1 year ago • 1 comment

I have a requirement to detect text in my Android app.

I followed the "Fine-tuning the recognizer" guide and trained my recognizer on the BornDigital dataset.
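
For context, the training setup was essentially the one from the guide (a rough sketch, not my exact code; borndigital_image_generator stands in for the (image, text) pair generator built from the dataset, and the hyperparameters are illustrative):

import keras_ocr

recognizer = keras_ocr.recognition.Recognizer()
recognizer.compile()
# Batch up (image, text) pairs from the BornDigital generator.
train_gen = recognizer.get_batch_generator(
    image_generator=borndigital_image_generator, batch_size=8
)
recognizer.training_model.fit(train_gen, steps_per_epoch=100, epochs=10)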

Then I created the TFLite version of the prediction_model from the previous step:

import tensorflow as tf

tflite_name = 'test1.tflite'

# SELECT_TF_OPS lets the converter fall back to full TensorFlow ops for
# anything in prediction_model (such as the CTC decoding step) that has
# no builtin TFLite equivalent.
converter = tf.lite.TFLiteConverter.from_keras_model(recognizer.prediction_model)
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.SELECT_TF_OPS]
converter._experimental_lower_tensor_list_ops = False
converter.target_spec.supported_types = [tf.float32]
tflite_model = converter.convert()

with open(tflite_name, "wb") as f:
    f.write(tflite_model)
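
A quick sanity check on the converted file is to print the interpreter's tensor details (for the default keras-ocr recognizer the input should be a 31x200 grayscale batch):

interpreter = tf.lite.Interpreter(model_path=tflite_name)
interpreter.allocate_tensors()
# Expected for the default recognizer: input shape (1, 31, 200, 1),
# i.e. batch x height x width x channels.
print(interpreter.get_input_details()[0]['shape'])
print(interpreter.get_output_details()[0]['shape'])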

import cv2
import numpy as np

def run_tflite_model(image_path, quantization):  # quantization is currently unused
    # Grayscale, resize to width 200 x height 31, add batch and channel
    # axes, and scale to [0, 1] -- the shape the model expects.
    input_data = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    input_data = cv2.resize(input_data, (200, 31))  # cv2 takes (width, height)
    input_data = input_data[np.newaxis]             # (1, 31, 200)
    input_data = np.expand_dims(input_data, 3)      # (1, 31, 200, 1)
    input_data = input_data.astype('float32') / 255

    interpreter = tf.lite.Interpreter(model_path=tflite_name)
    interpreter.allocate_tensors()

    # Get input and output tensor details.
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()

    output = interpreter.get_tensor(output_details[0]['index'])
    return output
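
To isolate whether the gap comes from the conversion itself or from the preprocessing, the identical tensor can be fed to both models (a sketch; it assumes the recognizer from the fine-tuning step is still in scope):

# If the decoded indices agree here, the conversion is fine and the gap
# is in preprocessing; if they disagree, the conversion itself is suspect.
img = cv2.imread('borndigital/test/word_1.png', cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (200, 31)).astype('float32') / 255
img = img[np.newaxis, ..., np.newaxis]  # same (1, 31, 200, 1) tensor as above

keras_out = recognizer.prediction_model.predict(img)[0]   # decoded indices, -1 padded
lite_out = run_tflite_model('borndigital/test/word_1.png', 'dr')[0]
print(keras_out)
print(lite_out)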


alphabets = list("0123456789abcdefghijklmnopqrstuvwxyz")
blank_index = 36  # CTC blank: one index past the end of the alphabet
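
For reference, decoding just maps indices to characters and drops the CTC blank (36) and the -1 padding; a tiny illustration with a made-up output row:

sample = [29, 14, 28, 29, 36, -1, -1]  # hypothetical decoded row
print("".join(alphabets[i] for i in sample if i not in [blank_index, -1]))  # -> test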

for i in range(1, 20):
    image_path = 'borndigital/test/word_' + str(i) + '.png'
    tflite_output = run_tflite_model(image_path, 'dr')
    final_output = "".join(alphabets[index] for index in tflite_output[0] if index not in [blank_index, -1])
    print("lite model - " + final_output)
    predicted = recognizer.recognize(image_path)
    print("non lite model - " + predicted)

I expected a small difference in the results, but nothing this large (from the logs):

lite model - loeaxolea
non lite model - bada

lite model - deveioper
non lite model - developer

lite model - ldedoscaly
non lite model - day

lite model - hhonors
non lite model - hhonors

lite model - nomusnon
non lite model - mluron

lite model - waniewoe
non lite model - wonlwoe

lite model - ihiarsnls
non lite model - thank

lite model - vicodiilino
non lite model - you

lite model - lirayuvrel
non lite model - travel

lite model - insurance
non lite model - insurance

lite model - ineclilil
non lite model - will

lite model - rirgtecte
non lite model - protect

lite model - myefscrdsilirin
non lite model - you

lite model - emdncnl
non lite model - and

lite model - yicriens
non lite model - your

lite model - inelfllegads
non lite model - trip

lite model - lfifosirm
non lite model - from

lite model - lmcexkse
non lite model - unex

lite model - ercecieegc
non lite model - pectd

As you can see, the TFLite model's prediction is wrong in almost every case, whereas the original recognizer's predictions are mostly correct.

Can you suggest what changes I can make to improve the accuracy?

If you need more info, I can provide the complete code and logs. All training and prediction ran on macOS (M1) with the following versions:

tensorflow-datasets       4.8.3
tensorflow-deps           2.10.0
tensorflow-estimator      2.9.0
tensorflow-macos          2.9.0
tensorflow-metadata       1.12.0 
tensorflow-metal          0.5.0 

Thanks

yashcfg · Mar 29 '23 07:03