FastDeploy OCR model accuracy less than PPOCR model

Environment

FastDeploy version: 1.0.0
OS Platform: Linux x64
Hardware: Nvidia T4
Program Language: e.g. Python 3.8

Problem description

Results from the FastDeploy PPOCRv3 models are less accurate than inference with the PPOCR library runtime. I have not been able to identify the reason, or whether this is the expected result.

Dec 01 '22 18:12 akansal1

Hello,

1. When you say "PPOCR library runtime", do you mean PaddleOCR? (https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.6/deploy/cpp_infer)

2. About "less accurate": I'm not sure how much accuracy you've lost. Could you show me the results from both FastDeploy and the PPOCR library runtime? Or could you share the image you ran prediction on?

3. When you run inference with PaddleOCR, the batch size of the recognition model is set to 6, but in FastDeploy the batch size is not fixed, which will cause some difference in accuracy.
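
To illustrate how the recognition batch size can affect the recognizer's input: as far as I understand PaddleOCR's Python recognizer, text crops are sorted by aspect ratio, split into batches, and every crop in a batch is padded to the width implied by the widest crop in that batch. The sketch below is a simplified model of that step only. The function name `padded_widths`, the 48-pixel height, and the 320-pixel base width are assumptions for illustration, not the API of either library.

```python
def padded_widths(crop_sizes, batch_size, img_h=48, base_w=320):
    """Return the padded input width each crop gets under batch-wise padding.

    crop_sizes: list of (width, height) tuples for the detected text crops.
    batch_size: crops per recognition batch; None puts all crops in one batch.
    img_h, base_w: assumed recognizer input height and minimum width.
    """
    if not crop_sizes:
        return []
    # Process crops in order of increasing aspect ratio (width / height).
    order = sorted(range(len(crop_sizes)),
                   key=lambda i: crop_sizes[i][0] / crop_sizes[i][1])
    if batch_size is None:
        batch_size = len(order)
    widths = [0] * len(crop_sizes)
    for start in range(0, len(order), batch_size):
        batch = order[start:start + batch_size]
        # Width of each crop after scaling it to height img_h (ceil division).
        scaled = [(img_h * crop_sizes[i][0] + crop_sizes[i][1] - 1) // crop_sizes[i][1]
                  for i in batch]
        # Every crop in the batch is padded to the widest one (at least base_w).
        batch_w = max(base_w, max(scaled))
        for i in batch:
            widths[i] = batch_w
    return widths


crops = [(100, 48), (480, 48), (960, 48)]
print(padded_widths(crops, batch_size=1))
print(padded_widths(crops, batch_size=2))
print(padded_widths(crops, batch_size=None))
```

Under these assumptions the same crop is padded to a different width depending on which other crops share its batch, so two pipelines with different batch sizes may feed the recognizer different inputs for the same image.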

Dec 02 '22 00:12 yunyaoXYY

@yunyaoXYY you can use the following to reproduce:

Inference via PaddleOCR

from paddleocr import PaddleOCR,draw_ocr
import cv2

img_path = "<jpeg_path>"

ocr = PaddleOCR(use_angle_cls=True, lang='en')
ocr(cv2.imread(img_path))

[2022/12/02 01:56:54] ppocr DEBUG: dt_boxes num : 28, elapse : 0.3203616142272949
[2022/12/02 01:56:54] ppocr DEBUG: cls num  : 28, elapse : 0.18441128730773926
[2022/12/02 01:56:58] ppocr DEBUG: rec_res num  : 28, elapse : 3.2916038036346436

Result of PaddleOCR

{
  "R4.. 12:30": 0.7997417449951172,
  "( %.00": 0.7055521011352539,
  "79": 0.8551740050315857,
  "Melbourne - Fleet & Other Vehicles": 0.9827376008033752,
  "All": 0.8694536089897156,
  "Watches / Notes": 0.9373573064804077,
  "206 CP 09/20, Built 09/20, Toyota, Camry": 0.9854691624641418,
  "ASV70R Ascent, Sedan, 5 Seats, 4 Doors": 0.9811344146728516,
  "35,002Kms": 0.9965813159942627,
  "REFER $27,750": 0.9993845820426941,
  "207 CP 06/19. Built 04/19.Subaru.Forester. S5": 0.9764477014541626,
  "MY19 2.5i-S CVT AWD, Wagon, 5 Seats, 5 Doors": 0.9877890944480896,
  "87,290Kms": 0.9946824312210083,
  "REFER $31,000": 0.9942302107810974,
  "208 CP 10/18,Built 05/18, Kia, Sportage, QL": 0.9710363149642944,
  "MY18 Si AWD Premium,Wagon, 5 Seats, 5": 0.9965835809707642,
  "Doors": 0.9997485876083374,
  "53,910Kms": 0.9935662746429443,
  "REFER": 0.999330997467041,
  "$24,250": 0.9978954195976257,
  "209 CP 07/17,Built 05/17,Holden,Trax,TJ": 0.9638256430625916,
  "MY17 LT,Wagon, 5 Seats, 5 Doors": 0.9763248562812805,
  "64,631Kms": 0.9987785816192627,
  "REFER $16,500": 0.9978040456771851
}

Inference via FastDeploy

import fastdeploy as fd
from fastdeploy import c_lib_wrap as C
import cv2

option = fd.RuntimeOption()


rec_option=option.set_trt_input_shape("x", [1, 3, 48, 10], [10, 3, 48, 320],
                               [64, 3, 48, 2304])

rec_model = fd.vision.ocr.Recognizer(
    '/fastdeploy_models/rec_runtime/1/model.pdmodel',
    '/fastdeploy_models/rec_runtime/1/model.pdiparams',
    '/fastdeploy_models/rec_postprocess/1/en_dict.txt',
    runtime_option=rec_option)

option.is_dynamic=True
option.set_trt_input_shape("x", [1, 3, 64, 64], [1, 3, 640, 640],
                               [1, 3, 1544, 1544])
det_model = fd.vision.ocr.DBDetector(
    '/fastdeploy_models/det_runtime/1/model.pdmodel', 
    '/fastdeploy_models/det_runtime/1/model.pdiparams', runtime_option=option)

img_path = "<jpeg_path>"
im=cv2.imread(img_path)

sys=C.vision.ocr.PPOCRv3(det_model._model,rec_model._model)
sys.predict(im[...,])

Results of FastDeploy

{
  "R 12:30": 0.5980047583580017,
  "G": 0.15071693062782288,
  "": 0.0,
  "Melbourne - Fleet & Other Vehicles": 0.9961614012718201,
  "A": 0.7049339413642883,
  "Watches I Notes": 0.9241747260093689,
  "206 CP 09/20, Built 09/20, Toyota, Camry": 0.9750648140907288,
  "ASV70R Ascent, Sedan, 5 Seats, 4 Doors": 0.966465175151825,
  "3502Km": 0.5585656762123108,
  "REER27 50": 0.6220335364341736,
  "207 CP 06/19.Built 04/19.Subaru.Forester. S5": 0.9707357883453369,
  "MY19 2.5i-S CVT AWD, Wagon, 5 Seats, 5 Doors": 0.9739432334899902,
  "872 Ks": 0.34590622782707214,
  "REER3100": 0.6966407895088196,
  "208 CP 10/18, Built 05/18, Kia, Sportage, QL": 0.9705865979194641,
  "MY18 Si AWD Premium,Wagon, 5 Seats, 5": 0.9961617588996887,
  "DOO": 0.5707316994667053,
  "53 91K": 0.6855024099349976,
  "REFER": 0.6617550253868103,
  "$242 50": 0.549923837184906,
  "209 CP 07/17,Built 05/17,Holden,Trax,TJ": 0.981033205986023,
  "MY17 LT, Wagon, 5 Seats, 5 Doors": 0.9695397615432739,
  "64631K": 0.5456709861755371,
  "REFO16500": 0.5376154780387878,
  "1": 0.18550123274326324,
  "L": 0.14091278612613678
}

You can see that there is a significant drop in accuracy and in the detections between the two methods. The following image was used on our end to produce these results.

sample_image
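
The two result dictionaries above can also be compared programmatically. The following is a minimal sketch that uses three entries from each result; `compare_results` is a hypothetical helper written for this comparison, not part of PaddleOCR or FastDeploy.

```python
def mean_confidence(result):
    """Average confidence of a {text: confidence} dict (0.0 if empty)."""
    return sum(result.values()) / len(result) if result else 0.0


def compare_results(reference, candidate):
    """Summarize how a candidate OCR result differs from a reference result."""
    return {
        # Reference strings reproduced character for character by the candidate.
        "exact_matches": [t for t in reference if t in candidate],
        # Reference strings the candidate did not reproduce exactly.
        "missing": [t for t in reference if t not in candidate],
        "mean_conf_reference": mean_confidence(reference),
        "mean_conf_candidate": mean_confidence(candidate),
    }


# Three entries copied from each of the results posted above.
paddle = {
    "Melbourne - Fleet & Other Vehicles": 0.9827376008033752,
    "35,002Kms": 0.9965813159942627,
    "REFER $27,750": 0.9993845820426941,
}
fastdeploy = {
    "Melbourne - Fleet & Other Vehicles": 0.9961614012718201,
    "3502Km": 0.5585656762123108,
    "REER27 50": 0.6220335364341736,
}

summary = compare_results(paddle, fastdeploy)
print(summary["exact_matches"])
print(summary["missing"])
```

Exact string matching is deliberately strict here; an edit-distance measure would give partial credit to near misses such as "3502Km".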

Dec 02 '22 02:12 akansal1

Got it, I will help look into this problem.

Dec 02 '22 02:12 yunyaoXYY

@yunyaoXYY I tried the same on your sample image and got the following results:

{
  "FastDeploy": {
    "LRE": 0.3414466977119446,
    "tJYj 155": 0.5437671542167664,
    "1L4": 0.4430263042449951,
    "T252935": 0.9433053135871887
  },
  "PaddleOCR": {
    "15": 0.8533188104629517,
    "252935": 0.9908719658851624
  }
}

Sample Image

Dec 02 '22 02:12 akansal1

I think that if you want to use PaddleOCR on a picture like this, which contains Chinese, you should change the label file to: https://gitee.com/paddlepaddle/PaddleOCR/raw/release/2.6/ppocr/utils/ppocr_keys_v1.txt
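
One quick way to see whether a label file can cover the text in an image is to check each expected character against it. The sketch below assumes the label file is a UTF-8 text file with one character per line, which is how I understand `ppocr_keys_v1.txt` and `en_dict.txt` to be laid out. `load_charset` and `missing_chars` are hypothetical helpers, and the demo dictionary and sample string are made up for illustration.

```python
import tempfile
from pathlib import Path


def load_charset(dict_path):
    """Read a label file (assumed: one character per line) into a set."""
    lines = Path(dict_path).read_text(encoding="utf-8").splitlines()
    return {line for line in lines if line}


def missing_chars(text, charset):
    """Return the non-whitespace characters of text absent from charset."""
    return sorted({ch for ch in text if not ch.isspace() and ch not in charset})


# Demo with a small digits-and-uppercase dictionary written to a temp file.
with tempfile.TemporaryDirectory() as tmp:
    demo_dict = Path(tmp) / "demo_dict.txt"
    demo_dict.write_text("\n".join("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"),
                         encoding="utf-8")
    charset = load_charset(demo_dict)

demo_missing = missing_chars("OCR 识别 123", charset)
print(demo_missing)
```

A recognizer can only output characters that appear in its label file, so a non-empty result from `missing_chars` suggests that the label file and the matching model need to be swapped for that image.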

Dec 02 '22 03:12 yunyaoXYY

Hi, I found one reason for the difference in accuracy: the batch size of FastDeploy PP-OCR is not equal to that of PaddleOCR. We will fix this and let users set the batch size of PP-OCR in FastDeploy, which will make the accuracy match that of PaddleOCR.

Dec 04 '22 11:12 yunyaoXYY

Hello, was this problem resolved in the end?

Nov 26 '23 13:11 huangjun11
