FastDeploy
FastDeploy OCR model is less accurate than the PPOCR model
Environment
FastDeploy version: 1.0.0
OS Platform: Linux x64
Hardware: Nvidia T4
Program Language: e.g. Python 3.8
Problem description
Results from the FastDeploy PPOCRv3 models are less accurate than inference with the PPOCR library runtime. I cannot identify the reason, or tell whether this is the expected result.
Hello,
1. When you say "PPOCR library runtime", do you mean PaddleOCR? (https://github.com/PaddlePaddle/PaddleOCR/tree/release/2.6/deploy/cpp_infer)
2. About "less accurate": I'm not sure how much accuracy you've lost. Could you show me the results from both FastDeploy and the PPOCR library runtime, or the image you ran prediction on?
3. When you run inference with PaddleOCR, the batch size of the recognition model is set to 6, but in FastDeploy the batch size is not fixed, which will cause some difference in accuracy.
@yunyaoXYY you can use the following to reproduce:
Inference via PaddleOCR
from paddleocr import PaddleOCR,draw_ocr
import cv2
img_path = "<jpeg_path>"
ocr = PaddleOCR(use_angle_cls=True, lang='en')
ocr(cv2.imread(img_path))
[2022/12/02 01:56:54] ppocr DEBUG: dt_boxes num : 28, elapse : 0.3203616142272949
[2022/12/02 01:56:54] ppocr DEBUG: cls num : 28, elapse : 0.18441128730773926
[2022/12/02 01:56:58] ppocr DEBUG: rec_res num : 28, elapse : 3.2916038036346436
Result of PaddleOCR
{
"R4.. 12:30": 0.7997417449951172,
"( %.00": 0.7055521011352539,
"79": 0.8551740050315857,
"Melbourne - Fleet & Other Vehicles": 0.9827376008033752,
"All": 0.8694536089897156,
"Watches / Notes": 0.9373573064804077,
"206 CP 09/20, Built 09/20, Toyota, Camry": 0.9854691624641418,
"ASV70R Ascent, Sedan, 5 Seats, 4 Doors": 0.9811344146728516,
"35,002Kms": 0.9965813159942627,
"REFER $27,750": 0.9993845820426941,
"207 CP 06/19. Built 04/19.Subaru.Forester. S5": 0.9764477014541626,
"MY19 2.5i-S CVT AWD, Wagon, 5 Seats, 5 Doors": 0.9877890944480896,
"87,290Kms": 0.9946824312210083,
"REFER $31,000": 0.9942302107810974,
"208 CP 10/18,Built 05/18, Kia, Sportage, QL": 0.9710363149642944,
"MY18 Si AWD Premium,Wagon, 5 Seats, 5": 0.9965835809707642,
"Doors": 0.9997485876083374,
"53,910Kms": 0.9935662746429443,
"REFER": 0.999330997467041,
"$24,250": 0.9978954195976257,
"209 CP 07/17,Built 05/17,Holden,Trax,TJ": 0.9638256430625916,
"MY17 LT,Wagon, 5 Seats, 5 Doors": 0.9763248562812805,
"64,631Kms": 0.9987785816192627,
"REFER $16,500": 0.9978040456771851
}
Inference via FastDeploy
import fastdeploy as fd
from fastdeploy import c_lib_wrap as C
import cv2
option = fd.RuntimeOption()
rec_option = option.set_trt_input_shape("x", [1, 3, 48, 10], [10, 3, 48, 320],
                                        [64, 3, 48, 2304])
rec_model = fd.vision.ocr.Recognizer(
    '/fastdeploy_models/rec_runtime/1/model.pdmodel',
    '/fastdeploy_models/rec_runtime/1/model.pdiparams',
    '/fastdeploy_models/rec_postprocess/1/en_dict.txt',
    runtime_option=rec_option)
option.is_dynamic = True
option.set_trt_input_shape("x", [1, 3, 64, 64], [1, 3, 640, 640],
                           [1, 3, 1544, 1544])
det_model = fd.vision.ocr.DBDetector(
    '/fastdeploy_models/det_runtime/1/model.pdmodel',
    '/fastdeploy_models/det_runtime/1/model.pdiparams', runtime_option=option)
img_path = "<jpeg_path>"
im = cv2.imread(img_path)
sys = C.vision.ocr.PPOCRv3(det_model._model, rec_model._model)
sys.predict(im[...,])
Results of FastDeploy
{
"R 12:30": 0.5980047583580017,
"G": 0.15071693062782288,
"": 0.0,
"Melbourne - Fleet & Other Vehicles": 0.9961614012718201,
"A": 0.7049339413642883,
"Watches I Notes": 0.9241747260093689,
"206 CP 09/20, Built 09/20, Toyota, Camry": 0.9750648140907288,
"ASV70R Ascent, Sedan, 5 Seats, 4 Doors": 0.966465175151825,
"3502Km": 0.5585656762123108,
"REER27 50": 0.6220335364341736,
"207 CP 06/19.Built 04/19.Subaru.Forester. S5": 0.9707357883453369,
"MY19 2.5i-S CVT AWD, Wagon, 5 Seats, 5 Doors": 0.9739432334899902,
"872 Ks": 0.34590622782707214,
"REER3100": 0.6966407895088196,
"208 CP 10/18, Built 05/18, Kia, Sportage, QL": 0.9705865979194641,
"MY18 Si AWD Premium,Wagon, 5 Seats, 5": 0.9961617588996887,
"DOO": 0.5707316994667053,
"53 91K": 0.6855024099349976,
"REFER": 0.6617550253868103,
"$242 50": 0.549923837184906,
"209 CP 07/17,Built 05/17,Holden,Trax,TJ": 0.981033205986023,
"MY17 LT, Wagon, 5 Seats, 5 Doors": 0.9695397615432739,
"64631K": 0.5456709861755371,
"REFO16500": 0.5376154780387878,
"1": 0.18550123274326324,
"L": 0.14091278612613678
}
You can see a significant drop in accuracy and detections between the two methods. The following image was used on our end to produce these results.
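To put a number on the drop, the two result dumps can be paired up and compared. The following is only a sketch: it copies a few entries from the dumps above, pairs each PaddleOCR string with its most similar FastDeploy string using `difflib` from the standard library, and reports the confidence difference. The helper names `best_match` and `compare` are made up for this example, and string similarity is just one possible way to pair the results.

```python
from difflib import SequenceMatcher

# A few entries copied from the two result dumps above (text -> confidence).
paddleocr_results = {
    "35,002Kms": 0.9965813159942627,
    "REFER $27,750": 0.9993845820426941,
    "87,290Kms": 0.9946824312210083,
    "Doors": 0.9997485876083374,
}
fastdeploy_results = {
    "3502Km": 0.5585656762123108,
    "REER27 50": 0.6220335364341736,
    "872 Ks": 0.34590622782707214,
    "DOO": 0.5707316994667053,
}


def best_match(text, candidates):
    """Return (candidate, similarity) for the candidate most similar to text."""
    scored = [(c, SequenceMatcher(None, text, c).ratio()) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


def compare(reference, other):
    """Pair each reference string with its closest counterpart in `other`."""
    rows = []
    for text, ref_conf in reference.items():
        match, similarity = best_match(text, other.keys())
        # Positive value: the reference pipeline was more confident.
        rows.append((text, match, similarity, ref_conf - other[match]))
    return rows


for text, match, similarity, conf_drop in compare(paddleocr_results, fastdeploy_results):
    print(f"{text!r} -> {match!r}  similarity={similarity:.2f}  confidence drop={conf_drop:.2f}")
```

Running the same comparison over the full dumps would also show which lines are unaffected, for example the long description lines, which score above 0.96 in both pipelines.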
Got it, I will help check this problem.
@yunyaoXYY I tried the same on your sample image and got the following results:
{
"FastDeploy": {
"LRE": 0.3414466977119446,
"tJYj 155": 0.5437671542167664,
"1L4": 0.4430263042449951,
"T252935": 0.9433053135871887
},
"PaddleOCR": {
"15": 0.8533188104629517,
"252935": 0.9908719658851624
}
}
Sample Image
I think if you want to use PaddleOCR on a picture like this, which contains Chinese, you should change the label file to: https://gitee.com/paddlepaddle/PaddleOCR/raw/release/2.6/ppocr/utils/ppocr_keys_v1.txt
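A quick way to see why the label file matters is to check which characters a given dictionary can produce at all. The sketch below assumes the label file lists one character per line, as `en_dict.txt` and `ppocr_keys_v1.txt` do. The tiny dictionary, the sample text, and the helper names `load_labels` and `uncovered_characters` are all hypothetical and only stand in for the real files.

```python
import os
import tempfile


def load_labels(label_path):
    """Read a recognition dictionary: one character per line."""
    with open(label_path, encoding="utf-8") as f:
        return {line.rstrip("\n") for line in f if line.rstrip("\n")}


def uncovered_characters(text, labels):
    """Return the characters of `text` that the dictionary cannot produce."""
    return sorted({ch for ch in text if ch != " " and ch not in labels})


# Tiny stand-in for an English-only dictionary (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as f:
    f.write("\n".join("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"))
    label_path = f.name

labels = load_labels(label_path)
# Hypothetical plate-like text mixing a Chinese character with digits.
print(uncovered_characters("京A 252935", labels))
os.remove(label_path)
```

Any character reported here can never appear in the recognizer's output with that dictionary. Note that the recognition model and the label file have to match, so swapping only the dictionary is not enough if the model was trained for a different character set.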
Hi, I found one reason for the difference in accuracy: the batch size of FastDeploy PP-OCR is not equal to that of PaddleOCR. We will fix this and let users set the batch size of PP-OCR in FastDeploy, which will make the accuracy match that of PaddleOCR.
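To illustrate why batch size alone can change recognition output, here is a minimal sketch. It assumes that crops are sorted by aspect ratio and that every crop in a batch is padded to the widest aspect ratio in that batch, which is, to my understanding, roughly how PaddleOCR's Python recognizer prepares its input. Whether FastDeploy 1.0.0 batches the same way is not stated in this thread. The crop sizes and the helper name `padded_widths` are made up, and details such as a minimum width are left out.

```python
import math

REC_HEIGHT = 48  # recognition input height, taken from the TRT shapes in the script above


def padded_widths(crop_sizes, batch_size):
    """Return the width each crop is padded to when crops are batched.

    crop_sizes: list of (width, height) tuples for the detected text crops.
    Assumption: crops are sorted by aspect ratio, split into batches, and every
    crop in a batch is padded to the widest aspect ratio found in that batch.
    """
    order = sorted(range(len(crop_sizes)),
                   key=lambda i: crop_sizes[i][0] / crop_sizes[i][1])
    widths = [0] * len(crop_sizes)
    for start in range(0, len(order), batch_size):
        batch = order[start:start + batch_size]
        max_ratio = max(crop_sizes[i][0] / crop_sizes[i][1] for i in batch)
        batch_width = math.ceil(REC_HEIGHT * max_ratio)
        for i in batch:
            widths[i] = batch_width
    return widths


# Hypothetical crops: a short word, a price, and two long description lines.
crops = [(60, 24), (120, 24), (480, 24), (600, 24)]

print(padded_widths(crops, batch_size=1))
print(padded_widths(crops, batch_size=2))
print(padded_widths(crops, batch_size=6))
```

Under these assumptions the shortest crop keeps its own width with a batch size of 1, but is padded to ten times that width when all four crops share one batch, so the model sees a different input tensor for the same text.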
Hello, was this problem eventually resolved?