
det_mv3_east hmean ~95%, but very few text instances correctly detected

Open gavinfaimdata opened this issue 2 years ago • 1 comments

  • System Environment: Debian 10, CUDA 11.3, Nvidia V100
  • Version: paddlepaddle-gpu==2.0.0, PaddleOCR release==2.5 (cloned repo)
  • Command Code: python3 tools/infer/predict_det.py --image_dir="./test_data/test_images" --det_model_dir="./inference/det_mv3_east_3" --det_algorithm="EAST"

I used the det_mv3_east config file and changed only a few things (aside from pointing the dataset variables to my training/test data):

  • pretrained_model: I used the model provided in the download link on the docs page for this model
  • epochs: 2000
  • name: cosine (lr scheduler)
  • warmup epochs: 2

My detection model performs far worse than I expected. The majority of vertical text instances have their edges cut off, and so does a significant portion of horizontal text instances. Below is an example of a common error.

The hmean of my model is ~95%, but in reality only about 15% of text instances are properly detected. The training/test/val split was done properly. I have 3000 images in my dataset.
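One way a gap like this can arise: hmean is the harmonic mean of precision and recall, and a detection usually counts as correct once its IoU with a ground-truth box passes a threshold. The following is only a minimal sketch, assuming axis-aligned boxes, greedy matching, and a 0.5 IoU threshold (all assumptions for illustration; the function names are hypothetical and this is not PaddleOCR's evaluator, which works on polygons):

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def hmean(gts, preds, iou_thresh=0.5):
    """Greedy one-to-one matching, then harmonic mean of precision and recall."""
    matched_gt = set()
    hits = 0
    for p in preds:
        for i, g in enumerate(gts):
            if i not in matched_gt and iou(g, p) >= iou_thresh:
                matched_gt.add(i)
                hits += 1
                break
    precision = hits / len(preds) if preds else 0.0
    recall = hits / len(gts) if gts else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A 100 px wide ground-truth word, and a prediction that cuts 20 px off each end.
gt = (0, 0, 100, 20)
clipped = (20, 0, 80, 20)

clipped_iou = iou(gt, clipped)   # intersection 1200 / union 2000
score = hmean([gt], [clipped])   # the clipped box still counts as a hit
```

In this toy case the clipped box has an IoU of 0.6, so it passes the threshold and hmean comes out as 1.0, even though the crop would lose the first and last characters of the word. If the evaluation works this way, a high hmean and visibly truncated boxes are not contradictory.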

I'm wondering if anyone has encountered a similar issue where the accuracy is way lower than expected with paddlepaddle text detection and if anyone has tips they could share on how I can mitigate this issue.

Screenshot from 2022-08-03 16-26-38_cropped

Aug 03 '22 20:08 gavinfaimdata

I think the reason for the inaccuracy is related to the aspect ratio of the ground truth bounding boxes. More precisely, shorter words get detected with much higher accuracy than longer words (which tend to get cut off at each end, as in the picture).
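A quick way to check this hypothesis is to group ground-truth boxes by aspect ratio and compare recall per group. This is a rough sketch, assuming axis-aligned boxes with non-zero width and height and a per-box flag saying whether it was properly detected; the bucket edges and function names are arbitrary choices for illustration:

```python
from bisect import bisect_right
from collections import defaultdict


def aspect_ratio(box):
    """Long side over short side of an axis-aligned (x1, y1, x2, y2) box."""
    w, h = box[2] - box[0], box[3] - box[1]
    return max(w, h) / min(w, h)


def recall_by_aspect_ratio(samples, edges=(2, 5, 10)):
    """samples: iterable of (gt_box, was_detected).

    Bucket 0 holds ratios below edges[0], bucket 1 holds ratios in
    [edges[0], edges[1]), and so on. Returns {bucket_index: recall}.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for box, detected in samples:
        bucket = bisect_right(edges, aspect_ratio(box))
        totals[bucket] += 1
        hits[bucket] += int(detected)
    return {b: hits[b] / totals[b] for b in sorted(totals)}


# Made-up example data, not results from the model in this issue.
samples = [
    ((0, 0, 30, 20), True),     # ratio 1.5
    ((0, 0, 40, 20), True),     # ratio 2.0
    ((0, 0, 160, 20), False),   # ratio 8.0
    ((0, 0, 20, 240), False),   # ratio 12.0 (vertical text)
    ((0, 0, 300, 20), True),    # ratio 15.0
]
per_bucket = recall_by_aspect_ratio(samples)
```

If recall drops steadily as the bucket index grows, that supports the idea that long or tall text instances are the ones being truncated.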

Aug 04 '22 15:08 gavinfaimdata

> The hmean of my model is ~95%, but in reality only about 15% of text instances are properly detected

I think this is a case of overfitting. I recommend adding data augmentation and setting a larger weight decay parameter during training.

Aug 26 '22 07:08 LDOUBLEV