feat: add CAN model

Open zhangjunlongtech opened this issue 1 year ago • 0 comments

Motivation

  • MindOCR does not currently support handwritten mathematical formula recognition. This PR addresses that gap by contributing a CAN model.

  • Handwritten Mathematical Expression Recognition (HMER) models mostly use an encoder-decoder architecture with attention. On long or complex formulas, however, the attention module cannot be relied on to focus on the correct region of interest. The Counting-Aware Network (CAN) uses a Multi-Scale Counting Module to improve recognition accuracy: it introduces counting vectors, which provide global information, and spatial position encodings, which provide position information.

  • The CAN model consists of a backbone module and a head module. For the backbone, see rec_densenet.py. For the head, see rec_can_head.py.
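
As a rough illustration of the counting vectors mentioned above: in the CAN paper, as I understand it, each counting branch passes its output through a sigmoid to get a per-class counting map, and sum-pooling that map over all spatial positions gives the counting vector. The vectors from branches with different kernel sizes are then averaged. The sketch below covers only the pooling and averaging step in plain Python. The function names and the toy input are hypothetical and are not taken from rec_can_head.py.

```python
import math


def sigmoid(x):
    """Logistic function, mapping a logit to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def counting_vector(logits):
    """Sum-pool a counting map into a counting vector.

    logits: list of C channel maps (one per symbol class), each an
            H x W nested list of floats.
    Returns a list of C floats, where entry i approximates how many
    times symbol class i occurs in the image.
    """
    return [
        sum(sigmoid(value) for row in channel for value in row)
        for channel in logits
    ]


def average_branches(vec_a, vec_b):
    """Average the counting vectors of two branches element-wise."""
    return [(a + b) / 2.0 for a, b in zip(vec_a, vec_b)]


# Toy input (hypothetical): 2 symbol classes on a 2 x 3 feature map.
# Class 0 fires strongly at two positions, class 1 fires nowhere.
toy_logits = [
    [[10.0, -10.0, 10.0], [-10.0, -10.0, -10.0]],
    [[-10.0, -10.0, -10.0], [-10.0, -10.0, -10.0]],
]

toy_counts = counting_vector(toy_logits)
```

Because every spatial position contributes to every entry, the counting vector summarizes the whole image, which is the sense in which it supplies global information to the decoder.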

Test Plan

  • A model unit test file, test_can_model, is provided. It sets the model parameters and the test case parameters.
  • The test exercises the BaseModel class in mindocr/models/base_model.py. The output tensor shapes match the expected shapes.

image

  • To verify the backbone output, the build_backbone method in mindocr/models/backbones/builder.py was tested with the rec_densenet backbone used by the CAN model. The output tensor shape matches the one required by the paper.

image

  • The build_head method in mindocr/models/heads/builder.py was tested on the backbone's output.

image
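
The shape checks in the test plan can be sketched roughly as follows. This is a minimal illustration and not the contents of test_can_model: the helper names are hypothetical, and the channel count and downsampling ratio are placeholder values assumed for the example. In the real test, the actual shape would come from the output tensor of the backbone or head.

```python
def expected_feature_shape(input_shape, out_channels, stride):
    """Return the (N, C, H, W) shape expected from a backbone.

    Assumes the backbone shrinks height and width by `stride` using
    floor division; a real network may round differently depending on
    its padding and pooling settings.
    """
    n, _, h, w = input_shape
    return (n, out_channels, h // stride, w // stride)


def check_shape(actual, expected):
    """Raise AssertionError with a readable message when shapes differ."""
    assert tuple(actual) == tuple(expected), (
        f"got {tuple(actual)}, expected {tuple(expected)}"
    )


# Placeholder values, assumed for illustration only.
HYPOTHETICAL_OUT_CHANNELS = 684
HYPOTHETICAL_STRIDE = 16

expected = expected_feature_shape(
    (1, 1, 128, 256), HYPOTHETICAL_OUT_CHANNELS, HYPOTHETICAL_STRIDE
)

# Stand-in for `backbone_output.shape` in a real test.
check_shape((1, 684, 8, 16), expected)
```

Keeping the expected shape in a small helper makes the assumption about the downsampling ratio explicit, so a mismatch points at either the model or the assumption.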

Related Issues and PRs

Related Issues: Model CAN Model Support

zhangjunlongtech • Aug 12 '24 15:08