Training own language

Open JonhSilver opened this issue 2 years ago • 0 comments

Discussed in https://github.com/PaddlePaddle/PaddleOCR/discussions/8090

Originally posted by JonhSilver on October 25, 2022:

How do I train my own language? Is there a proper step-by-step tutorial with no skipped steps? This one does not work:
https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/README.md#language_requests

This one does not work either, because the [ppocr/utils/corpus] folder does not exist at all:
https://github.com/PaddlePaddle/PaddleOCR/issues/1048

The existing documentation is impossible to understand. I have been trying PaddleOCR for a month and still cannot find any useful information:

  1. What is the difference between Paddle, PaddlePaddle, PaddleOCR, and PPStructure?
  2. Are the trained models the same for PaddleOCR and PPStructure, or do they differ?
  3. How do I disable this junk in the results? (image)
  4. How do I train PaddleOCR and PPStructure on the Lithuanian language?
  5. Can PaddleOCR and PPStructure OCR PDFs directly, or do PDFs always need to be converted to images first?
  6. How do I recognize handwritten text in historical documents?
  7. What is the PP-OCR Series Model List for, and where and when is each model used?
  8. The yml file contains data_dir: ./train_data, but I cannot find this directory anywhere.
  9. The yml file contains many options. What does each option do, and where can I find the documentation?
  10. How do I use PaddleOCR and PPStructure with multiple languages? lang="en,fr" does not work.

JonhSilver, Oct 25 '22 11:10