PaddleOCR
Training own language
Discussed in https://github.com/PaddlePaddle/PaddleOCR/discussions/8090
Originally posted by JonhSilver October 25, 2022
How do I train my own language? Is there a proper step-by-step tutorial with no steps skipped?
This one is not working: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/README.md#language_requests
This one does not work either, as there is no [ppocr/utils/corpus] folder at all: https://github.com/PaddlePaddle/PaddleOCR/issues/1048
It is impossible to understand this from the existing documentation. I have been trying PaddleOCR for a month and still can't find any useful information:
- What is the difference between Paddle, PaddlePaddle, PaddleOCR, and PPStructure?
- Are the trained models the same for PaddleOCR and PPStructure, or do they differ?
- How do I disable this shit in the results?
- How do I train PaddleOCR and PPStructure on the Lithuanian language?
- Can PaddleOCR and PPStructure OCR PDFs directly, or do PDFs always need to be converted to images first?
- How do I recognize handwritten text in historical documents?
- What is the PP-OCR Series Model List for, and where and when is each model used?
- The yml file contains data_dir: ./train_data, but I can't find this directory anywhere.
- The yml file contains many options. What is each option for? Where can I find the documentation?
- How do I use PaddleOCR and PPStructure with multiple languages? lang="en,fr" does not work.
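
For the custom-language and ./train_data questions above, here is a rough sketch of what I understand the recognition training inputs to look like. It is not an official recipe. It assumes the label file holds one sample per line as an image path and its transcription separated by a tab, and that the character dictionary is a UTF-8 file with one character per line. The function name and file names are hypothetical. As far as I can tell, the ./train_data directory is not shipped with the repository and has to be created by the user.

```python
from pathlib import Path


def build_char_dict(label_file: Path, dict_file: Path) -> list[str]:
    """Collect every distinct character used in the transcriptions.

    Assumed label format (one sample per line): <image path>\t<text>
    Assumed dictionary format: one character per line, UTF-8.
    """
    chars: set[str] = set()
    for line in label_file.read_text(encoding="utf-8").splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        _image_path, text = line.split("\t", 1)
        chars.update(text)

    # Assumption: the space character is switched on separately in the
    # training config, so it is left out of the dictionary file here.
    chars.discard(" ")

    ordered = sorted(chars)
    dict_file.write_text("\n".join(ordered) + "\n", encoding="utf-8")
    return ordered
```

The resulting dictionary file would then be referenced from the yml config, next to the data_dir and label file entries. Which config keys to edit is something to check against the config file you are actually using.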