Does textbsr support Japanese or other languages?

Open Ken1256 opened this issue 2 years ago • 7 comments

Does textbsr support Japanese or other languages?

Ken1256 avatar Jun 07 '23 23:06 Ken1256

Hi, this model is trained on Chinese and English characters, without any semantic constraints, so if the degradation is not severe it can also perform well on other languages (see https://imgsli.com/MTg0NjU0). Note that I use cnstd for text region detection, which is also trained on English and Chinese text. It is better to use a text detection model that matches the target language.

csxmli2016 avatar Jun 08 '23 02:06 csxmli2016

Can you add Japanese text detection to textbsr? If textbsr only changes the text area, you could run text detection after textbsr, and the mask would be more accurate.

Ken1256 avatar Jun 08 '23 06:06 Ken1256

Thanks for your kind suggestion. I am not familiar with Japanese text detection, so I will leave this to other users who are interested in adapting this framework to other languages. Generally, if the text detection is not accurate, the text region probably has low image quality, and textbsr may also fail to restore it. I understand your proposed pipeline, but it does not seem suitable for this simple framework.

csxmli2016 avatar Jun 08 '23 06:06 csxmli2016

I tried EasyOCR for Japanese text detection and the results were impressive. Maybe you could add it to textbsr: https://github.com/JaidedAI/EasyOCR (attached images: test01, test02)
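
For anyone who wants to try this, here is a minimal sketch of how EasyOCR could supply Japanese text regions. `easyocr.Reader` and `readtext` are EasyOCR's own calls. The helper names `quad_to_box` and `detect_text_boxes`, the 0.3 confidence threshold, and the 4-pixel padding are assumptions made for illustration. How the resulting boxes would be passed into textbsr is not shown, since that depends on textbsr's internals.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def quad_to_box(quad: Sequence[Sequence[float]], pad: int = 0) -> Box:
    """Turn a four-point quadrilateral into a padded axis-aligned box.

    The left and top edges are clamped at 0; the right and bottom edges
    are not clamped, so clip them to the image size before cropping.
    """
    xs = [point[0] for point in quad]
    ys = [point[1] for point in quad]
    return (
        max(math.floor(min(xs)) - pad, 0),
        max(math.floor(min(ys)) - pad, 0),
        math.ceil(max(xs)) + pad,
        math.ceil(max(ys)) + pad,
    )


def detect_text_boxes(
    image_path: str,
    langs: Sequence[str] = ("ja", "en"),
    min_conf: float = 0.3,  # assumed threshold, tune for your images
    pad: int = 4,           # assumed padding in pixels
) -> List[Box]:
    """Hypothetical helper: detect text regions with EasyOCR."""
    # Imported lazily so quad_to_box works without EasyOCR installed.
    import easyocr

    reader = easyocr.Reader(list(langs))   # downloads models on first use
    results = reader.readtext(image_path)  # list of (quad, text, confidence)
    return [
        quad_to_box(quad, pad)
        for quad, _text, conf in results
        if conf >= min_conf
    ]
```

Calling `detect_text_boxes("test01.png")` (the file name is a placeholder) would return one box per detected text line, which could then be cropped and restored region by region.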

Ken1256 avatar Jun 08 '23 07:06 Ken1256

Wow, it is great! The performance is impressive. I will incorporate it into this framework by the end of this month. Thanks.

csxmli2016 avatar Jun 08 '23 07:06 csxmli2016

This is a great package. I'd like to use it for Korean. Do you have any training code I could use with a custom dataset? Korean is supported by EasyOCR and PaddleOCR as well.

skunkwerk avatar Oct 20 '23 23:10 skunkwerk

No problem. You can send the request to my email, [email protected].

csxmli2016 avatar Oct 21 '23 01:10 csxmli2016