YomiNinja
Better vertical text recognition
I've played around with this tool a bit and I've noticed it's got a lot of potential. I've tried games and for the most part it worked flawlessly, even OUTSIDE of games: because it's an OCR, literally anything with text can be mined, which I think is what gives this tool so much potential. However, I've noticed vertical text gives me a weird problem. The OCR seems to get confused when it comes to vertical text and it's kinda buggy, so I think that should be fixed. But thank you for the amazing tool, you're doing an amazing job!
Thank you for your kind words and feedback! I've noticed some challenges, especially when dealing with manga and similar formats. To better address the issue, could you please provide an example of a problematic result? This will help me determine whether the issue arises from the OCR process itself or from the subsequent presentation of the results. If possible, also provide the original image (without the overlay) so I can use it for testing.
Yes ofc, for this I just grabbed a random manga, but it can get a lot worse with a lot of text.
As you can see, the OCR is getting confused and is trying to go line by line like horizontal text, and the OCR boxes are overlapping. This makes it much harder to mine. But if you're determined enough you can mine it. However, this is only with these types. The others you couldn't even if you wanted to. It would have been much more effective if it just used one box.
To show an example of just how bad it can get with more text, check this one down below:
These are unmineable for the most part, especially the bottom right. I've provided the original images on Imgur right here: https://imgur.com/a/gSkl4Hv I hope this helps with making the tool better!
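To make the "just use one box" idea concrete, here is a minimal sketch of merging overlapping OCR boxes into a single box per text block. This is not YomiNinja's code: the `Box` type, the `merge_overlapping` function, and the assumption that boxes are axis-aligned rectangles in pixel coordinates (origin at the top left) are all hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    # Hypothetical axis-aligned OCR box in pixels, origin at the top left.
    left: float
    top: float
    right: float
    bottom: float


def overlaps(a: Box, b: Box) -> bool:
    # Two rectangles overlap when they intersect on both axes.
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def union(a: Box, b: Box) -> Box:
    # Smallest rectangle that contains both boxes.
    return Box(min(a.left, b.left), min(a.top, b.top),
               max(a.right, b.right), max(a.bottom, b.bottom))


def merge_overlapping(boxes: list[Box]) -> list[Box]:
    # Repeatedly absorb any already-merged box that overlaps the current one,
    # so chains of overlapping boxes collapse into a single box.
    merged: list[Box] = []
    for box in boxes:
        current = box
        changed = True
        while changed:
            changed = False
            for other in merged:
                if overlaps(current, other):
                    merged.remove(other)
                    current = union(current, other)
                    changed = True
                    break  # restart the scan with the enlarged box
        merged.append(current)
    return merged


boxes = [Box(0, 0, 10, 10), Box(5, 5, 15, 15), Box(100, 100, 110, 110)]
result = merge_overlapping(boxes)
```

With these three boxes, the first two overlap and collapse into one, while the third stays separate. A real overlay would also need to decide the reading order of the text inside each merged box, which this sketch does not attempt.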
Your examples were very helpful. Thank you for sharing! I've made some improvements to the overlay, and the current results are as follows:
I still don't think it is currently suitable for manga, as further improvements are still needed for the current OCR engine. Once additional OCR engines are integrated, which will happen very soon, the results should be significantly better.
Wow, while not 100% perfect, I think that's a significant improvement from before, and yeah, I agree. I will be waiting to see what you've got planned for the future!
Update: The current v0.7.2 offers better vertical text recognition. However, significant improvements are coming in v0.8.
v0.7.2 - Manga OCR + Paddle Text Detector
v0.8 - Manga OCR + Comic Text Detector
v0.7.2 - Google Lens
v0.8 - Google Lens