[Feature suggestion] Consider using manga-ocr
Nice work on jidoujisho, really useful. Recently I've been experimenting with a new OCR called manga-ocr, and in my experience it has a precision of more than 99%. It's really out of this world.
Example (the page selected was totally random):
https://user-images.githubusercontent.com/25280488/153520966-be319a80-228a-4a7e-a783-807635771abd.mp4
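For anyone who wants to try it, manga-ocr is driven from Python in just a few lines; a minimal sketch based on its README (the image path is just a placeholder):

from manga_ocr import MangaOcr

mocr = MangaOcr()           # downloads the model from Hugging Face on first run
text = mocr("example.jpg")  # also accepts a PIL.Image instead of a path
print(text)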
I'm looking into this. Since manga-ocr uses Python, I'll have to hook it in via Chaquopy if I ever make use of it.
Hi, manga-ocr author here. I'm glad to see the interest in this project, and would love to see it integrated with jidoujisho!
One alternative to Chaquopy would be to export the model to the ONNX format and run it on Android using onnxruntime.
https://huggingface.co/docs/transformers/serialization
https://onnxruntime.ai/docs/tutorials/mobile/
The hard part here is the ONNX export. I think it should be possible, but it might be tricky. I've been meaning to try it myself at some point, but I'm not sure when I'll find time for that. There's also some Python logic which would need to be ported, but it's rather lightweight, so it shouldn't be a problem.
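For a sense of how lightweight that logic is, here's roughly what driving the model through Hugging Face Transformers looks like; a minimal sketch, where the grayscale round-trip and the space stripping only approximate manga-ocr's actual pre- and post-processing:

from PIL import Image
from transformers import AutoTokenizer, ViTFeatureExtractor, VisionEncoderDecoderModel

name = "kha-white/manga-ocr-base"
feature_extractor = ViTFeatureExtractor.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)
model = VisionEncoderDecoderModel.from_pretrained(name)

# manga-ocr works on grayscale crops; the ViT encoder still expects 3 channels
img = Image.open("crop.png").convert("L").convert("RGB")
pixel_values = feature_extractor(img, return_tensors="pt").pixel_values
token_ids = model.generate(pixel_values)[0]
# the Japanese BERT tokenizer decodes with spaces between tokens; strip them
text = tokenizer.decode(token_ids, skip_special_tokens=True).replace(" ", "")
print(text)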
Just wanted to mention this; I don't actually know if it's a better option than Chaquopy (it probably depends on how well Chaquopy can deal with the PyTorch and Hugging Face dependencies).
Hey, @kha-white. I was thinking of contacting you about this by e-mail, then I remembered that you had already reached out some time ago.
I cut the Viewer from my 2.0 release because I wanted to try integrating your work and using it on Android, but I had trouble running it with Chaquopy. There are some issues in their repo related to fugashi and pyclipper; the latter is the one that ultimately errored out for me.
To be honest, I'm not at all literate in machine learning and have never gotten hands-on with it (though I'd like to use this opportunity to do so). If you can spare the time, I'd like to work together to make this a possibility.
Would really love to see this in the app!
Would it be hard to implement a screen cut-and-paste feature like in manga-ocr, supposing the OCR API is already working?
"The hard part here is the onnx export"
Export of VisionEncoderDecoder models to ONNX has now been merged into Hugging Face Transformers.
https://github.com/huggingface/transformers/pull/19254
I ran the following command and the files were produced without error:
python -m transformers.onnx --model=models--kha-white--manga-ocr-base/ --feature=vision2seq-lm onnx/ --atol 1e-3
I haven't tried doing inference yet, but it seems feasible.
Yeah, I've tried that too. The export seems fine, but inference is not as straightforward, since you need to do a beam search or something similar. Definitely doable, but it requires some work.
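To sketch what that work involves: once the encoder and decoder live in separate ONNX graphs, even a plain greedy loop (effectively a beam search with beam width 1) is enough to decode text. A rough sketch with onnxruntime in Python; the file names and graph input names here are assumptions based on how Transformers splits encoder-decoder exports, so check them against the actual exported files:

import numpy as np
import onnxruntime as ort

encoder = ort.InferenceSession("onnx/encoder_model.onnx")
decoder = ort.InferenceSession("onnx/decoder_model.onnx")

def greedy_decode(pixel_values, start_id, eos_id, max_len=128):
    # The encoder runs once per image; its hidden states are reused at every step.
    encoder_hidden_states = encoder.run(None, {"pixel_values": pixel_values})[0]
    token_ids = [start_id]
    for _ in range(max_len):
        logits = decoder.run(None, {
            "input_ids": np.array([token_ids], dtype=np.int64),
            "encoder_hidden_states": encoder_hidden_states,
        })[0]
        next_id = int(logits[0, -1].argmax())  # greedy: most likely next token
        token_ids.append(next_id)
        if next_id == eos_id:
            break
    return token_ids

A real beam search keeps the top few partial sequences per step instead of just the argmax, but the session plumbing stays the same, and the same loop should port over to onnxruntime's Java/Kotlin bindings on Android.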
I was wondering, would this or this be of any use? I have a lot of spare time at the moment to study up on this, but ML isn't my field at all, so I could use some guidance on what I might need to get this to work.
Closing as per #75.