[Feature]: Is it possible to add an LLM-based OCR service?
Description
Currently, LLM-based OCR (such as Gemini 2.5 Pro or Mistral OCR) is far more accurate than traditional deep-learning OCR models. Could you add interfaces for several of the major LLM OCR services?
Application Scenario
The accuracy of LLM OCR is higher.
References
No response
There are some plugins that can serve your purpose: https://github.com/pot-app/pot-app-plugin-list/blob/main/README.md#%E6%A8%A1%E6%9D%BF At least, I have tried the qwen-VL plugin.
Thank you for your guidance. I’d prefer to use Mistral OCR. And this Qwen plugin doesn't connect via an API; instead, it uses browser cookies. I'm not very fond of this non-standard approach.