
[Usage] Inconsistent OCR Results with LLaVA 1.6 and Ollama vs. Online Demo

Open arcaweb-ch opened this issue 1 year ago • 3 comments

Describe the issue

Issue:

I've been testing LLaVA 1.6 with Ollama for OCR tasks and noticed that the online demo at https://llava.hliu.cc consistently outperforms my local tests, despite using identical prompts and parameters. This discrepancy makes me wonder if there's a difference in implementation or configurations between the online demo and the local version I'm using.

Could you provide any insights into this matter or suggest how to achieve parity with the demo's results?

Thanks for your help.

Reference image: example from Wikipedia

Prompt:

find the total in the receipt
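To make the local side of this comparison reproducible, the request can be sent to Ollama's REST API with the sampling parameters spelled out. The following is a minimal sketch, not the demo's actual configuration: the model tag `llava:v1.6`, the temperature of 0, and the helper names `build_payload` and `query_ollama` are assumptions made for illustration, and the endpoint is Ollama's default local address.

```python
import base64
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes, prompt, model="llava:v1.6", temperature=0.0):
    """Build a non-streaming /api/generate body with one attached image.

    The model tag and temperature are assumptions; use whatever tag
    `ollama list` shows and the parameters you want to compare against.
    """
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
        "options": {"temperature": temperature},
    }


def query_ollama(payload, url=OLLAMA_URL):
    """POST the payload to Ollama and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With a server running, `query_ollama(build_payload(open("receipt.jpg", "rb").read(), "find the total in the receipt"))` would return the model's answer; `receipt.jpg` is a placeholder file name. Fixing the temperature removes sampling noise from the comparison, although it cannot compensate for differences in how the image is preprocessed.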

arcaweb-ch avatar Feb 10 '24 16:02 arcaweb-ch

Ollama uses a non-optimal version of llama.cpp to convert and run LLaVA 1.6; this PR should solve the problem.

wrapss avatar Feb 10 '24 16:02 wrapss

> ollama uses a non-optimal version of llama.cpp to convert and use llava 1.6, this PR should solve the problem.

Thanks, waiting for it.

arcaweb-ch avatar Feb 11 '24 11:02 arcaweb-ch

This may be related, as well? https://github.com/haotian-liu/LLaVA/issues/1497

ChristianWeyer avatar May 17 '24 09:05 ChristianWeyer