2 comments of Zaibei Li
It is not a bug: the demo uses fp16 precision, while llava:34b-v1.6 is 4-bit quantized, so the performance is not comparable.
Sorry, I couldn't confirm whether it is bf16 or fp16. But based on the recommended VRAM from [llava](https://github.com/haotian-liu/LLaVA) (80 GB) versus the 69 GB listed in [Ollama](https://ollama.com/library/llava:34b-v1.6-fp16), it might be bf16.
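
For a like-for-like comparison, one could pull the fp16 tag explicitly and query it through Ollama's local REST API instead of the default (4-bit quantized) tag. A minimal sketch, assuming Ollama is running on its default port (11434), the fp16 variant has been pulled with `ollama pull llava:34b-v1.6-fp16`, and `example.jpg` is a hypothetical test image:

```python
# Minimal sketch: query the explicitly fp16-tagged llava model via
# Ollama's local REST API rather than the default (4-bit quantized) tag.
# Assumes a local Ollama server on port 11434 and that the fp16 tag
# has already been pulled; the image path is a hypothetical placeholder.
import base64
import json
from urllib import request

MODEL = "llava:34b-v1.6-fp16"   # explicit fp16 tag, not the default quantized one
IMAGE_PATH = "example.jpg"      # hypothetical test image

with open(IMAGE_PATH, "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = json.dumps({
    "model": MODEL,
    "prompt": "Describe this image.",
    "images": [image_b64],
    "stream": False,
}).encode("utf-8")

req = request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```

This keeps the weights in 16-bit precision on both sides of the comparison, so any remaining performance gap is not attributable to quantization.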