Avalon
I ran into the same issue here. The web demo ran the "llava-v1.6-34b" version of llava, while Ollama provides the "34b-v1.6" tag of the model, and the results of the same queries against these two...
Even if I can cache it, the inference time will still increase because of the required example prompt, right?
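To make the caching idea concrete, here is a minimal sketch of what I have in mind, assuming Ollama's `/api/generate` endpoint and the `context` field it returns (the model tag `llava:34b-v1.6` and the file paths are placeholders): the first call pays the full cost of the example prompt, and later calls pass the returned `context` back so those tokens are not reprocessed on every query.

```python
import base64
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint
MODEL = "llava:34b-v1.6"  # placeholder tag; use whatever tag was actually pulled

def ask(prompt, image_path=None, context=None):
    """Send one non-streaming generate request; return (answer, context)."""
    payload = {"model": MODEL, "prompt": prompt, "stream": False}
    if image_path:
        with open(image_path, "rb") as f:
            payload["images"] = [base64.b64encode(f.read()).decode()]
    if context:
        # Reuse the conversation state from a previous call so the example
        # prompt does not have to be evaluated again.
        payload["context"] = context
    r = requests.post(OLLAMA_URL, json=payload, timeout=600)
    r.raise_for_status()
    data = r.json()
    return data["response"], data.get("context")

# First call pays the full prompt-evaluation cost of the example prompt.
_, ctx = ask("Example: describe the layout of the attached image.",
             image_path="example.png")

# Follow-up queries reuse the cached context instead of resending the example.
answer, ctx = ask("Now describe this new image.",
                  image_path="query.png", context=ctx)
print(answer)
```

Under that assumption the extra latency from the example prompt is paid once rather than on every request; without some form of reuse, the added example tokens do increase inference time on each call, which is what I was worried about.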