Igor Schlumberger
The Llama3.1 models are being updated: https://ollama.com/library/llama3.1/tags, and 405B is provided in multiple quantizations (Q2... Q8). @gileneusz Please close this issue to keep the number of open issues under 1000.
Why Quantization to 1 Bit (Q1) is Ineffective: Loss of Precision: Quantizing to 1 bit means each weight and activation is represented by a single bit, allowing only two states...
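The precision loss described above can be illustrated numerically. The sketch below (not Ollama's or llama.cpp's actual quantization kernels, just a simplified model) compares the reconstruction error of 1-bit sign quantization against 8-bit uniform quantization on random Gaussian weights:

```python
import numpy as np

# Simplified illustration of why 1-bit quantization loses so much precision:
# compare mean squared reconstruction error of a 1-bit scheme (two states)
# against an 8-bit scheme (256 states) on synthetic weights.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 1.0, 10_000).astype(np.float32)

# 1-bit: every weight collapses to +scale or -scale (only two representable values).
scale1 = np.mean(np.abs(w))
w_q1 = np.sign(w) * scale1

# 8-bit: 256 uniform levels spanning the weight range.
lo, hi = float(w.min()), float(w.max())
step = (hi - lo) / 255.0
w_q8 = lo + np.round((w - lo) / step) * step

err1 = float(np.mean((w - w_q1) ** 2))
err8 = float(np.mean((w - w_q8) ** 2))
print(f"1-bit MSE: {err1:.4f}, 8-bit MSE: {err8:.6f}")
```

With only two states per value, the 1-bit error is orders of magnitude larger than the 8-bit error, which is why the lowest quantization Ollama publishes for 405B is Q2, not Q1.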
@kozuch Interesting, I will try the 405b-instruct-q2_K model on my Mac Studio with 192 GB of RAM and try to work with it. I've seen that we can ask macOS...
@gileneusz I searched on Google for the error "llama runner process has terminated: error: done_getting_tensors: wrong number of tensors", and it seems that this issue should be resolved in the latest...
@gileneusz I will use Ollama for translation, so I hope that 405B works well. I will see. Thank you.
I tried some prompts and yes, the results are worse with 405b:q3_K_S than with 70b_q8_0. I ran ollama run llama3.1:405b:q3_K_S, and after it loaded and started swapping, I got an error: Killed....
Hi @olafgeibig There are 3 LLaVA models, and the biggest is the best. I cropped the image you provided to keep only the part with the text, in black and white...
@marksalpeter Which version of the model are you using? It seems like llava:34b-v1.6 works better. I'm trying to install this version from Hugging Face: https://huggingface.co/llava-hf/llava-v1.6-34b-hf but have to spend...
@olafgeibig On Hugging Face, did you test with the 7B model? The 34B model works fine with Ollama, so it could be an issue with the 7B model itself.
I ran the test again with ollama run llava:13b on version 0.1.32, and I got an answer that is quite good, as the AI always rewrites the text it recognizes through...
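For scripting this kind of image test instead of using the interactive ollama run prompt, Ollama's REST API (POST /api/generate) accepts base64-encoded images for vision models like LLaVA. Below is a minimal sketch that only builds the JSON payload; the model name and image path are placeholders, and actually sending it requires a running Ollama server on the default port:

```python
import base64
import json

def build_llava_request(image_path: str, prompt: str, model: str = "llava:13b") -> str:
    """Build the JSON body for Ollama's /api/generate endpoint with one image.

    The image is sent as a base64 string in the "images" list, per the
    Ollama API. This only constructs the payload; POST it to
    http://localhost:11434/api/generate to get a response.
    """
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return json.dumps({
        "model": model,          # placeholder model tag
        "prompt": prompt,
        "images": [image_b64],
        "stream": False,         # return one complete response object
    })
```

Setting "stream" to false makes the server return a single JSON object instead of a stream of tokens, which is simpler for batch testing several prompts against the same image.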