Killed
Hello, I keep getting this when I run the model:

```
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Killed
```
How can I solve this?
Same question. Did you resolve it?
Not yet.
Pretty sure this is the OOM killer kicking in when you run out of RAM. I saw someone mention it needs ~9.7 GB of RAM available to run on CPU right now, since it isn't quantized and PyTorch forces us to run in 32-bit precision on CPU. Adding swap can help in the short term, but we're also working on adding support for 16-bit/quantized inference.
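If you want to sanity-check that figure, here's a rough back-of-envelope calculation. The parameter count below is my own illustrative assumption, not an official number, but it shows why fp32 weights alone get you close to the reported footprint:

```python
# Back-of-envelope memory estimate for fp32 CPU inference.
# The parameter count is an assumed/illustrative value -- substitute the
# actual model size if you know it.
params = 2.4e9               # assumed parameter count
bytes_per_param = 4          # float32 = 4 bytes per parameter
weights_gib = params * bytes_per_param / 1024**3
print(f"~{weights_gib:.1f} GiB for weights alone")
# -> ~8.9 GiB before activations and runtime overhead, so ~9.7 GB total
#    peak usage is plausible. Halving bytes_per_param (fp16) roughly halves it.
```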
+1 to this issue; it is currently a blocker to using this VLM on a Raspberry Pi.
We have support in llama.cpp now; I'd recommend using that when running on a Raspberry Pi. Can you please try it out and let me know if you run into any issues?
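For anyone trying this from Python, here's a minimal sketch using the llama-cpp-python bindings (`pip install llama-cpp-python`). The GGUF filename and settings are placeholders, not from this thread, and note that a VLM also needs the appropriate multimodal handler on top of plain text completion; this just shows the basic loading path with a quantized file:

```python
# Minimal sketch: run a quantized GGUF model via llama-cpp-python.
# model_path is a placeholder -- point it at your converted/quantized file.
from llama_cpp import Llama

llm = Llama(
    model_path="model-q4_k_m.gguf",  # assumed: a quantized GGUF conversion
    n_ctx=2048,                      # keep the context window small on a Pi to save RAM
    n_threads=4,                     # match the Pi's core count
)

out = llm("Describe the scene in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```

A 4-bit quantization brings the weight footprint down to roughly a quarter of fp32, which is what makes this feasible on a Raspberry Pi's RAM in the first place.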