moondream
How to run on llama.cpp
I see the support for llama.cpp, but I don't know how to run moondream2
You'll need llama.cpp.
Compile the llama.cpp binaries to get llava-cli:
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
export LLAMA_CUDA=1 # only for NVIDIA CUDA builds
make -j$(nproc)
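Note that newer llama.cpp revisions have moved to CMake and renamed the CUDA switch; if the Makefile route above fails, something like this sketch should work (flag and binary names vary by version, so check your checkout):

```shell
# Newer llama.cpp builds use CMake; GGML_CUDA replaces the old LLAMA_CUDA flag.
cmake -B build -DGGML_CUDA=ON       # drop -DGGML_CUDA=ON for a CPU-only build
cmake --build build --config Release -j"$(nproc)"
# in recent versions the binary may be named llama-llava-cli and land under build/bin/
```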
Launch llava-cli like this:
/whatever/llama.cpp/llava-cli -m /whatever/moondream2-text-model-f16.gguf --mmproj /whatever/moondream2-mmproj-f16.gguf --image /whatever/picture.jpg -p "describe the image" --temp 0.1 -c 2048
This also works with LM Studio: create a directory moondream2 containing the two GGUF files inside your local model directory (mine is /home/alioune/LMStudio/models/alioune/local/). Use the Alpaca preset, set temp to 0.1, upload a picture, prompt "describe image" ... profit!
Just gave moondream2 a test and it performs horribly on llama.cpp. Also, in the prompt you need to specify a template, since moondream2's format is not applied automatically.
My assumption is that there are significant differences in preprocessing or projection.
For more info look here: https://github.com/ggerganov/llama.cpp/issues/8037
That issue also includes the template to use with llava-cli.
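Since llava-cli does not apply moondream2's chat template for you, it has to be baked into the -p argument by hand. A minimal sketch of a wrapper (the Question/Answer template string here is a placeholder assumption; substitute the exact template from the linked issue):

```shell
# Wrap a user prompt in a chat template before passing it to llava-cli via -p.
# The template below is a PLACEHOLDER; use the real one from llama.cpp issue #8037.
wrap_prompt() {
  printf 'Question: %s\n\nAnswer:' "$1"
}

# Example: pass the wrapped prompt to llava-cli
#   llava-cli ... -p "$(wrap_prompt "describe the image")" ...
```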
Any news on this?