
Attempt to add the `mllama` support

Open q82419 opened this issue 10 months ago • 3 comments

Motivation

This PR attempts to add `mllama` support from the Ollama GitHub repository into the examples of this repository.

The code changes come mainly from the llama patch, the operator patch, and the mllama implementation in the ollama repo.

Goals

  • [x] Mllama implementation (similar to clip in llava)
  • [ ] Model converter from llama-3.2-vision to mllama
  • [ ] Full mllama example and documentation (like the llava example)
  • [x] Support for the unpad operation (see the sketch after this list)
  • [x] Mllama model build and load in llama.cpp
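
For reference, unpad here is the inverse of `ggml_pad`: it trims the trailing elements that padding added to each dimension, restoring the original extents. Below is a minimal standalone sketch of the 2-D case; `unpad_2d` and its parameter names are illustrative and do not reflect the exact operator signature introduced by the ollama patch.

```cpp
#include <cstdio>
#include <vector>

// Trim p0 trailing columns and p1 trailing rows from a row-major
// ne0 x ne1 float buffer -- the inverse of padding each dimension
// at its end, which is what ggml_pad does.
static std::vector<float> unpad_2d(const std::vector<float> & src,
                                   int ne0, int ne1, int p0, int p1) {
    const int out0 = ne0 - p0; // columns kept
    const int out1 = ne1 - p1; // rows kept
    std::vector<float> dst((size_t) out0 * out1);
    for (int i1 = 0; i1 < out1; ++i1) {
        for (int i0 = 0; i0 < out0; ++i0) {
            dst[(size_t) i1 * out0 + i0] = src[(size_t) i1 * ne0 + i0];
        }
    }
    return dst;
}

int main() {
    // A 3x2 buffer that was padded to 4x3 (one trailing column, one trailing row).
    const std::vector<float> padded = {
        1, 2, 3, 0,
        4, 5, 6, 0,
        0, 0, 0, 0,
    };
    for (float v : unpad_2d(padded, /*ne0=*/4, /*ne1=*/3, /*p0=*/1, /*p1=*/1)) {
        printf("%g ", v); // prints: 1 2 3 4 5 6
    }
    printf("\n");
    return 0;
}
```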

Current Status

There are still some issues with this implementation.

  1. Model converter. The example model and projection are not on Hugging Face.

    Currently I fetch the converted model with the ollama application for testing.

  2. The n_vocab (the number of tokens loaded from the model) does not match the tensor dimension.

    n_tokens is 128257, while the dimension of LLM_TENSOR_OUTPUT, for example, is 128256. Something seems to be wrong in the converted model.

  3. As mentioned in 2., some assertions fail when executing the mllama models.

    `ggml_backend_tensor_get_async` and `ggml_backend_tensor_get` fail the tensor-read-out-of-bounds check (see the sketch after this list).
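
For reference, `ggml_backend_tensor_get` guards reads with an assertion of the form `GGML_ASSERT(offset + size <= ggml_nbytes(tensor))`, so a vocab count of 128257 against a tensor whose vocab dimension is 128256 requests exactly one element too many per row read. A minimal sketch of the arithmetic (the 4-byte float element size is an assumption for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

int main() {
    const size_t n_vocab = 128257; // n_tokens reported by the model metadata
    const size_t ne0     = 128256; // actual vocab dimension of the tensor

    const size_t tensor_bytes = ne0     * sizeof(float); // bytes the tensor holds
    const size_t read_bytes   = n_vocab * sizeof(float); // bytes the caller requests

    printf("tensor holds %zu bytes, read requests %zu bytes\n",
           tensor_bytes, read_bytes);

    // Same shape as the check inside ggml_backend_tensor_get:
    // the requested range must fit inside the tensor's allocation.
    const size_t offset = 0;
    assert(offset + read_bytes <= tensor_bytes && "tensor read out of bounds");
    return 0;
}
```

With the mismatched vocab size, the assertion fires, which matches the failures described above.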

q82419 · Feb 04 '25

Thank you for the PR!

There is currently work in progress to introduce a new vision API, and alongside this work there has also been work on supporting mllama (Llama 3.2 Vision Instruct). Regarding the vocab issue, we have had a discussion about this matter which might be of interest.

danbev · Feb 04 '25


Thanks for the information! I'll study this to improve.

q82419 · Feb 04 '25

May I ask whether mllama can be compiled in this PR? I didn't see the relevant CMakeLists.txt changes.

sgwhat · Feb 27 '25

Sorry, we are closing this PR because we have given up on supporting mllama.

q82419 · Aug 25 '25