
Fietje 2 model won't download & run

Open jasperslot opened this issue 1 year ago • 2 comments

Describe the bug I have tried to run the Fietje 2 model (https://huggingface.co/BramVanroy/fietje-2-chat-gguf), but it doesn't start. It works in Ollama (https://ollama.com/bramvanroy/fietje-2b-chat:Q8_0) without issues, and there is no clear message explaining why it fails in Jan. I first tried adding the model directly by pasting the Hugging Face URL and choosing the Q8 variant, but the download gets stuck at 0%. I then downloaded the GGUF manually and imported it into Jan. The import works, but when I try to start the model it doesn't start, and I don't get a clear error message either.

Expected behavior I expect the model to run, just like with Ollama.

Environment details

  • Operating System: macOS Sonoma 14.4.1
  • Jan Version: 0.5.0
  • Processor: Apple M3 Max
  • RAM: 64GB

Logs app.log

jasperslot avatar Jun 09 '24 15:06 jasperslot

Thanks @jasperslot! I checked this and got the errors below.

jan:dev: 2024-06-10T03:48:39.878Z [CORTEX]::Error: llama_model_loader: - kv  16:                      tokenizer.ggml.merges arr[str,50000]   = ["Ġ t", "Ġ a", "h e", "i n", "r e",...
jan:dev: llama_model_loader: - kv  17:                tokenizer.ggml.bos_token_id u32              = 50295
jan:dev: llama_model_loader: - kv  18:                tokenizer.ggml.eos_token_id u32              = 50296
jan:dev: llama_model_loader: - kv  19:            tokenizer.ggml.unknown_token_id u32              = 50256
jan:dev: llama_model_loader: - kv  20:            tokenizer.ggml.padding_token_id u32              = 50296
jan:dev: llama_model_loader: - kv  21:                    tokenizer.chat_template str              = {% for message in messages %}{{'<|im_...
jan:dev: llama_model_loader: - kv  22:               general.quantization_version u32              = 2
jan:dev: llama_model_loader: - type  f32:  259 tensors
jan:dev: llama_model_loader: - type q8_0:  194 tensors
jan:dev: 
jan:dev: 2024-06-10T03:48:39.910Z [CORTEX]::Error: llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'phi-2'
jan:dev: llama_load_model_from_file: failed to load model
jan:dev: 
jan:dev: 2024-06-10T03:48:39.912Z [CORTEX]::Error: llama_init_from_gpt_params: error: failed to load model '/Users/_/jan/models/Q8_0/fietje-2b-chat-Q8_0.gguf/fietje-2b-chat-Q8_0.gguf'
jan:dev: 
jan:dev: 2024-06-10T03:48:39.912Z [CORTEX]::Debug: {"timestamp":1717991319,"level":"ERROR","function":"LoadModel","line":168,"message":"llama.cpp unable to load model","model":"/Users/_/jan/models/Q8_0/fietje-2b-chat-Q8_0.gguf/fietje-2b-chat-Q8_0.gguf"}
jan:dev: 20240610 03:48:39.912616 UTC 3911969 ERROR Error loading the model - llama_engine.cc:385
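
The key line is `unknown pre-tokenizer type: 'phi-2'`: the GGUF file's tokenizer metadata names a pre-tokenizer that the llama.cpp build bundled with this Jan version doesn't recognize yet. As a quick sanity check on a GGUF file, you can read its fixed header with a few lines of Python. This is a minimal sketch based on the published GGUF header layout (magic, version, tensor count, metadata KV count); the helper name is mine and it does not use any Jan or llama.cpp API:

```python
import struct

GGUF_MAGIC = 0x46554747  # the bytes b"GGUF" read as a little-endian uint32

def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header from the start of a file's bytes."""
    # uint32 magic, uint32 version
    magic, version = struct.unpack_from("<II", data, 0)
    if magic != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    # uint64 tensor count, uint64 metadata key-value count
    n_tensors, n_kv = struct.unpack_from("<QQ", data, 8)
    return {"version": version, "n_tensors": n_tensors, "n_kv": n_kv}
```

Walking the metadata key-value pairs after this header (to find entries such as `tokenizer.ggml.pre` or the `tokenizer.ggml.*` keys visible in the log above) takes more parsing; in practice the fix here is simply shipping a llama.cpp build that knows the `phi-2` pre-tokenizer.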

namchuai avatar Jun 10 '24 03:06 namchuai

Adding related ticket: https://github.com/ggerganov/llama.cpp/issues/7219

Van-QA avatar Jun 12 '24 03:06 Van-QA

What are the latest updates on this? Can it not be solved? @Van-QA @namchuai

hantran-co avatar Sep 02 '24 07:09 hantran-co

@jasperslot This seems to be fixed upstream in llama.cpp. Closing. Here's my screenshot: [image]

freelerobot avatar Sep 05 '24 09:09 freelerobot

Hi, I can confirm it's working in Jan 0.5.3 now 🎉

jasperslot avatar Sep 07 '24 09:09 jasperslot