Dougie777

Results 5 issues of Dougie777

I don't have time to fork this, so I will just post the fixed code here and a link to a working jsfiddle. UPDATE: I added the ability to put...

For example, openchat 3.5 wants this prompt template format: `GPT4 User: {prompt}GPT4 Assistant:` I tried a few things and managed to crash the server, so I am stuck. Can anyone...
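To make the template question concrete, here is a minimal sketch of filling that format in plain Python. The names `OPENCHAT_TEMPLATE` and `build_prompt` are hypothetical and not part of any server API. The `sep` argument is also an assumption: the template as quoted shows nothing between the user turn and the assistant turn, so it defaults to an empty string, but a model may expect a separator token there.

```python
# Hypothetical helper: not from any library, just plain string formatting.
# The template text is copied from the post; {sep} is an added placeholder
# (assumption) for an optional turn separator between the two roles.
OPENCHAT_TEMPLATE = "GPT4 User: {prompt}{sep}GPT4 Assistant:"


def build_prompt(user_message: str, sep: str = "") -> str:
    """Return the full prompt string for a single user message."""
    return OPENCHAT_TEMPLATE.format(prompt=user_message, sep=sep)


if __name__ == "__main__":
    # With the default empty separator this reproduces the quoted format.
    print(build_prompt("What is a GGUF file?"))
```

Building the string by hand like this is a quick way to check what the model should receive before wiring the same format into a server's template setting.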

Could there be some new format of gguf that we need to update the code for or something?

How to reproduce:

**1) Model being used:** `wizardlm_70b_q4_gguf = LlamaCppModel(model_path="wizardlm-70b-v1.0.Q4_K_M.gguf", max_total_tokens=4096, use_mlock=False)` (the model file is a manual download)

**2) From swagger run this query against the chat completion endpoint. Please note...

`error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024` The exact same settings and quantization work for 7B and 13B. Here is...
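One possible reading of the two shapes in that message is grouped-query attention, where the key/value projections use fewer heads than the query projection. The sketch below only does the arithmetic. The head counts (64 query heads, 8 key/value heads) are assumed values for a 70B Llama-style model, and `kv_projection_shape` is a hypothetical helper, not a llama.cpp function.

```python
# Hypothetical helper for the shape arithmetic; not part of llama.cpp.
def kv_projection_shape(n_embd: int, n_head: int, n_kv_head: int) -> tuple[int, int]:
    """Shape of the key (wk) projection for a given attention layout."""
    head_dim = n_embd // n_head           # width of one attention head
    return (n_embd, head_dim * n_kv_head)  # K/V only cover n_kv_head heads


if __name__ == "__main__":
    # Assumed values: 8192 hidden size and 64 query heads.
    # A loader that assumes one K/V head per query head expects a square matrix.
    print(kv_projection_shape(8192, n_head=64, n_kv_head=64))
    # With 8 shared K/V heads (assumption) the matrix is 8x narrower.
    print(kv_projection_shape(8192, n_head=64, n_kv_head=8))
```

Under those assumptions the first call gives the "expected" shape and the second gives the "got" shape. That would also fit the 7B and 13B models loading fine, if they use one key/value head per query head.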