mobeetle

Results 2 comments of mobeetle

Hi, thanks for checking. When I recreate your test, it is working. The problem seems to be when using JSON schema with 0.2.84/5. Working with 0.2.83. Please find attached Jupyter...

No problem, here it is: from llama_cpp import Llama from llama_cpp.llama_speculative import LlamaPromptLookupDecoding model_path = "/Users/macmacmac/Documents/CODING/models/Hermes-2-Pro-Mistral-7B.Q4_K_M.gguf" model = Llama( model_path=str(model_path), draft_model=LlamaPromptLookupDecoding( num_pred_tokens=13, max_ngram_size=9 ), n_ctx=8192, n_batch=128, last_n_tokens_size=128, n_gpu_layers=-1, f16_kv=True, offload_kqv=True,...