Robert Sinclair
```
added 733 packages from 541 contributors and audited 748 packages in 32.626s
74 packages are looking for funding
  run `npm fund` for details
found 11 vulnerabilities (7 moderate, 4...
```
Please allow specifying the font size and possibly the font face.
### Feature request As of now, bitsandbytes only allows quantizing a model uniformly, with every tensor treated the same way. This is OK, but I found out that in most cases the...
Hello! I did some research (using llama.cpp) and I found out that quantizing the input and embed tensors to f16 and the other tensors to q5_k or q6_k gives excellent...
If not, what is the latest commit supporting CLBlast?
Hello @ggerganov! I wish to quantize openai/whisper-large-v3 in my "usual way". With llama.cpp I usually do: `llama-quantize --allow-requantize --output-tensor-type f16 --token-embedding-type f16 model.f16.gguf model.f16.q6.gguf q6_k` And I use convert-hf-gguf...
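The "usual way" described above can be sketched as a two-step pipeline (a minimal sketch: the model directory and output file names are placeholders, and `convert_hf_to_gguf.py` is assumed to be run from a llama.cpp checkout):

```shell
# 1. Convert the Hugging Face model to an f16 GGUF
#    (convert_hf_to_gguf.py ships with llama.cpp)
python convert_hf_to_gguf.py --outtype f16 ./my-model --outfile model.f16.gguf

# 2. Re-quantize to q6_k, keeping the output and token-embedding
#    tensors at f16 so quality-sensitive layers stay high precision
llama-quantize --allow-requantize \
    --output-tensor-type f16 \
    --token-embedding-type f16 \
    model.f16.gguf model.f16.q6.gguf q6_k
```

Keeping the embedding and output tensors at f16 while quantizing the rest is the mixed-precision scheme the earlier comment reported as giving excellent results.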
Using the llama.cpp library, I do: `struct llama_model* model = llama_load_model_from_file(input_model_path, params);` How do I save it back to disk in GGUF format? I'm asking because I wrote a program to...
I ran llama.cpp (latest version) with these parameters:
```
prompt="""
Tell me a long story.
"""
```
`llama-cli --seed 1721414715 -c 4096 -m /content/$m -t $(nproc) -ngl 999 -p "User:...
Model: https://huggingface.co/apple/DCLM-Baseline-7B

`python /content/llama.cpp/convert_hf_to_gguf.py --outtype f16 DCLM-Baseline-7B --outfile DCLM-Baseline-7B.f16.gguf`

OpenLMModel is not supported
`python /content/llama.cpp/convert_hf_to_gguf.py --outtype f16 SmolLM-1.7B-Instruct --outfile SmolLM-1.7B-Instruct.f16.gguf`
```
INFO:hf-to-gguf:Set model tokenizer
WARNING:hf-to-gguf:
WARNING:hf-to-gguf:**************************************************************************************
WARNING:hf-to-gguf:** WARNING: The BPE pre-tokenizer was not recognized!
WARNING:hf-to-gguf:** There are 2 possible reasons for this:
WARNING:hf-to-gguf:**...
```