Hamed Emine comments

Results 5 comments of


                                            Hamed Emine

llama3 instruct models need multiple `eos_token_id` to make the output stop correctly

Fixed GGUF models here: https://huggingface.co/AI-Engine/Meta-Llama-3-8B-Instruct-GGUF/tree/main

llama3 instruct models need multiple `eos_token_id` to make the output stop correctly

> > For API I had to manually insert in completions.py the fields: 'skip_special_tokens': False, 'custom_stopping_strings': '""' > > as the other side doesnt insert those fields. I think the...

Message "CUDA extension not installed" but CUDA 12 is installed on Windows

I have the same issue here

Message "CUDA extension not installed" but CUDA 12 is installed on Windows

Hello, I was able to resolve this by using "ExLlamav2_HF" as the loader instead of "GPTQ-for-LLaMa", make sure to click Save Settings so it uses that next time it launches.

How can I change web port 7860 to another port?

Edit CMD_FLAGS.txt in the root folder Add this line: --listen --listen-port=1234 (Change 1234 to the port of your choosing) Here is an example of CMD_FLAGS.txt ``` # Only used by...