wizd

Results: 37 comments by wizd

Sorry, this is a bug in WSL2. After migrating to Ubuntu, everything works. Thank you for such a powerful inference engine!

a temp workaround, in /extensions/openai/completions.py, line 78:

```diff
 def process_parameters(body, is_legacy=False):
+    if 'temperature' in body and body['temperature'] == 0:
+        body['temperature'] = 0.1
```

sorry, this workaround is not working,...

My latest tests for openbuddy-mixtral-8x7b-v15.1-2.4bpw-h6-exl2-2: this one works:

```shell
curl -X POST \
  http://localhost:5000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"openbuddy-mixtral-8x7b-v15.1-2.4bpw-h6-exl2-2","temperature":0.1,"messages":[{"role":"user","content":"请翻译为简体中文(避免解释原文):\n\n Hello world"}]}'
```

(The prompt asks: "Please translate into Simplified Chinese (avoid explaining the original text): Hello world".) this one gets long gibberish outputs:...
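The same request can be issued from Python instead of curl. This is a client-side sketch assuming the same local text-generation-webui endpoint as above; the payload mirrors the working curl command (with a plain English prompt for brevity):

```python
import json
import urllib.request

# Same request as the working curl example above: note the
# non-zero temperature, which sidesteps the temperature=0 bug.
payload = {
    "model": "openbuddy-mixtral-8x7b-v15.1-2.4bpw-h6-exl2-2",
    "temperature": 0.1,
    "messages": [{"role": "user", "content": "Hello world"}],
}

req = urllib.request.Request(
    "http://localhost:5000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# resp = urllib.request.urlopen(req)  # uncomment with the server running
```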

upgraded the model to openbuddy-mixtral-8x7b-v15.2-2.4bpw-h6-exl2-2, and with the temperature hack, finally all OpenAI API clients work:

```python
def process_parameters(body, is_legacy=False):
    if 'temperature' in body and body['temperature'] == 0:
        print("set temperature...
```
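The snippet above is cut off; my reconstruction of what the full hack presumably looks like (a sketch, not the exact patch) is:

```python
def process_parameters(body, is_legacy=False):
    # Some OpenAI clients send temperature=0 for deterministic output,
    # which triggers gibberish generation here; clamp it to a small
    # positive value instead.
    if 'temperature' in body and body['temperature'] == 0:
        print("set temperature from 0 to 0.1")
        body['temperature'] = 0.1
    return body
```

The clamp mutates the request body in place before it reaches the sampler, so every client is covered without per-client changes.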

M1 with 7B model: 94.24 ms per token. M1 with 13B model: 202.18 ms per token. Measured with the command-line option -t 4; with -t 8, speed is halved.
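For readers who think in throughput rather than latency, ms-per-token converts to tokens-per-second like this:

```python
def tokens_per_second(ms_per_token: float) -> float:
    """Convert a per-token latency in milliseconds to throughput."""
    return 1000.0 / ms_per_token

print(round(tokens_per_second(94.24), 1))   # 7B:  10.6 tokens/s
print(round(tokens_per_second(202.18), 1))  # 13B:  4.9 tokens/s
```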

I got this error with ollama/ollama:0.1.22-rocm and dolphin-mixtral:8x7b-v2.6.1-q3_K_M

It appears to be a comprehensive open-source alternative to OpenAI's function call. I will attempt to integrate TypeChat with this repository (https://github.com/JohannLai/openai-function-calling-tools) in order to make the LLAMA 2 model...
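For context, the OpenAI-style function-call interface such tools emulate looks roughly like the payload below. This is an illustrative sketch, not code from either repository; the `get_weather` function is a hypothetical example:

```python
import json

# An OpenAI-style tool/function definition that a local LLAMA 2
# backend would need to accept to act as a drop-in replacement.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example function
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# The model is expected to reply with the function name plus
# JSON arguments matching the declared parameter schema.
print(json.dumps(tools[0]["function"]["name"]))
```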

> Nice find!

Due to the constantly changing encoding history of CJK (Chinese, Japanese, Korean) text, there is a good chance the trained model learned the wrong encoding for non-ASCII languages. Simply...

Some more tests show that we can't simply remove the unprintable tokens. There should be some way to recover the right encoding for them; otherwise the generated text becomes unreadable.
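One plausible fix, sketched below under the assumption that the unprintable tokens are partial UTF-8 byte sequences (a multi-byte character split across tokens): buffer the raw bytes with an incremental decoder instead of dropping them, and only emit text once a complete character has arrived.

```python
import codecs

def stream_decode(token_byte_chunks):
    """Decode a stream of raw token bytes, buffering incomplete
    multi-byte UTF-8 sequences instead of discarding them."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    out = []
    for chunk in token_byte_chunks:
        # Incomplete sequences stay buffered inside the decoder and
        # are emitted once the remaining bytes arrive.
        out.append(decoder.decode(chunk))
    out.append(decoder.decode(b"", final=True))
    return "".join(out)

# "中" is e4 b8 ad in UTF-8; here it arrives split across two tokens:
print(stream_decode([b"\xe4\xb8", b"\xad"]))  # prints 中
```

Dropping the first chunk as "unprintable" would lose the character entirely, which matches the unreadable output described above.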

Thank you. I built your

> I'm not sure you have applied the code change. I cannot try your prompt since it's an image; mind pasting it? But I think you...