
Qwen3 is a large language model series developed by the Qwen team at Alibaba Cloud.

450 Qwen3 issues

You updated the weights of `Qwen1.5-14B-Chat-GPTQ-Int4` and changed `intermediate_size` from 14436 to 14336 about 12 days ago. It seems the Int4 version is not quantized directly from `Qwen1.5-14B-Chat`...

```shell
torchrun $DISTRIBUTED_ARGS finetune.py \
    --model_name_or_path $MODEL \
    --data_path $DATA \
```
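The `$DISTRIBUTED_ARGS`, `$MODEL`, and `$DATA` variables are elided in the excerpt above. As a point of reference, a minimal single-node sketch of how such `torchrun` arguments are typically assembled (the values here are assumptions, not the issue author's actual settings):

```shell
# Hypothetical single-node settings; adjust GPU count, address, and port for your cluster.
GPUS_PER_NODE=1
MASTER_ADDR=localhost
MASTER_PORT=6001

DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE --nnodes 1 --node_rank 0 --master_addr $MASTER_ADDR --master_port $MASTER_PORT"

# The assembled launch command (model/data paths left as placeholders).
echo "torchrun $DISTRIBUTED_ARGS finetune.py ..."
```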

```python
model_path = "./model/qwen1_5-1_8b"
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)
prompt = '苹果是什么颜色'  # "What color are apples?"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content":...
```
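The snippet is cut off before the chat template is applied. Qwen1.5 chat models use the ChatML prompt format; the following dependency-free sketch illustrates roughly what `tokenizer.apply_chat_template(..., add_generation_prompt=True)` renders (the `render_chatml` function is a hand-written illustration, not the tokenizer's own code):

```python
def render_chatml(messages, add_generation_prompt=True):
    """Rough illustration of the ChatML layout used by Qwen1.5 chat models."""
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What color are apples?"},
]
print(render_chatml(messages))
```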

Re #269 and #264: the new 32B model outputs "!!!!!!" tokens when deployed on vLLM. However, a slight tweak to the system prompt seems to address the issue. See below: ##...

input:
```json
{
  "model": "qwen1.5-72b-chat",
  "temperature": 0,
  "maxTokens": 8000,
  "stream": "false",
  "messages": [
    {
      "role": "system",
      "content": "Translate everything into Simplified Chinese. Please only include the translation result."
    },
    {"role": "user", ...
```
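Note that `"stream": "false"` in the request above is a JSON string, which an OpenAI-compatible server may not treat as the boolean `false`. A small sketch of the same request body with native JSON types (`max_tokens` in snake_case follows the common OpenAI-style convention; whether this resolves the reported issue is an assumption):

```python
import json

payload = {
    "model": "qwen1.5-72b-chat",
    "temperature": 0,
    "max_tokens": 8000,  # snake_case is the usual OpenAI-style field name
    "stream": False,     # a JSON boolean, not the string "false"
    "messages": [
        {
            "role": "system",
            "content": "Translate everything into Simplified Chinese. "
                       "Please only include the translation result.",
        },
        {"role": "user", "content": "Hello"},
    ],
}

# Serialize to the JSON body that would be POSTed to the chat endpoint.
body = json.dumps(payload)
print(body)
```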

### Is there an existing issue / discussion for this?

- [X] I have searched the existing issues / discussions

### Is this question answered in the FAQ?...

vllm version: 0.4.0.post1

code:
```python
from vllm import LLM
import os

os.environ["VLLM_USE_MODELSCOPE"] = "True"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"
llm = LLM(
    model="qwen/Qwen1.5-32B-Chat-GPTQ-Int4",
    trust_remote_code=True,
    gpu_memory_utilization=0.6,
)
output = llm.generate(
    "system\n"
    "You are ...
```

When the input is relatively short (roughly under 50 tokens), the logprob becomes NaN and the output token becomes 0, which shows up as an output of '!!!!...' that never stops until the length limit is reached. Long inputs work normally, and the AWQ version works normally.
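The reported symptom is self-consistent: when every logit is NaN, greedy decoding falls through to index 0, which matches "the output token becomes 0". The NumPy snippet below demonstrates the argmax-on-NaN behavior with a toy logit vector (the vector itself is simulated, not taken from the model):

```python
import numpy as np

# Simulated logits for a tiny vocabulary: all NaN, as reported for short inputs.
logits = np.full(8, np.nan)

# NaN comparisons are always False, so np.argmax returns the first index, 0.
token_id = int(np.argmax(logits))
print(token_id)  # 0 -> decoded repeatedly, producing the '!!!!...' output
```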

How can I run batch inference?
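For reference, vLLM's `LLM.generate` accepts a list of prompts, so batch inference is a matter of passing them all at once; with plain `transformers`, you tokenize the list with padding and call `generate` on the whole batch. A dependency-free sketch of the chunking pattern (here `fake_generate` is a stand-in for the real model call, e.g. `llm.generate(batch)`):

```python
def chunked(items, batch_size):
    """Yield successive fixed-size batches from a list of prompts."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def fake_generate(batch):
    # Stand-in for a real batched call such as llm.generate(batch).
    return [f"reply to: {p}" for p in batch]

prompts = [f"question {i}" for i in range(5)]
outputs = []
for batch in chunked(prompts, batch_size=2):
    outputs.extend(fake_generate(batch))
print(len(outputs))  # 5
```

Keeping a modest `batch_size` bounds peak memory; vLLM handles this scheduling internally, so with vLLM the explicit chunking loop is usually unnecessary.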