
[Bug]: deepseek_v2 236B on 8XA100 wrong output vllm==0.5.4

Open shuailong616 opened this issue 1 year ago • 5 comments

Your current environment

wrong output:

```
Prompt: 'Funniest joke ever:', generated text: '!!!!!!!!!!!!!!!!!!'
Prompt: 'The capital of France is:', generated text: '!!!!!!!!!!!!!!!!!!'
Prompt: 'The future of AI is:', generated text: '!!!!!!!!!!!!!!!!!!'
```

🐛 Describe the bug

```python
from vllm import LLM, SamplingParams
import argparse
import torch


def generate(args, prompts):
    sampling_params = SamplingParams(temperature=0.8, top_k=1, max_tokens=20)
    llm = LLM(model=args.model_path,
              trust_remote_code=True,
              max_model_len=2048,
              worker_use_ray=True,
              enforce_eager=True,
              dtype=torch.half,
              tensor_parallel_size=8,
              enable_chunked_prefill=False)
    outputs = llm.generate(prompts, sampling_params)
    for output in outputs:
        prompt = output.prompt
        generated_text = output.outputs[0].text
        print(f"Prompt: {prompt!r}, generated text: {generated_text!r}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--model_path', type=str)
    args = parser.parse_args()
    prompts = [
        "Funniest joke ever:",
        "The capital of France is:",
        "The future of AI is:",
    ]
    generate(args, prompts)
```

Before submitting a new issue...

  • [X] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

shuailong616 avatar Sep 09 '24 04:09 shuailong616

(screenshot attachment, not transcribed)

shuailong616 avatar Sep 10 '24 01:09 shuailong616

@shuailong616 Hi! Can you try swapping the dtype=torch.half to dtype=torch.bfloat16?

dsikka avatar Sep 11 '24 23:09 dsikka
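For reference, the suggested change applied to the repro script above is a one-argument swap (a sketch only; the `worker_use_ray` and `tensor_parallel_size=8` settings follow the original repro and require a multi-GPU host with vLLM installed, and the model path is a placeholder):

```python
from vllm import LLM
import torch

# Same call as the repro script, with only the dtype changed.
# bfloat16 keeps float32's exponent range, avoiding fp16 overflow.
llm = LLM(model="/path/to/DeepSeek-V2",  # placeholder path
          trust_remote_code=True,
          max_model_len=2048,
          worker_use_ray=True,
          enforce_eager=True,
          dtype=torch.bfloat16,          # was: torch.half
          tensor_parallel_size=8,
          enable_chunked_prefill=False)
```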

> @shuailong616 Hi! Can you try swapping the dtype=torch.half to dtype=torch.bfloat16?

Thank you so much for your reply. Using torch.bfloat16 gives a correct result.

shuailong616 avatar Sep 12 '24 01:09 shuailong616

(screenshot attachment, not transcribed)

shuailong616 avatar Sep 12 '24 01:09 shuailong616

> @shuailong616 Hi! Can you try swapping the dtype=torch.half to dtype=torch.bfloat16?

Hi! I would also like to know why torch.half produces wrong results. Looking forward to your reply.

shuailong616 avatar Sep 12 '24 02:09 shuailong616

@shuailong616 Of course, you need to match the dtype of the original model; models can be sensitive to dtype.

youkaichao avatar Sep 20 '24 17:09 youkaichao
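The dtype sensitivity above can be illustrated with a minimal, runnable sketch (assumes only PyTorch): float16 has a much smaller dynamic range than bfloat16 (max normal value 65504 vs. roughly 3.4e38), so activations that a bfloat16-trained model produces routinely can overflow to inf under float16, which then propagates to NaN and garbage tokens like the `!!!` output in the report.

```python
import torch

# A value well within bfloat16's range but beyond float16's max (65504).
x = torch.tensor(70000.0)

print(x.to(torch.bfloat16))  # stays finite (rounded to the nearest bf16 value)
print(x.to(torch.half))      # overflows to inf

# The ranges themselves:
print(torch.finfo(torch.half).max)      # 65504.0
print(torch.finfo(torch.bfloat16).max)  # ~3.39e38
```

bfloat16 trades mantissa precision for float32's 8-bit exponent, which is why models trained in bfloat16 often cannot be served in float16 without overflow.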