Zhangchi Feng
Hello, did you solve this problem? I tried `export TORCH_CUDA_ARCH_LIST="compute capability"` but it doesn't work either.
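If it helps, a quick way to find the value that `TORCH_CUDA_ARCH_LIST` expects is to ask PyTorch for the GPU's compute capability. A minimal sketch, assuming PyTorch is installed and a CUDA device is visible:

```python
# Minimal sketch: print the compute capability string that
# TORCH_CUDA_ARCH_LIST expects (e.g. "8.0" for an A100).
import torch

major, minor = torch.cuda.get_device_capability(0)  # device index 0 assumed
print(f'export TORCH_CUDA_ARCH_LIST="{major}.{minor}"')
```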
This API is the serving endpoint for inference, not for fine-tuning. If you want to fine-tune, one example can be found at examples/lora_single_gpu, among other examples. In the README,...
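For the inference side, here is a minimal sketch of calling such a serving endpoint, assuming it exposes an OpenAI-compatible API; the URL, port, API key, and model name below are placeholders, not values from this thread:

```python
# Hedged sketch: query an assumed OpenAI-compatible serving endpoint for inference.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="none")  # assumed local endpoint
resp = client.chat.completions.create(
    model="my-model",  # hypothetical model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```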
Hello, thank you for your great work! I have applied for the COCO-CN dataset but received no reply. My email: [email protected]
The model directory is as follows
But this code seems to send requests to the Hugging Face website:

```python
from vllm import LLM, SamplingParams

prompts = [
    "你好,请介绍一下你自己",  # "Hello, please introduce yourself"
    "中国的定义是什么?",      # "What is the definition of China?"
]
sampling_params = SamplingParams(temperature=0.01, top_p=0.01)
llm = LLM(model="/path/to/llama")
outputs = llm.generate(prompts, sampling_params)  # generation call added to complete the snippet
```
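One way to avoid the network lookup when loading from a local path is to put the Hugging Face libraries into offline mode before vLLM imports them. A sketch, assuming the model files are complete on disk; `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` are standard Hugging Face environment variables, not vLLM-specific options:

```python
# Sketch: force Hugging Face libraries into offline mode before importing vllm,
# so loading "/path/to/llama" resolves only against local files.
import os
os.environ["HF_HUB_OFFLINE"] = "1"        # huggingface_hub: skip network calls
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # transformers: use the local cache only

from vllm import LLM
llm = LLM(model="/path/to/llama")  # same local path as above
```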
I used **Build from source** and it ran successfully. Thanks! But **Install with pip** seems to have some problems.
[https://vllm.readthedocs.io/en/latest/getting_started/installation.html](https://vllm.readthedocs.io/en/latest/getting_started/installation.html) Just use the **Build from source** instructions on this page and run as above; if it still reports an error, set **tokenizer_mode='slow'**.
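For reference, `tokenizer_mode` is a constructor argument of `vllm.LLM`. A minimal sketch of passing it, with the model path as a placeholder:

```python
from vllm import LLM

# "slow" falls back to the pure-Python tokenizer instead of the fast one.
llm = LLM(model="/path/to/llama", tokenizer_mode="slow")
```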
> Indeed. I built from source: https://github.com/vllm-project/vllm/releases/tag/v0.1.1, and this problem was solved.

Yes, this version works.