liulfy
@WoosukKwon Thank you for answering my question! I tried setting swap_space, but the problem is still not solved. My code is here: from vllm import LLM model_path = 'yahma/llama-13b-hf' llama_model...
> > Me too. Maybe the Ray memory monitor is detecting memory usage incorrectly? I found that a lot of memory was occupied by the system buffer/cache, and Ray...
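To illustrate the buffer/cache point in the quoted comment, here is a rough sketch of how the two ways of counting "used" memory can diverge. This is not Ray's actual monitoring logic; `parse_meminfo`, `used_fraction`, and the sample numbers are made up for illustration, and the only assumption is the usual Linux `/proc/meminfo` field layout.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info


def used_fraction(info, count_cache_as_used):
    """Return the fraction of memory considered 'used'.

    count_cache_as_used=True treats everything that is not free as used,
    including reclaimable buffer/cache. False relies on MemAvailable, which
    already accounts for memory the kernel can reclaim.
    """
    total = info["MemTotal"]
    if count_cache_as_used:
        used = total - info["MemFree"]
    else:
        used = total - info["MemAvailable"]
    return used / total


# Hypothetical sample values, not taken from any machine in this thread.
SAMPLE = """MemTotal:       100000 kB
MemFree:         5000 kB
MemAvailable:   60000 kB
Buffers:        10000 kB
Cached:         45000 kB
"""

info = parse_meminfo(SAMPLE)
naive = used_fraction(info, count_cache_as_used=True)
adjusted = used_fraction(info, count_cache_as_used=False)
print(f"counting cache as used: {naive:.0%}")
print(f"excluding reclaimable cache: {adjusted:.0%}")
```

On a real Linux machine you could pass `open("/proc/meminfo").read()` instead of `SAMPLE`. If the first number is far above the second, a monitor that counts cache as used would report much more pressure than there really is.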
I also have the same problem.
Indeed. I built from source (https://github.com/vllm-project/vllm/releases/tag/v0.1.1), and that solved the problem.