liulfy comments

Results: 5 comments by liulfy

how to connect to the single server database with pyarango

thanks!

ray OOM in tensor parallel

@WoosukKwon Thank you for answering my question! I tried swap_space, but the problem has not been solved. My code is here: from vllm import LLM model_path = 'yahma/llama-13b-hf' llama_model...

ray OOM in tensor parallel

> > Me too. Maybe the Ray memory monitor detected memory usage incorrectly? I found that a lot of memory was occupied by the system buffer/cache, and Ray...

How to load a local model file?

I also have the same problem.

How to load a local model file?

Indeed. I built from source (https://github.com/vllm-project/vllm/releases/tag/v0.1.1), and this solved the problem.