Siyang Shao
> > > Maybe we could promote this repo? For example, post it on the confession wall... That's really wishful thinking... Most people on the confession wall aren't from the CS department and have no interest in learning git, and we aren't just after freeloaders either. If we want visibility, it's simpler to go through SHUOSC and build student-oriented tutorials.
Hi, in the `vllm_backend` implementation, since we enable vLLM's automatic prefix caching, the shared state (tokens 1 to 10) will not be recalculated. See: https://docs.vllm.ai/en/latest/automatic_prefix_caching/apc.html
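A minimal sketch of what this looks like on the client side, assuming a recent vLLM release; the model name and prompts below are placeholders, not from this thread:

```python
# Enable vLLM's automatic prefix caching (APC): KV-cache blocks for a shared
# prompt prefix are reused across requests instead of being recomputed.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m", enable_prefix_caching=True)

shared_prefix = "You are a helpful assistant. Answer concisely.\n"
prompts = [
    shared_prefix + "Q: What is KV caching?",
    shared_prefix + "Q: What is paged attention?",
]

# The second request reuses the cached KV blocks of the shared prefix.
outputs = llm.generate(prompts, SamplingParams(max_tokens=32))
```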
[BUG] unable to convert the HuggingFace model to the ServerlessLLM format based on the documentation
Hi, I think the reason is that the released `serverless_llm_store` is too old; in that version the vLLM integration hadn't been done yet. The patch file is only for vLLM's...
[BUG] unable to convert the HuggingFace model to the ServerlessLLM format based on the documentation
Hi, I noticed that you seem to be running `pip install .` in our project folder, installing `serverless_llm` from source. But actually, the checkpoint loader (aka...
[BUG] unable to convert the HuggingFace model to the ServerlessLLM format based on the documentation
Here's the correct content of `__init__.py`:
```python
from .sllm_store import load_dict, load_model, save_model, save_dict, load_dict_single_device

__all__ = ["load_model", "save_model", "load_dict", "save_dict", "load_dict_single_device"]
```
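For reference, a rough sketch of the conversion path those exports support, assuming a `save_model(model, path)` / `load_model(...)` shape; the model name, paths, and keyword arguments are illustrative, not taken from this issue:

```python
# Convert a HuggingFace checkpoint into the ServerlessLLM format and load it back.
# Exact signatures are assumptions based on the exports in __init__.py above.
from transformers import AutoModelForCausalLM
from serverless_llm_store import save_model, load_model

hf_model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b", torch_dtype="auto")

# Serialize the weights into the ServerlessLLM checkpoint layout.
save_model(hf_model, "./models/facebook/opt-1.3b")

# Later, load the converted checkpoint from the storage path (arguments assumed).
model = load_model("facebook/opt-1.3b", storage_path="./models/")
```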
[BUG] unable to convert the HuggingFace model to the ServerlessLLM format based on the documentation
I think this error usually means the previous build cache wasn't cleaned successfully, so pip was pointed at the wrong directory. `pip install .` on a machine...
Current bugs: - [x] `hipIpcCloseMemHandle` returns successfully but does not release the shared memory
> Current bugs: > > * [ ] `hipIpcCloseMemHandle` returns successfully but does not release the shared memory This seems to be a HIP problem. Created a specific issue here: https://github.com/ROCm/HIP/issues/3580
> > Current bugs: > > > > * [ ] `hipIpcCloseMemHandle` returns successfully but does not release the shared memory > > This seems to be a HIP problem. Created...
Everything works fine after updating ROCm to 6.2.0, and the GPU memory is now released successfully.