3 issues by lxnlxnlxnlxnlxn

# LightLLM run-through: reproducing the kvoff branch

##### Step 1: create the Docker container

Pull the image: `docker pull ghcr.io/modeltc/lightllm:main`

The llama-7b model is large, and cloning it directly inside the server's Docker container kept failing with network interruptions, so I downloaded the model to my local machine, transferred it to the server via Xftp, and then mapped the model folder into the lightllm source's `models` folder when creating the container.

Model repository: [huggyllama/llama-7b · Hugging Face](https://huggingface.co/huggyllama/llama-7b)

```
docker run -itd --ipc=host --net=host --name lxn_lightllm --gpus all -p 8080:8080 -v /hdd/lxn/llama-7b:/lightllm/lightllm/models/llama-7b ghcr.io/modeltc/lightllm:main /bin/bash
```
...

bug

The line-migration process is designed to run over multiple rounds. Say that, in the first round, tokens 1 to 10 of the request are sent from the source node to the...
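The multi-round chunking described above can be sketched as follows. This is an illustrative helper, not code from the lightllm repository; `migration_rounds` and its parameters are hypothetical names, and the 1-based indexing simply mirrors the "tokens 1 to 10" example in the issue.

```python
def migration_rounds(num_tokens, tokens_per_round):
    """Split a request's tokens into per-round migration batches.

    Each round sends one contiguous slice of token indices (1-based,
    inclusive) from the source node to the target node.
    """
    rounds = []
    start = 1
    while start <= num_tokens:
        end = min(start + tokens_per_round - 1, num_tokens)
        rounds.append((start, end))
        start = end + 1
    return rounds

# A 25-token request migrated 10 tokens per round:
# round 1 sends tokens 1-10, round 2 sends 11-20, round 3 sends 21-25.
print(migration_rounds(25, 10))
```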

Question

I am working on KV cache offloading and recomputation. I was wondering whether DeepSpeedMII has implemented offloading or recomputation techniques; if so, where is the API?
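For context, the two techniques the question contrasts can be sketched in a toy form. This is not a DeepSpeedMII API; the class, its dict-based "device" and "host" stores, and `recompute_fn` are all hypothetical stand-ins for illustrating the trade-off: offloading moves a request's KV entries to slower memory and restores them later, while recomputation drops them and rebuilds them from the saved prompt tokens.

```python
class KVCacheSketch:
    """Toy model of KV cache offloading vs. recomputation (hypothetical)."""

    def __init__(self, recompute_fn):
        self.gpu = {}      # request_id -> kv entries ("device" memory)
        self.cpu = {}      # request_id -> kv entries offloaded to "host"
        self.tokens = {}   # request_id -> prompt tokens, kept for recompute
        self.recompute_fn = recompute_fn

    def put(self, rid, tokens):
        self.tokens[rid] = tokens
        self.gpu[rid] = self.recompute_fn(tokens)

    def offload(self, rid):
        # Move entries to host memory, freeing the device slot.
        self.cpu[rid] = self.gpu.pop(rid)

    def evict(self, rid):
        # Drop entries entirely; they must be recomputed on next access.
        self.gpu.pop(rid, None)

    def get(self, rid):
        if rid not in self.gpu:
            if rid in self.cpu:
                # Restore offloaded entries (fast path: just a copy back).
                self.gpu[rid] = self.cpu.pop(rid)
            else:
                # Recompute from the saved tokens (slow path: extra FLOPs).
                self.gpu[rid] = self.recompute_fn(self.tokens[rid])
        return self.gpu[rid]
```

Usage: after `offload("r1")` a `get("r1")` copies the entries back from the host store, whereas after `evict("r1")` the same call has to rerun `recompute_fn`; the choice trades PCIe transfer cost against recomputation FLOPs.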