lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
TODO:
- [x] dp
- [x] chunk transfer
- [ ] multimodal
See detailed usage in `start_server.sh` in the root directory.
# [Feature] Integrate ImageForLLM metadata embedding into LightLLM image generation

## Overview

Propose integrating the ImageForLLM library into the LightLLM inference framework, so that images generated by the LLM automatically carry metadata. This helps the model understand, in later turns of a conversation, what an image contains and how it was generated.

## Details

ImageForLLM is a simple but powerful library that embeds metadata into images, including generation source code, plot properties, and AI-generation information. Integrating it into LightLLM would:

1. Automatically attach metadata (model, prompt, parameters, etc.) to AI-generated images
2. Give the model a better understanding of previously generated images in multi-turn conversations
3. Improve the user experience: images come with context, with no extra steps required

## Implementation suggestion

Integration is straightforward; for AI-generated images, add the following to the image-processing pipeline:

```python
import imageforllm
imageforllm.add_ai_metadata(image_path, model, prompt, parameters)
```

## Advantages

1. Very low implementation cost...
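Independently of the ImageForLLM API (whose exact signature is not confirmed here), the same idea can be sketched with Pillow's PNG text chunks. The `ai:`-prefixed key names below are an illustrative convention, not part of any library:

```python
from PIL import Image
from PIL.PngImagePlugin import PngInfo


def embed_ai_metadata(image_path, model, prompt, parameters):
    """Write generation metadata into a PNG's text chunks (hypothetical helper)."""
    info = PngInfo()
    info.add_text("ai:model", model)
    info.add_text("ai:prompt", prompt)
    info.add_text("ai:parameters", str(parameters))
    img = Image.open(image_path)
    img.save(image_path, pnginfo=info)


def read_ai_metadata(image_path):
    """Read back any ai:* text chunks so a later turn can see the context."""
    img = Image.open(image_path)
    return {k: v for k, v in img.text.items() if k.startswith("ai:")}
```

PNG text chunks survive most file copies but are dropped by re-encoding, which is why a dedicated library may add more robust handling.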
Must be used with Radix Cache. All inputs are stored to disk right after prefill (in parallel); when a new request arrives, if disk cache len > gpu cache len && gpu...
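The tier-selection decision described above can be sketched as a prefix-length comparison. Everything here is an assumption about the intended policy, not LightLLM's actual cache API:

```python
def longest_prefix_match(cached_tokens, request_tokens):
    """Length of the shared token prefix between a cached sequence and a request."""
    n = 0
    for a, b in zip(cached_tokens, request_tokens):
        if a != b:
            break
        n += 1
    return n


def pick_cache_tier(disk_tokens, gpu_tokens, request_tokens):
    """Reuse the disk copy only when it covers a strictly longer prefix
    than what is already resident in the GPU radix cache (assumed policy)."""
    disk_len = longest_prefix_match(disk_tokens, request_tokens)
    gpu_len = longest_prefix_match(gpu_tokens, request_tokens)
    if disk_len > gpu_len:
        return "disk", disk_len
    return "gpu", gpu_len
```

The point of the comparison is to avoid paying disk-load latency when the GPU cache already holds an equally long prefix.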
Hi team, I am new to LightLLM and looking to add speculative decoding support: first n-gram, then Medusa/EAGLE. I am wondering if it is something the team is...
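For context, the n-gram variant mentioned above (often called prompt-lookup decoding) needs no draft model: it proposes draft tokens by matching the most recent n-gram against earlier occurrences in the sequence. A minimal sketch, independent of LightLLM's internals:

```python
def ngram_draft(tokens, n=2, k=4):
    """Propose up to k draft tokens by finding an earlier occurrence of the
    last n tokens and copying what followed it (prompt-lookup style)."""
    if len(tokens) < n:
        return []
    key = tuple(tokens[-n:])
    # Scan backwards, excluding the trailing n-gram itself.
    for i in range(len(tokens) - n - 1, -1, -1):
        if tuple(tokens[i:i + n]) == key:
            return tokens[i + n:i + n + k]
    return []
```

The target model then verifies the proposed tokens in one forward pass and accepts the longest matching prefix, which is where the speedup comes from.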
Use `--profiler=MODE` to enable profiling; currently supported modes are `torch_profile` and `nvtx` (use with NVIDIA Nsight Systems).
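A usage sketch for the two modes. The `--profiler` flag and mode names come from the note above; the server entry point, `--model_dir` flag, and `nsys` options shown are assumptions for illustration:

```shell
# torch_profile mode: the server records a torch.profiler trace itself
python -m lightllm.server.api_server --model_dir /path/to/model --profiler=torch_profile

# nvtx mode: wrap the launch in Nsight Systems so emitted NVTX ranges are captured
nsys profile -o lightllm_trace \
  python -m lightllm.server.api_server --model_dir /path/to/model --profiler=nvtx
```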