lightllm icon indicating copy to clipboard operation
lightllm copied to clipboard

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Results 125 lightllm issues
Sort by recently updated
recently updated
newest added

Where does lightllm_ppl_int8kv_flashdecoding_kernel locate in ?

bug

有不通过http的其他推理入口吗

bug

Just as the title says. https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5

bug

Link: https://huggingface.co/microsoft/Florence-2-large

bug

https://github.com/ModelTC/lightllm/blob/main/lightllm/common/basemodel/triton_kernel/dequantize_gemm_int4.py I notice there are several param candidates to tune for this kernel. I wonder how do you find these values? are they suitable for specific hardware or universal? Do...

python -m lightllm.server.api_server --model_dir /root/autodl-tmp/Qwen2-7B-Instruct --host 0.0.0.0 --port 8000 --trust_remote_code --model_name Qwen2-7B-Instruct --data_type=bfloat16 --eos_id 151643 --tokenizer_mode fast ,通过以上命令启动服务正常,但是发送openai格式的请求后,报以下输入类型错误。请问如何解决 ![image](https://github.com/user-attachments/assets/f829a1a5-28db-4725-8bef-9ead6f82086b)