[BUG] Qwen-7B-Chat AttributeError: 'LlamaSplitFuseInferStateInfo' object has no attribute 'logn_values'
Issue description: AttributeError: 'LlamaSplitFuseInferStateInfo' object has no attribute 'logn_values'
Steps to reproduce:

```shell
python -m lightllm.server.api_server --model_dir /root/autodl-tmp/Qwen-7B-Chat --tp 1 --trust_remote_code --splitfuse_mode
```
Expected behavior:
The API server starts normally and serves Qwen-7B-Chat in splitfuse mode.
Error logging:

```
AttributeError: 'LlamaSplitFuseInferStateInfo' object has no attribute 'logn_values'
```
Environment:
- [ ] Using container
- OS: (Ubuntu 14.04, CentOS7)
- GPU info: nvidia-smi (e.g. NVIDIA-SMI 525.116.04, Driver Version: 525.116.04, CUDA Version: 12.0); graphics cards: 4090 x 1
- Python: (e.g. CPython 3.10)
- LightLLM: (git commit hash)
- openai-triton:

  ```
  $ pip show triton
  Name: triton
  Version: 2.1.0
  Summary: A language and compiler for custom Deep Learning operations
  Home-page: https://github.com/openai/triton/
  Author: Philippe Tillet
  Author-email: [email protected]
  License:
  Location: /root/miniconda3/lib/python3.10/site-packages
  Requires: filelock
  Required-by: lightllm, torch
  ```
Additional context:
Running Qwen-7B-Chat with lightllm in splitfuse mode does not work.
@exceedzhang splitfuse mode is still in testing, so it currently only supports LLaMA and LLaMA2.
We will try to support other model types soon.
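For illustration, the failure mode is a plain Python attribute lookup: Qwen's attention path reads a `logn_values` field that the LLaMA-specific splitfuse infer-state object never defines. The sketch below is hypothetical; the class and attribute names mirror the traceback, not lightllm's actual implementation, and the `getattr` guard is only one possible defensive workaround that falls back to no logn scaling when the field is absent.

```python
# Hypothetical minimal reproduction of the reported AttributeError.
# LlamaSplitFuseInferStateInfo stands in for lightllm's real class;
# only the attribute names are taken from the traceback.

class LlamaSplitFuseInferStateInfo:
    """Infer-state for LLaMA splitfuse; it defines no logn_values field."""
    def __init__(self):
        self.batch_size = 1  # illustrative field only


def qwen_attention_scale(infer_state):
    # Qwen applies logn attention scaling, so its forward pass reads
    # infer_state.logn_values; on the LLaMA state this raises AttributeError.
    return infer_state.logn_values


def qwen_attention_scale_guarded(infer_state, default=1.0):
    # Defensive workaround: fall back to a neutral scale (no logn scaling)
    # when the infer-state object does not carry logn_values.
    return getattr(infer_state, "logn_values", default)


state = LlamaSplitFuseInferStateInfo()
try:
    qwen_attention_scale(state)
except AttributeError as e:
    print(e)  # 'LlamaSplitFuseInferStateInfo' object has no attribute 'logn_values'

print(qwen_attention_scale_guarded(state))  # 1.0
```

A proper fix would instead add a Qwen-specific splitfuse infer-state class that populates `logn_values`, which is presumably what per-model support entails.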
Does it support Qwen-VL?
@ObliviousDonkey Qwen-VL will be supported soon.