Shawn Zhao
### OpenVINO Version 2024.0/2024.1 ### Operating System Other (Please specify in description) ### Device used for inference GPU ### Framework None ### Model used _No response_ ### Issue description Platform...
### OpenVINO Version 2024.0/2024.1 ### Operating System Ubuntu 20.04 (LTS) ### Device used for inference CPU ### Framework None ### Model used _No response_ ### Issue description Cannot convert...
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) Cell In[2], line 2 1 # Load the model ----> 2 model = AutoAWQForCausalLM.from_pretrained(model_name_or_path, trust_remote_code=True) 3 tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True) File ~/miniforge3/envs/peft/lib/python3.10/site-packages/awq/models/auto.py:55, in AutoAWQForCausalLM.from_pretrained(self, model_path,...
### System Info Which tool can be used to quantize the ChatGLM3-6B model to INT4? ### Who can help? _No response_ ### Information - [ ] The official example scripts -...
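Since the question above asks about INT4-quantizing ChatGLM3-6B, here is a minimal numpy sketch of the symmetric INT4 weight-quantization math that tools like NNCF, AutoAWQ, and AutoGPTQ build on. This is an illustrative per-tensor scheme, not the API of any of those libraries; real tools quantize per-group/per-channel and pack two 4-bit values per byte.

```python
import numpy as np

def quantize_int4_sym(w: np.ndarray):
    """Symmetric per-tensor INT4 quantization: map floats to integers in [-8, 7]."""
    scale = np.max(np.abs(w)) / 7.0  # signed 4-bit range is [-8, 7]; use 7 so scale is symmetric
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights from the quantized integers."""
    return q.astype(np.float32) * scale

# Toy weight tensor; a real model quantizes each linear layer's weight matrix.
w = np.array([0.5, -1.0, 0.25, 0.9], dtype=np.float32)
q, s = quantize_int4_sym(w)
w_hat = dequantize(q, s)
```

The reconstruction error per element is bounded by half the scale step, which is why per-group scales (e.g. group size 128) recover accuracy that a single per-tensor scale loses.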
Qwen2.5-VL-3B-Instruct cannot run inference on an image. HW: 1x Intel ARC770. SW: Ubuntu 22.04. Docker image: intelanalytics/ipex-llm-serving-xpu 0.8.3-b21. Model: Qwen2.5-VL-3B-Instruct. Precision: fp8. Steps to reproduce the error: 1. Run the command to...
Image: intelanalytics/ipex-llm-serving-xpu:0.8.3-b19 or intelanalytics/ipex-llm-serving-xpu:0.8.3-b21. Model: DeepSeek-R1-Distill-Qwen-14B, FP16. Tool: Lighteval. Dataset: AIME24. The test cannot complete: across three runs it hung with no response at 73%, 75%, and 77% respectively. **How to reproduce** Steps to reproduce the error: 1. ... 2. ......
Image: intelanalytics/ipex-llm-serving-xpu:0.8.3-b21. Model: DeepSeek-R1-Distill-Qwen-32B. Precision: FP8 or FP16. Tool: lm-evaluation-harness. Dataset: MMLU. Issue: when evaluating the accuracy of the DS-32B INT4 model with Harness, the run fails; after a while it reports OOM.
Image: intelanalytics/ipex-llm-serving-xpu:0.8.3-b19 or intelanalytics/ipex-llm-serving-xpu:0.8.3-b21. Model: DeepSeek-R1-Distill-Qwen-32B SYM_INT4. Tool: Lighteval. Dataset: MMLU. The benchmarked accuracy is abnormally low at only 27.67%, whereas the same DeepSeek-R1-Distill-Qwen-32B INT4 model benchmarks at 78.82% on an NVIDIA A100. (WrapperWithLoadBit pid=10769) 2025:06:13-12:30:17:(10769) |CCL_WARN| device_family is unknown, topology discovery could be incorrect, it might...