markluofd

6 issues by markluofd

PaddleLite v2.11. Host environment: Ubuntu 16.04 (cross-compilation). build_linux.sh configuration: ARCH=armv7, WITH_PYTHON=ON, PY_VERSION="3.5". The final error message is shown below: ![image](https://user-images.githubusercontent.com/94596925/184828226-c0a66532-18ca-4e75-b121-1d03bbfb7ce8.png) How should this error be handled when cross-compiling with WITH_PYTHON=ON?
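For reference, the cross-build described above can be sketched as a single invocation of PaddleLite's build script. This is a hedged sketch based on the options named in the report; the exact flag spelling and script path may differ between PaddleLite versions, so check `./lite/tools/build_linux.sh --help` for your checkout.

```shell
# Hedged sketch of the armv7 cross-build with Python bindings enabled.
# Flag names assume a recent PaddleLite checkout; verify against --help.
./lite/tools/build_linux.sh \
  --arch=armv7 \
  --with_python=ON \
  --python_version=3.5
```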

### Misc discussion on performance

I am using vllm to deploy the qwen 7b chat model...

performance

Following the hints in the documentation, I confirmed that my RK3399Pro board is in PCIe mode. A global search for the npu_fw directory turned up three similar directories under /usr/share: npu_fw, npu_fw_pcie, and npu_fw_pcie_optimization. Which directory's firmware should be updated in this case? Note: the board was flashed about a year ago with the official RK3399Pro Ubuntu 18.04 image.

Are the current RK3399Pro 1.7.3 driver and C API stable? I'm planning to integrate them.

### Describe the bug

With the demo run_llama_int8.py, after setting generate_kwargs["do_sample"] to True, I got the following error. Command: python run_llama_int8.py -m ${MODEL_ID} --quantized-model-path "/workspace/saved_results/best_model.pt" --benchmark --jit --int8-bf16-mixed --num-iter 5...
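For context, the change that triggers the crash amounts to flipping the demo's generation settings from greedy decoding to sampling. A minimal sketch of such a `generate_kwargs` dict is below; the extra sampling parameters are illustrative values, not taken from the original report.

```python
# Hedged sketch: generation settings with sampling enabled, as described
# in the bug report. temperature/top_p/max_new_tokens are illustrative.
generate_kwargs = dict(
    do_sample=True,     # the setting the report says triggers the error
    temperature=0.9,
    top_p=0.95,
    max_new_tokens=32,
)
```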

CPU
Crash
LLM

Deploying the DeepSeek R1 model on 2 H800 nodes using a Kubernetes deployment. If the pod only opens the server port and the port agreed on by dist-init-addr, the worker reports an error...
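The `dist-init-addr` mentioned above suggests an SGLang-style multi-node launch. A hedged sketch of what node 0 of such a deployment might look like is below; the model path, ports, and parallelism values are assumptions for illustration, not taken from the original issue, and the flag set should be checked against your SGLang version.

```shell
# Hedged sketch: node 0 of a two-node SGLang launch.
# NODE0_IP, ports, and --tp are illustrative placeholders.
python -m sglang.launch_server \
  --model-path deepseek-ai/DeepSeek-R1 \
  --tp 16 \
  --nnodes 2 \
  --node-rank 0 \
  --dist-init-addr NODE0_IP:5000 \
  --port 30000
```

Note that besides the HTTP server port, distributed workers typically need the rendezvous port (here 5000) plus the ports the collective backend opens at runtime, which is consistent with the error the report describes when only the first two are exposed.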