nullname comments

Results: 104 comments by nullname

llama-cli on Hexagon-NPU introducing a lot of extra time

As for the `graph_compute cost`, I think we can get a breakdown from the profiler output captured from logcat

llama-cli on Hexagon-NPU introducing a lot of extra time

> what other inference frameworks support Qualcomm NPU deployment?

You can check out [MLLM](https://github.com/UbiquitousLearning/mllm) - they have their own model format and NPU support.

> Does Qualcomm officially use QNN-SDK and...

llama-cli on Hexagon-NPU introducing a lot of extra time

Hey @finneyyan, it’s been a while! We’ve made several improvements to the hexagon-npu backend and are seeing significant performance gains. When you have a moment, could you test your...

Feature Request: How to support inference for large vision models

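Hi

> Is scheduling large vision models on the NPU currently supported?

This backend implements a subset of ops on the Hexagon NPU, so any op that is supported will run on the NPU.

> I don't seem to see any performance test data in the docs or anywhere else

At this stage there are op-level tests; you can refer to the comment I posted in this discussion: https://github.com/ggml-org/llama.cpp/discussions/8273#discussioncomment-13274821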