cccclai
Thank you for the detailed documentation and for getting it working on your own. I actually have some pending PRs that I didn't manage to land: https://github.com/pytorch/executorch/pull/15258. @haowhsu-quic @kirklandsign can we...
Yeah, we should do a better job of updating the docs and providing a better user experience. @luffy-yu would you be willing to create a PR to update our readme?...
I think having them on the website would be awesome: https://docs.pytorch.org/executorch/stable/backends-qualcomm.html
The website is generated from a markdown file in the executorch codebase: https://github.com/pytorch/executorch/blob/4d366239f473dce680fba477eafb693942d1600d/docs/source/backends-qualcomm.md?plain=1#L4
Any specific reason that ai-benchmark and mlcommons were picked as references for the model list? Just curious, because there are lists from other sources.
Hi, since it's a really big change and the MHA2SHA pass seems complicated, can you add a test for the pass here: https://github.com/pytorch/executorch/blob/main/backends/qualcomm/tests/test_passes.py? Passes can be fragile, so I'm trying to...
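As a side note on what such a pass test usually checks: run the transformation, then assert the rewritten graph is numerically identical to the original. The sketch below is purely illustrative — a toy rewrite of `x * 2` into `x + x` stands in for the real MHA2SHA pass, and none of these names come from the ExecuTorch test file:

```python
import unittest
import numpy as np

# Illustrative pattern only: the real test would live in
# backends/qualcomm/tests/test_passes.py and exercise the actual
# MHA2SHA pass on an exported program. Here a toy "pass" that rewrites
# x * 2 into x + x stands in for the transformation.

def original_fn(x):
    # Graph before the hypothetical pass.
    return x * 2.0

def transformed_fn(x):
    # Graph after the hypothetical pass rewrote the multiply.
    return x + x

class TestToyPass(unittest.TestCase):
    def test_numerics_unchanged(self):
        # Core invariant: the pass must not change the computed values.
        x = np.random.rand(4, 4).astype(np.float32)
        np.testing.assert_allclose(original_fn(x), transformed_fn(x), rtol=1e-6)
```

The same before/after numerical comparison is what keeps a structural rewrite like MHA-to-SHA from silently changing model outputs.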
> Hi @cccclai , I have rebased. Can I get a review on this PR?

Yes, sorry. I'll try to do it tomorrow.
> Combined the n_heads key-value cache into a single cache for each layer to decrease the number of inputs and outputs, which enhances performance.

I feel like I still don't...
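For context on what a combined cache might look like, here is a minimal sketch; the shapes, names, and update scheme are my assumptions, not the PR's actual code:

```python
import numpy as np

# Assumption-laden sketch, not the ExecuTorch implementation. The idea:
# instead of n_heads separate K/V cache tensors per layer (which costs
# 2 * n_layers * n_heads model inputs/outputs), keep one combined tensor
# per layer (2 * n_layers inputs/outputs).

n_heads, head_dim, max_seq = 4, 8, 16

# Per-head layout: n_heads separate (max_seq, head_dim) buffers.
per_head_k = [np.zeros((max_seq, head_dim), dtype=np.float32)
              for _ in range(n_heads)]

# Combined layout: a single (n_heads, max_seq, head_dim) buffer per layer.
combined_k = np.stack(per_head_k)

def update_combined(cache, pos, new_k):
    """Write this step's keys for all heads at `pos` with one slice write."""
    cache[:, pos, :] = new_k  # new_k has shape (n_heads, head_dim)
    return cache

step_k = np.ones((n_heads, head_dim), dtype=np.float32)
combined_k = update_combined(combined_k, 0, step_k)
```

Fewer I/O tensors means fewer buffers the runtime has to bind per inference step, which is presumably where the claimed performance win comes from.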
It seems like the SmartMask and ShiftPointer descriptions are no longer accurate after this PR; can you update https://github.com/pytorch/executorch/tree/main/examples/qualcomm/oss_scripts/llama#kv-cache-update-mechanism and explain how it works?
> the combined kv cache

Can you share how the combined kv cache works here? Is it the one you mentioned that will help memory usage and improve runtime...