cccclai
Thank you for the detailed documentation and for getting it working on your own. I actually have some pending PRs that I didn't manage to land: https://github.com/pytorch/executorch/pull/15258. @haowhsu-quic @kirklandsign can we...
Yeah, we should do a better job of updating the docs and providing a better user experience. @luffy-yu would you be willing to create a PR to update our readme?...
I think having them on the website would be awesome: https://docs.pytorch.org/executorch/stable/backends-qualcomm.html
The website is generated from a markdown file in the executorch codebase: https://github.com/pytorch/executorch/blob/4d366239f473dce680fba477eafb693942d1600d/docs/source/backends-qualcomm.md?plain=1#L4
Any specific reason that ai-benchmark and mlcommons were picked as references for the model list? Just curious, because there are lists from other sources.
Hi, since it's a really big change and the MHA2SHA pass seems complicated, can you add a test for the pass here: https://github.com/pytorch/executorch/blob/main/backends/qualcomm/tests/test_passes.py? Passes can be fragile, so I'm trying to...
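As a side note on what such a pass test usually checks: run the transformation, then assert the rewritten graph is numerically identical to the original. The sketch below is purely illustrative — a toy rewrite of `x * 2` into `x + x` stands in for the real MHA2SHA pass, and none of these names come from the ExecuTorch test file:

```python
import unittest
import numpy as np

# Illustrative pattern only: the real test would live in
# backends/qualcomm/tests/test_passes.py and exercise the actual
# MHA2SHA pass on an exported program. Here a toy "pass" that rewrites
# x * 2 into x + x stands in for the transformation.

def original_fn(x):
    # Graph before the hypothetical pass.
    return x * 2.0

def transformed_fn(x):
    # Graph after the hypothetical pass rewrote the multiply.
    return x + x

class TestToyPass(unittest.TestCase):
    def test_numerics_unchanged(self):
        # Core invariant: the pass must not change the computed values.
        x = np.random.rand(4, 4).astype(np.float32)
        np.testing.assert_allclose(original_fn(x), transformed_fn(x), rtol=1e-6)
```

The same before/after numerical comparison is what keeps a structural rewrite like MHA-to-SHA from silently changing model outputs.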
> Hi @cccclai , I have rebased. Can I get a review on this PR?

Yes, sorry. I'll try to do it tomorrow.
> Combined the n_heads key-value cache into a single cache for each layer to decrease the number of inputs and outputs, which enhances performance.

I feel like I still don't...
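For context on what a combined cache might look like, here is a minimal sketch; the shapes, names, and update scheme are my assumptions, not the PR's actual code:

```python
import numpy as np

# Assumption-laden sketch, not the ExecuTorch implementation. The idea:
# instead of n_heads separate K/V cache tensors per layer (which costs
# 2 * n_layers * n_heads model inputs/outputs), keep one combined tensor
# per layer (2 * n_layers inputs/outputs).

n_heads, head_dim, max_seq = 4, 8, 16

# Per-head layout: n_heads separate (max_seq, head_dim) buffers.
per_head_k = [np.zeros((max_seq, head_dim), dtype=np.float32)
              for _ in range(n_heads)]

# Combined layout: a single (n_heads, max_seq, head_dim) buffer per layer.
combined_k = np.stack(per_head_k)

def update_combined(cache, pos, new_k):
    """Write this step's keys for all heads at `pos` with one slice write."""
    cache[:, pos, :] = new_k  # new_k has shape (n_heads, head_dim)
    return cache

step_k = np.ones((n_heads, head_dim), dtype=np.float32)
combined_k = update_combined(combined_k, 0, step_k)
```

Fewer I/O tensors means fewer buffers the runtime has to bind per inference step, which is presumably where the claimed performance win comes from.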
It seems like the SmartMask and ShiftPointer descriptions are no longer accurate after this PR; can you update https://github.com/pytorch/executorch/tree/main/examples/qualcomm/oss_scripts/llama#kv-cache-update-mechanism and explain how it works?
> the combined kv cache

Can you share how the combined kv cache works here? Is it the one you mentioned that will help memory usage and improve runtime...