cccclai
> > > > Yeah, that's my understanding too. However, for 4 shards we need init(shard_1) -> destroy(shard_1) -> init(shard_2) -> destroy(shard_2) -> ...; if we do init(shard_1)...
> Spill-fill buffer sharing is an optimization that allocates a buffer shared by all the contexts of an LLM. This way, we do not need to allocate...
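To make the lifecycle question above concrete, here is a minimal sketch of the two orderings being discussed. `init`, `run`, and `destroy` are hypothetical stand-ins for the QNN context setup/teardown calls, not the actual ExecuTorch API; the point is that spill-fill sharing assumes the shared buffer outlives every context, which only the second ordering provides.

```python
# Hypothetical sketch; init/run/destroy stand in for QNN context calls.
from typing import List


def init(shard: str) -> str:
    print(f"init({shard})")
    return shard


def run(ctx: str) -> None:
    print(f"run({ctx})")


def destroy(ctx: str) -> None:
    print(f"destroy({ctx})")


def run_one_context_at_a_time(shards: List[str]) -> None:
    # init(shard_1) -> destroy(shard_1) -> init(shard_2) -> ...:
    # only one context is ever alive, so there is no window in which
    # a spill-fill buffer could be shared across contexts.
    for shard in shards:
        ctx = init(shard)
        run(ctx)
        destroy(ctx)


def run_with_overlapping_contexts(shards: List[str]) -> None:
    # init all contexts up front and destroy them together at the end;
    # a shared spill-fill buffer can then serve every context.
    ctxs = [init(s) for s in shards]
    for ctx in ctxs:
        run(ctx)
    for ctx in ctxs:
        destroy(ctx)


if __name__ == "__main__":
    run_one_context_at_a_time([f"shard_{i}" for i in range(1, 5)])
    run_with_overlapping_contexts([f"shard_{i}" for i in range(1, 5)])
```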
Hey, I probably need some help fixing a matmul validation error: it causes a graph break, but I'm not sure what the issue is. It only shows up after I...
I'm using QNN 2.23, and the matmul node metadata in the success case is:
```
{'stack_trace': ' File "/data/users/chenlai/executorch/examples/models/llama2/llama_transformer.py", line 492, in forward\n  h = layer(\n File "/home/chenlai/local/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in...
```
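For reference, this kind of metadata can be dumped for every matmul node with plain torch.export APIs, which makes it easy to diff the success and failure cases. The toy module below is illustrative, not the model from this thread:

```python
import torch


class Toy(torch.nn.Module):
    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        return torch.matmul(a, b)


ep = torch.export.export(Toy(), (torch.randn(2, 4), torch.randn(4, 3)))

# Print each matmul node's metadata (fake tensor value and stack trace)
# so the passing and failing exports can be compared side by side.
for node in ep.graph.nodes:
    if node.op == "call_function" and "matmul" in str(node.target):
        print(node.name, node.meta.get("val"), node.meta.get("stack_trace"))
```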
I feel like there is still a bit of misalignment 😅 ...but if the composite llama includes multiple .pte files, I think I get what I need now. Back to debugging the...
I set `debug=True` in the compile spec and get more logging:
```
[INFO] [Qnn ExecuTorch]: Validating Op Config aten_matmul_default_1.
[INFO] [Qnn ExecuTorch]: Validating Op Type MatMul == MatMul.
[INFO] [Qnn...
```
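For anyone reproducing this, the `debug=True` flag is set on the QNN compile spec. A hedged sketch, assuming the helper names in executorch's Qualcomm backend around this release (these utilities have moved between versions, so verify the import paths against your checkout):

```python
# Assumed import paths / helper names; verify against your executorch version.
from executorch.backends.qualcomm.serialization.qnn_compile_spec_schema import (
    QcomChipset,
)
from executorch.backends.qualcomm.utils.utils import (
    generate_htp_compiler_spec,
    generate_qnn_executorch_compiler_spec,
)

backend_options = generate_htp_compiler_spec(use_fp16=True)
compile_spec = generate_qnn_executorch_compiler_spec(
    soc_model=QcomChipset.SM8550,  # pick the SoC you are targeting
    backend_options=backend_options,
    debug=True,  # emits the per-op "Validating Op Config ..." lines above
)
```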
Hey, I may need to get https://github.com/pytorch/executorch/pull/5925 checked in first because it needs to be cherry-picked. Then I'll merge this one.
> > Hey, I may need to get #5925 checked in first because it needs to be cherry-picked. Then I'll merge this one.
>
> Sounds good! I can...
Hi, it's ready to merge now. Can you rebase? Thank you
What does the graph look like for this toy model?
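One quick way to answer that is to export and print the graph directly. A sketch using standard torch.export / torch.fx calls; `ToyModel` here is a placeholder for the actual model in question:

```python
import torch


class ToyModel(torch.nn.Module):
    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return torch.matmul(x, y)


ep = torch.export.export(ToyModel(), (torch.randn(2, 8), torch.randn(8, 2)))
print(ep.graph)               # the raw FX graph, node by node
print(ep.graph_module.code)   # the generated forward() as Python source
```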