cccclai
> > > > Yeah, that's my understanding too. However, for 4 shards we need init(shard_1) -> destroy(shard_1) -> init(shard_2) -> destroy(shard_2) -> ...; if we do init(shard_1)...
> Spill-fill buffer sharing is an optimization that allocates a buffer shared by all the contexts of an LLM. This way, we do not need to allocate...
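To make the lifecycle question above concrete, here is a minimal sketch of the two orderings being discussed. `init`, `run`, and `destroy` are hypothetical stand-ins for the QNN context setup/teardown calls, not the actual ExecuTorch API; the point is that spill-fill sharing assumes the shared buffer outlives every context, which only the second ordering provides.

```python
# Hypothetical sketch; init/run/destroy stand in for QNN context calls.
from typing import List


def init(shard: str) -> str:
    print(f"init({shard})")
    return shard


def run(ctx: str) -> None:
    print(f"run({ctx})")


def destroy(ctx: str) -> None:
    print(f"destroy({ctx})")


def run_one_context_at_a_time(shards: List[str]) -> None:
    # init(shard_1) -> destroy(shard_1) -> init(shard_2) -> ...:
    # only one context is ever alive, so there is no window in which
    # a spill-fill buffer could be shared across contexts.
    for shard in shards:
        ctx = init(shard)
        run(ctx)
        destroy(ctx)


def run_with_overlapping_contexts(shards: List[str]) -> None:
    # init all contexts up front and destroy them together at the end;
    # a shared spill-fill buffer can then serve every context.
    ctxs = [init(s) for s in shards]
    for ctx in ctxs:
        run(ctx)
    for ctx in ctxs:
        destroy(ctx)


if __name__ == "__main__":
    run_one_context_at_a_time([f"shard_{i}" for i in range(1, 5)])
    run_with_overlapping_contexts([f"shard_{i}" for i in range(1, 5)])
```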
Hey, I probably need some help fixing a matmul validation error: it causes a graph break, but I'm not sure what the issue is. It only shows up after I...
I'm using QNN 2.23, and the matmul node metadata in the success case is:
```
{'stack_trace': ' File "/data/users/chenlai/executorch/examples/models/llama2/llama_transformer.py", line 492, in forward\n  h = layer(\n File "/home/chenlai/local/miniconda3/envs/executorch/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in...
```
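For reference, this kind of metadata can be dumped for every matmul node with plain torch.export APIs, which makes it easy to diff the success and failure cases. The toy module below is illustrative, not the model from this thread:

```python
import torch


class Toy(torch.nn.Module):
    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        return torch.matmul(a, b)


ep = torch.export.export(Toy(), (torch.randn(2, 4), torch.randn(4, 3)))

# Print each matmul node's metadata (fake tensor value and stack trace)
# so the passing and failing exports can be compared side by side.
for node in ep.graph.nodes:
    if node.op == "call_function" and "matmul" in str(node.target):
        print(node.name, node.meta.get("val"), node.meta.get("stack_trace"))
```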
I feel like there is still a bit of misalignment 😅 ...but if the composite llama includes multiple .pte files, I think I get what I need now. Back to debugging the...
I set `debug=True` in the compile spec and get more logging:
```
[INFO] [Qnn ExecuTorch]: Validating Op Config aten_matmul_default_1.
[INFO] [Qnn ExecuTorch]: Validating Op Type MatMul == MatMul.
[INFO] [Qnn...
```
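For anyone reproducing this, the `debug=True` flag is set on the QNN compile spec. A hedged sketch, assuming the helper names in executorch's Qualcomm backend around this release (these utilities have moved between versions, so verify the import paths against your checkout):

```python
# Assumed import paths / helper names; verify against your executorch version.
from executorch.backends.qualcomm.serialization.qnn_compile_spec_schema import (
    QcomChipset,
)
from executorch.backends.qualcomm.utils.utils import (
    generate_htp_compiler_spec,
    generate_qnn_executorch_compiler_spec,
)

backend_options = generate_htp_compiler_spec(use_fp16=True)
compile_spec = generate_qnn_executorch_compiler_spec(
    soc_model=QcomChipset.SM8550,  # pick the SoC you are targeting
    backend_options=backend_options,
    debug=True,  # emits the per-op "Validating Op Config ..." lines above
)
```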
Hey, I may need to get https://github.com/pytorch/executorch/pull/5925 checked in first because it needs to be cherry-picked. Then I'll merge this one.
> > Hey, I may need to get #5925 checked in first because it needs to be cherry-picked. Then I'll merge this one.
>
> Sounds good! I can...
Hi, it's ready to merge now. Can you rebase? Thank you
What does the graph look like for this toy model?
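One quick way to answer that is to export and print the graph directly. A sketch using standard torch.export / torch.fx calls; `ToyModel` here is a placeholder for the actual model in question:

```python
import torch


class ToyModel(torch.nn.Module):
    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return torch.matmul(x, y)


ep = torch.export.export(ToyModel(), (torch.randn(2, 8), torch.randn(8, 2)))
print(ep.graph)               # the raw FX graph, node by node
print(ep.graph_module.code)   # the generated forward() as Python source
```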