cccclai

Results 217 comments of cccclai

Is it after `torch.export` or `to_edge`? Mind sharing the repro script?

Not yet - I think it's similar to this issue: https://github.com/pytorch/executorch/issues/4042

Hmm, does it work at runtime? I sort of doubt it...

@leigao97 hey, just following up on this: are you still blocked on the issue?

Are you using your own JNI layer? We actually have an example Android demo app that works with the runner: https://github.com/meta-pytorch/executorch-examples/tree/main/llm/android/LlamaDemo

Maybe let's try step by step:

1. Are you able to run a simple model on your phone? https://github.com/pytorch/executorch/tree/main/examples/qualcomm#simple-examples-to-verify-the-backend-is-working
2. Are you able to run the llama model via adb?...
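The two verification steps above could look roughly like this (a hedged sketch: the device directory, `.pte` filenames, and runner binary names are assumptions based on the ExecuTorch Qualcomm examples, not exact commands from this thread):

```shell
# Sketch: verify the backend end-to-end over adb, simple model first.
DEVICE_DIR=/data/local/tmp/executorch
if command -v adb >/dev/null 2>&1; then
  # Step 1: push and run a simple model to confirm the backend works.
  adb push simple_model.pte "${DEVICE_DIR}/"
  adb shell "cd ${DEVICE_DIR} && ./qnn_executor_runner --model_path simple_model.pte"
  # Step 2: only after step 1 passes, try the llama model the same way.
  adb push test.pte "${DEVICE_DIR}/"
  adb shell "cd ${DEVICE_DIR} && ./qnn_llama_runner --model_path test.pte"
else
  echo "adb not found; install Android platform-tools and connect a device"
fi
```

Running the simple example first separates "the backend toolchain is broken" from "the llama export is broken".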

> Maybe let's try step by step:
>
> 1. Are you able to run a simple model on your phone? https://github.com/pytorch/executorch/tree/main/examples/qualcomm#simple-examples-to-verify-the-backend-is-working
> 2. Are you...

Oh wait, you were generating the model with this command:

```
python -m extension.llm.export.export_llm \
  base.checkpoint="${MODEL_DIR}/consolidated.00.pth" \
  base.params="${MODEL_DIR}/params.json" \
  model.use_kv_cache=True \
  model.enable_dynamic_shape=False \
  backend.qnn.enabled=True \
  quantization.pt2e_quantize="qnn_16a4w" \
  model.dtype_override="fp32" \
  base.metadata='"{\"get_bos_id\":128000, \"get_eos_ids\":[128009, 128001]}"' \
  export.output_name="test.pte"
```

I feel like this...

Oh, I think SM8450 lacks support for weight sharing and block-wise quantization (cc: @haowhsu-quic, correct me if I'm wrong). If you want to target SM8450, you may need...

Maybe follow the instructions in https://github.com/pytorch/executorch/issues/15410