Qualcomm AI Engine Direct - alias_copy op
Summary
Remove the alias_copy op.
Test plan
Add UTs to ensure alias_copy is removed.
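A minimal, self-contained sketch of what such a UT could assert; the toy in-place rewrite below is only an illustrative stand-in for the actual Qualcomm pass:

```python
import torch
from torch.export import export


class AliasModule(torch.nn.Module):
    def forward(self, x):
        return torch.ops.aten.alias_copy.default(x) + 1


ep = export(AliasModule(), (torch.randn(2, 3),))
gm = ep.graph_module

# Toy stand-in for the real pass: rewire every alias_copy node to its
# input, then erase it from the graph.
for node in list(gm.graph.nodes):
    if node.op == "call_function" and node.target == torch.ops.aten.alias_copy.default:
        node.replace_all_uses_with(node.args[0])
        gm.graph.erase_node(node)
gm.recompile()

# The UT's core assertion: no alias_copy nodes survive the pass.
assert all(
    node.target != torch.ops.aten.alias_copy.default for node in gm.graph.nodes
)
```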
:x: 1 New Failure
As of commit 4b587c0fed1754fd5a492d34ddf6825daf50a51e with merge base 4e38f4ac541d0e81cd7b8f62660010eb7ca09a5e:
NEW FAILURE - The following job has failed:
- pull / test-llava-runner-linux / linux-job
  RuntimeError: Command docker exec -t 92c37ffcdda88d206e2c3b3eef3e8691e272d1a47ad1f315c6d306e7b8682a1d /exec failed with exit code 139 (128 + 11, i.e. the container process was killed by SIGSEGV)
Hi @cccclai, @billmguo,
This PR removes the alias_copy operation.
Please have a look.
Thanks
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
It seems like the test is failing
AssertionError: False is not true : ref_output:
tensor([[ 0.5914, -0.8622, 0.0214, -0.3349, 0.0285, -0.9833, -1.2256, 0.3349,
-0.7197, 0.3278]])
model_output:
tensor([[ 0.5914, 0.5914, -1.2185, -1.2185, 0.5914, -1.2185, -1.2185, -1.2185,
-1.2185, -1.2185]])
Thanks for sharing the results. I think this failure is flaky; I tested locally a couple of times and it passed every time. I have switched to a simpler op instead, since the purpose here is only to verify that alias_copy is handled properly. Would you mind if I push another PR later to refactor the UTs and add a new class for pass-related tests? We will need to discuss internally first how to refactor the UT classes. Thanks
Actually, the failure might be due to this change: https://github.com/pytorch/executorch/pull/10362. Can you help check whether the test still passes with https://github.com/pytorch/pytorch/pull/151436?
I have been looking at the failure and still haven't made much progress. It looks like the Qualcomm passes do something special to exported artifact state dicts down the line. Is that accurate? Can you point me to the places where you deal with module state dicts?
Hi @cccclai, @tugsbayasgalan,
Yes, I am able to reproduce the issue after bumping the torch nightly version to "dev20250422".
I believe the reason is that we convert the quantized weights back to floating point in https://github.com/pytorch/executorch/blob/d31ef13a463d2bd20ec7699bbfdbd570b4f318c4/backends/qualcomm/_passes/annotate_quant_attrs.py#L144
However, after bumping the torch version, it seems the FP weights are no longer properly saved.
We use the following functions to get and set parameters: https://github.com/pytorch/executorch/blob/d31ef13a463d2bd20ec7699bbfdbd570b4f318c4/backends/qualcomm/builders/utils.py#L30 https://github.com/pytorch/executorch/blob/d31ef13a463d2bd20ec7699bbfdbd570b4f318c4/backends/qualcomm/builders/utils.py#L47
It would be appreciated if you could share best practices for modifying parameters. Thanks
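For reference, a simplified sketch of the pattern those helpers follow and how the dequantize-and-write-back step uses them; this is illustrative, not the exact executorch code:

```python
import torch
from torch.export import ExportedProgram


def get_parameter(ep: ExportedProgram, node: torch.fx.Node) -> torch.Tensor:
    # Placeholder nodes map to parameter FQNs via the graph signature;
    # the FQN is then looked up in the program's state dict.
    name = ep.graph_signature.inputs_to_parameters[node.name]
    return ep.state_dict[name]


def set_parameter(ep: ExportedProgram, node: torch.fx.Node, value: torch.Tensor) -> None:
    # Rebinding the key mutates only *this* program's state dict; a program
    # created later (e.g. by run_decompositions) holds its own shallow copy.
    name = ep.graph_signature.inputs_to_parameters[node.name]
    ep.state_dict[name] = value


def restore_fp_weight(ep: ExportedProgram, node: torch.fx.Node,
                      scale: torch.Tensor, zero_point: torch.Tensor) -> None:
    # Affine dequantization of a quantized weight: fp = (q - zp) * scale
    q = get_parameter(ep, node)
    set_parameter(ep, node, (q.to(torch.float32) - zero_point) * scale)
```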
I am still confused about how this logic could break your flow, though. Before returning the state dict, I just do a shallow copy in export, meaning they should be the same state dict. In what part of set_parameter does it fail?
I think I might know why it is not updating, but I need some time to verify that it is actually the root cause, as I am currently blocked by something else.
When we initialize the passes, we pass the exported_program into the constructor: https://github.com/pytorch/executorch/blob/9e59c198dde0f4d4c4bb714e7f1de3b389cfc577/backends/qualcomm/_passes/qnn_pass_manager.py#L160
Next, during the lowering process, when decomposition is called, we get a new exported_program with a shallow-copied new_state_dict: https://github.com/pytorch/pytorch/blob/ad81eeb7c7c906e0cdd04a5cc8fdb9592281c317/torch/export/exported_program.py#L880
Then, when the passes run, we write our updates into the old state_dict of the old exported_program. This is why we keep getting incorrect params in the QNN op builder.
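A minimal repro of the pitfall (module and key names are illustrative):

```python
import torch
from torch.export import export


class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)


old_ep = export(M(), (torch.randn(2, 4),))   # the passes hold on to this one
new_ep = old_ep.run_decompositions()         # lowering actually uses this one

# new_ep gets a *shallow copy* of the state dict: a fresh dict object whose
# values initially point at the same tensors.
assert old_ep.state_dict is not new_ep.state_dict

# Rebinding a key on the old dict (what our passes effectively do) is not
# visible through new_ep, so the QNN op builder reads the stale weight.
old_ep.state_dict["linear.weight"] = torch.zeros(4, 4)
assert not torch.equal(new_ep.state_dict["linear.weight"],
                       old_ep.state_dict["linear.weight"])
```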
Please let me know if there's anything that is still unclear. Thanks
Ahh, this makes sense. You should always work with the state_dict of the ep you get back from run_decompositions, because the state dict can change during decomposition.
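In other words, the safe ordering looks roughly like this (the pass entry point is a hypothetical stand-in for the Qualcomm pipeline):

```python
# Decompose first, then hand the *resulting* program to the passes, so every
# parameter read/write targets the state dict the QNN op builders will see.
ep = ep.run_decompositions()
run_qualcomm_passes(ep)  # hypothetical stand-in for QnnPassManager usage
```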
@winskuo-quic Just checking in, will you be working on making the qualcomm code compatible with new export behaviour?
Hi @tugsbayasgalan, thanks for the suggestions. We will update the Qualcomm code to make it compatible with the new export behavior. As you mentioned, it is safer to work with the state dict of the ep after decomposition.
> Actually, the failure might be due to this change: #10362. Can you help check whether the test still passes with pytorch/pytorch#151436?

Any update on this change? https://github.com/pytorch/executorch/pull/10362 is currently pending due to the breakage.
Sorry for the delay in pushing the PR. We came up with a couple of solutions and were discussing internally which one best resolves the uplevel error. We will share the PR later today.
Given that CI is green again, we can resume merging PRs now. Mind rebasing? This seems to be the oldest PR that needs to land
Just rebased. Thanks for the reminder.
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.