executorch
executorch copied to clipboard
Duplicate registration of quantiation operators, e.g. quantized_decomposed::embedding_byte.out
PR https://github.com/pytorch/executorch/pull/3119 tidied up llama by removing a library build and op registration.
This seems to have interacted poorly with (broken the cmake flow):
examples/xnnpack/quantization/test_quantize.sh
resulting in duplicate registration of:
quantized_decomposed::embedding_byte.out
this works for the buck2 build, but not the cmake build of the library I assume because kernels/quantized/targets.bzl
was updated.
removing the duplicates in kernels/quantized/quantized.yaml breaks llama2 quantization runs on xnnpack. removing the duplicates in exir/passes/_quant_patterns_and_replacements.py causes llama testing to fail.
Can we look for a cleaner solution on quantized operator registration that is consistent across users? the xnnpack, arm backend and llama2 uses at least.
I have a patch demonstrating the problem here, as this is blocking a fairly large commit of Arm backend code: https://github.com/pytorch/executorch/pull/3056/commits/0790d93c7b85cc549ff1324f8b580ba92356b49d
@larryliu0820 @digantdesai - thoughts welcome, i assume propagating the buck2 fixes into the cmake build of the quantized op aot registration is the simple clean one, but how/where to do this cleanly is unclear to me.
Yeah I can take a look
I'm iterating on this PR: https://github.com/pytorch/executorch/pull/3382. This should allow us to select the ops to be registered in quantized_ops_aot_lib
Thanks, commented on a small issue on the PR.
I also find the flow with CMAKE is probably not tested even in ciflow/trunk at the moment. I've attached a small patch which enables the existing script to run, and reproduces the duplicate registration issue:
./examples/xnnpack/quantization/test_quantize.sh cmake mv2
Thanks, commented on a small issue on the PR.
I also find the flow with CMAKE is probably not tested even in ciflow/trunk at the moment. I've attached a small patch which enables the existing script to run, and reproduces the duplicate registration issue:
./examples/xnnpack/quantization/test_quantize.sh cmake mv2
Yeah thanks for giving it a try. https://github.com/pytorch/executorch/pull/3382 doesn't change existing logic so that's why you still see the error. I'm going to merge this and add an actual fix on top of it, with proper CI.
Sent out https://github.com/pytorch/executorch/pull/3449 to fix it
Thanks @larryliu0820 that fixed my problems in #3056 and the xnnpack flow works for me too. 👍