executorch Duplicate registration of quantiation operators, e.g. quantized_decomposed::embedding

Duplicate registration of quantiation operators, e.g. quantized_decomposed::embedding_byte.out

Open robell opened this issue 9 months ago • 5 comments

PR https://github.com/pytorch/executorch/pull/3119 tidied up llama by removing a library build and op registration.

This seems to have interacted poorly with (broken the cmake flow): examples/xnnpack/quantization/test_quantize.sh resulting in duplicate registration of: quantized_decomposed::embedding_byte.out

this works for the buck2 build, but not the cmake build of the library I assume because kernels/quantized/targets.bzl was updated.

removing the duplicates in kernels/quantized/quantized.yaml breaks llama2 quantization runs on xnnpack. removing the duplicates in exir/passes/_quant_patterns_and_replacements.py causes llama testing to fail.

Can we look for a cleaner solution on quantized operator registration that is consistent across users? the xnnpack, arm backend and llama2 uses at least.

I have a patch demonstrating the problem here, as this is blocking a fairly large commit of Arm backend code: https://github.com/pytorch/executorch/pull/3056/commits/0790d93c7b85cc549ff1324f8b580ba92356b49d

Apr 26 '24 10:04 robell

@larryliu0820 @digantdesai - thoughts welcome, i assume propagating the buck2 fixes into the cmake build of the quantized op aot registration is the simple clean one, but how/where to do this cleanly is unclear to me.

Apr 26 '24 10:04 robell

Yeah I can take a look

Apr 26 '24 18:04 larryliu0820

I'm iterating on this PR: https://github.com/pytorch/executorch/pull/3382. This should allow us to select the ops to be registered in quantized_ops_aot_lib

Apr 29 '24 18:04 larryliu0820

Thanks, commented on a small issue on the PR.

I also find the flow with CMAKE is probably not tested even in ciflow/trunk at the moment. I've attached a small patch which enables the existing script to run, and reproduces the duplicate registration issue: ./examples/xnnpack/quantization/test_quantize.sh cmake mv2

xnnpacktest.err.txt xnnpacktest.patch

Apr 30 '24 17:04 robell

Thanks, commented on a small issue on the PR.

I also find the flow with CMAKE is probably not tested even in ciflow/trunk at the moment. I've attached a small patch which enables the existing script to run, and reproduces the duplicate registration issue: ./examples/xnnpack/quantization/test_quantize.sh cmake mv2

xnnpacktest.err.txt xnnpacktest.patch

Yeah thanks for giving it a try. https://github.com/pytorch/executorch/pull/3382 doesn't change existing logic so that's why you still see the error. I'm going to merge this and add an actual fix on top of it, with proper CI.

Apr 30 '24 20:04 larryliu0820

Sent out https://github.com/pytorch/executorch/pull/3449 to fix it

May 01 '24 19:05 larryliu0820

Thanks @larryliu0820 that fixed my problems in #3056 and the xnnpack flow works for me too. 👍

May 02 '24 11:05 robell

executorch executorch copied to clipboard

Duplicate registration of quantiation operators, e.g. quantized_decomposed::embedding_byte.out

executorch
executorch copied to clipboard