
Quantization tests seem to fail on macOS

Open powderluv opened this issue 2 years ago • 8 comments


FAIL: TORCH_MLIR :: python/importer/jit_ir/ivalue_import/quantization.py (17 of 92)
******************** TEST 'TORCH_MLIR :: python/importer/jit_ir/ivalue_import/quantization.py' FAILED ********************
Script:
--
: 'RUN: at line 10';   /Users/anush/github/torch-mlir/mlir_venv/bin/python3.10 /Users/anush/github/torch-mlir/test/python/importer/jit_ir/ivalue_import/quantization.py | /Users/anush/github/torch-mlir/build/bin/torch-mlir-opt | FileCheck /Users/anush/github/torch-mlir/test/python/importer/jit_ir/ivalue_import/quantization.py
--
Exit Code: 1

Command Output (stderr):
--
Traceback (most recent call last):
  File "/Users/anush/github/torch-mlir/test/python/importer/jit_ir/ivalue_import/quantization.py", line 38, in <module>
    test_module = TestModule()
  File "/Users/anush/github/torch-mlir/test/python/importer/jit_ir/ivalue_import/quantization.py", line 17, in __init__
    self.linear = torch.nn.quantized.Linear(5, 2, dtype=torch.qint8)
  File "/Users/anush/github/torch-mlir/mlir_venv/lib/python3.10/site-packages/torch/nn/quantized/modules/linear.py", line 146, in __init__
    self._packed_params = LinearPackedParams(dtype)
  File "/Users/anush/github/torch-mlir/mlir_venv/lib/python3.10/site-packages/torch/nn/quantized/modules/linear.py", line 24, in __init__
    self.set_weight_bias(wq, None)
  File "/Users/anush/github/torch-mlir/mlir_venv/lib/python3.10/site-packages/torch/nn/quantized/modules/linear.py", line 29, in set_weight_bias
    self._packed_params = torch.ops.quantized.linear_prepack(weight, bias)
  File "/Users/anush/github/torch-mlir/mlir_venv/lib/python3.10/site-packages/torch/_ops.py", line 148, in __call__
    return self._op(*args, **kwargs or {})
RuntimeError: Didn't find engine for operation quantized::linear_prepack NoQEngine
/Users/anush/github/torch-mlir/test/python/importer/jit_ir/ivalue_import/quantization.py:22:11: error: CHECK: expected string not found in input
 # CHECK: %[[SCALE:.*]] = torch.constant.float
          ^
<stdin>:1:1: note: scanning from here
module {
^

Input file: <stdin>
Check file: /Users/anush/github/torch-mlir/test/python/importer/jit_ir/ivalue_import/quantization.py

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: module { 
check:22     X~~~~~~~~ error: no match found
          2: } 
check:22     ~~
          3:  
check:22     ~
>>>>>>

--

********************
********************
Failed Tests (1):
  TORCH_MLIR :: python/importer/jit_ir/ivalue_import/quantization.py


Testing Time: 4.48s
  Passed: 91
  Failed:  1
FAILED: tools/torch-mlir/test/CMakeFiles/check-torch-mlir /Users/anush/github/torch-mlir/build/tools/torch-mlir/test/CMakeFiles/check-torch-mlir 
cd /Users/anush/github/torch-mlir/build/tools/torch-mlir/test && /Users/anush/github/torch-mlir/mlir_venv/bin/python3.10 /Users/anush/github/torch-mlir/build/./bin/llvm-lit -sv /Users/anush/github/torch-mlir/build/tools/torch-mlir/test
ninja: build stopped: subcommand failed.
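One quick way to confirm the root cause of the `NoQEngine` error is to check which quantized engines the PyTorch build actually registered; a small probe, guarded in case torch is not installed in the environment:

```python
# Probe which quantized engines this PyTorch build supports. A build
# compiled without FBGEMM or QNNPACK registers no usable engine, which
# is what produces the "NoQEngine" RuntimeError above.
try:
    import torch
    engines = torch.backends.quantized.supported_engines
    print("supported quantized engines:", engines)
    if engines == ["none"]:
        print("no quantized engine available; quantized::linear_prepack will fail")
except ImportError:
    print("torch is not installed in this environment")
```

On a working x86_64 build the list typically includes `fbgemm`; on ARM builds `qnnpack` is the relevant entry.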

powderluv avatar Jul 05 '22 03:07 powderluv

Related? Maybe we need to enable QNNPACK in the libtorch build on arm macs? Or check on FBGEMM on arm? Not entirely sure.

rdadolf avatar Jul 05 '22 16:07 rdadolf

Looks like we should just XFAIL it on macOS, based on the issue you posted.
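A sketch of how that could look at the top of the test file (the `system-darwin` feature name is an assumption; the exact feature strings available depend on the lit configuration in use):

```python
# Mark this lit test as expected to fail on macOS.
# Goes near the existing RUN line in quantization.py.
# XFAIL: system-darwin
```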

powderluv avatar Jul 21 '22 05:07 powderluv

I see the same failure on AArch64 Linux when building PyTorch from source, so this appears to affect both Apple silicon Macs and AArch64 Linux environments.

u99127 avatar Aug 05 '22 21:08 u99127

It is an upstream issue. Does enabling FBGEMM help? There is a combination of FBGEMM and PyTorch QNNPACK that helps.

powderluv avatar Aug 05 '22 21:08 powderluv

build_pytorch in build_tools/build_libtorch.sh has USE_FBGEMM=ON, USE_PYTORCH_QNNPACK=ON, and USE_QNNPACK=OFF.

Maybe one should experiment with USE_QNNPACK=ON as well?

R
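A minimal sketch of that experiment, assuming build_libtorch.sh picks these flags up from the environment (worth verifying against the script itself):

```shell
# Sketch: set the PyTorch quantization backend flags before building.
# Whether build_libtorch.sh forwards them to CMake is an assumption
# to check against the script.
export USE_FBGEMM=ON
export USE_PYTORCH_QNNPACK=ON
export USE_QNNPACK=ON
echo "building with USE_QNNPACK=$USE_QNNPACK"
# ./build_tools/build_libtorch.sh   # then rebuild libtorch
```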

u99127 avatar Aug 05 '22 21:08 u99127

I tried that and it didn't seem to help. BTW, are you building mlir-tblgen for the host first?

powderluv avatar Aug 05 '22 21:08 powderluv

I'm puzzled by the mlir-tblgen comment, as my development environment is AArch64 Linux.


$> uname -a 
Linux ubuntu-linux-20-04-desktop 5.4.0-66-generic #74-Ubuntu SMP Wed Jan 27 22:56:23 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux
$> file build/bin/mlir-tblgen
bin/mlir-tblgen: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=473a8231eeead677a99bb4d8d85d648fdb0cb288, for GNU/Linux 3.7.0, not stripped

And FTR I'm building with the following options.

cmake -GNinja -Bbuild \
      -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_C_COMPILER=clang \
      -DCMAKE_CXX_COMPILER=clang++ \
      -DPython3_FIND_VIRTUALENV=ONLY \
      -DLLVM_ENABLE_PROJECTS=mlir \
      -DLLVM_EXTERNAL_PROJECTS="torch-mlir;torch-mlir-dialects" \
      -DLLVM_EXTERNAL_TORCH_MLIR_SOURCE_DIR=`pwd` \
      -DLLVM_EXTERNAL_TORCH_MLIR_DIALECTS_SOURCE_DIR=`pwd`/externals/llvm-external-projects/torch-mlir-dialects \
      -DMLIR_ENABLE_BINDINGS_PYTHON=ON \
      -DLLVM_TARGETS_TO_BUILD=host \
      -DTORCH_MLIR_USE_INSTALLED_PYTORCH=OFF \
      -DCMAKE_C_COMPILER_LAUNCHER=ccache \
      -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
      externals/llvm-project/llvm

u99127 avatar Aug 05 '22 21:08 u99127

I thought you were cross-compiling for AArch64 from x86_64; in that case you would hit https://github.com/llvm/torch-mlir/issues/1094. But it looks like you are building natively on AArch64, so you should be good.
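For reference, when cross-compiling, a host-built tblgen can be supplied via the LLVM_TABLEGEN and MLIR_TABLEGEN CMake cache variables; a sketch, with placeholder paths:

```shell
# Sketch of a cross-compile configure step (paths are placeholders):
# point the target build at generator binaries from a native host build
# so tblgen is not run as a foreign-architecture executable.
cmake -GNinja -Bbuild-aarch64 \
      -DCMAKE_TOOLCHAIN_FILE=/path/to/aarch64-toolchain.cmake \
      -DLLVM_TABLEGEN=/path/to/host-build/bin/llvm-tblgen \
      -DMLIR_TABLEGEN=/path/to/host-build/bin/mlir-tblgen \
      externals/llvm-project/llvm
```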

powderluv avatar Aug 06 '22 02:08 powderluv