
Disable TORCH_MLIR_ENABLE_PYTORCH_EXTENSIONS

Open AmosLewis opened this issue 1 year ago • 15 comments

Related discussion: https://discourse.llvm.org/t/drastically-reducing-documented-scope-of-project/80484

This is to fix the "No module named 'torch_mlir._mlir_libs._jit_ir_importer'" bug that shows up when testing PyTorch-related models such as pytorch/models/mit-b0:

python ./run.py  --tolerance 0.001 0.001 --cachedir /proj/gdba/shark/cache --ireebuild ../../iree-build -f pytorch -g models --mode onnx --report --tests  pytorch/models/mit-b0 
Starting e2eshark tests. Using 4 processes
Cache Directory: /proj/gdba/shark/cache
Tolerance for comparing floating point (atol, rtol) = (0.001, 0.001)
Note: No Torch MLIR build provided using --torchmlirbuild. iree-import-onnx will be used to convert onnx to torch onnx mlir
IREE build: /proj/gdba/shark/chi/src/iree-build
Test run directory: /proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/test-run
Since --tests or --testsfile was specified, --groups ignored
Framework:pytorch mode=onnx backend=llvm-cpu runfrom=model-run runupto=inference
Test list: ['pytorch/models/mit-b0']
Test pytorch/models/mit-b0 failed [model-run]
Generated status report /proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/test-run/statusreport.md
Generated time report /proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/test-run/timereport.md
Generated summary report /proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/test-run/summaryreport.md

If I use:

pip install \                                                                     
            --find-links https://github.com/llvm/torch-mlir-release/releases/expanded_assets/dev-wheels \
            --upgrade \
            torch-mlir

pip list:

torch-mlir         20240820.189

Error in model-run.log

python runmodel.py  --torchmlirimport fximport --todtype default --mode onnx --outfileprefix mit-b0 1> model-run.log 2>&1
...
Traceback (most recent call last):
  File "/proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/test-run/pytorch/models/mit-b0/runmodel.py", line 78, in <module>
    from torch_mlir import torchscript
  File "/proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/e2e_venv/lib/python3.10/site-packages/torch_mlir/torchscript.py", line 25, in <module>
    from torch_mlir.jit_ir_importer import ClassAnnotator, ImportOptions, ModuleBuilder
  File "/proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/e2e_venv/lib/python3.10/site-packages/torch_mlir/jit_ir_importer/__init__.py", line 14, in <module>
    from .._mlir_libs._jit_ir_importer import *
ModuleNotFoundError: No module named 'torch_mlir._mlir_libs._jit_ir_importer'

The error disappears if I pip uninstall torch-mlir and use my local 0820 build with this PYTHONPATH export: export PYTHONPATH=${PYTHONPATH}:/proj/gdba/shark/chi/src/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir
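
For anyone hitting the same thing, here is a minimal sketch (my own helper, not part of torch-mlir) to probe whether the installed torch-mlir package actually ships the JIT IR importer extension before going down the torchscript path:

```python
# Minimal sketch (not part of torch-mlir): check whether the installed
# torch-mlir package ships the optional JIT IR importer extension that
# torch_mlir.torchscript needs. Some wheels are built without it.
import importlib.util

def has_jit_ir_importer() -> bool:
    try:
        spec = importlib.util.find_spec("torch_mlir._mlir_libs._jit_ir_importer")
    except ModuleNotFoundError:
        # torch_mlir (or its _mlir_libs package) is not installed at all.
        return False
    return spec is not None

if __name__ == "__main__":
    print("jit_ir_importer available:", has_jit_ir_importer())
```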

AmosLewis avatar Aug 20 '24 23:08 AmosLewis

I get an error: No module named 'torch_mlir_e2e_test'

  FAIL: TORCH_MLIR :: python/fx_importer/sparsity/sparse_test.py (95 of 105)
  ******************** TEST 'TORCH_MLIR :: python/fx_importer/sparsity/sparse_test.py' FAILED ********************
  Exit Code: 2
  
  Command Output (stderr):
  --
  RUN: at line 6: /opt/python/cp311-cp311/bin/python /_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py | FileCheck /_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py
  + FileCheck /_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py
  + /opt/python/cp311-cp311/bin/python /_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py
  Traceback (most recent call last):
    File "/_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py", line 19, in <module>
      from torch_mlir_e2e_test.linalg_on_tensors_backends.refbackend import (
  ModuleNotFoundError: No module named 'torch_mlir_e2e_test'
  FileCheck error: '<stdin>' is empty.
  FileCheck command line:  FileCheck /_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py

@stellaraccident Any suggestions? Should we just disable TORCH_MLIR_ENABLE_JIT_IR_IMPORTER?

AmosLewis avatar Aug 21 '24 01:08 AmosLewis

Yeah, you can try that. I think we could also just move the e2e test tools out of that and into the main project. Would need to look at it some more.

stellaraccident avatar Aug 21 '24 01:08 stellaraccident

> Yeah, you can try that. I think we could also just move the e2e test tools out of that and into the main project. Would need to look at it some more.

Run Linalg e2e integration tests
  Traceback (most recent call last):
    File "<frozen runpy>", line 198, in _run_module_as_main
    File "<frozen runpy>", line 88, in _run_code
    File "/_work/torch-mlir/torch-mlir/projects/pt1/e2e_testing/main.py", line 20, in <module>
      from torch_mlir_e2e_test.configs import (
    File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/__init__.py", line [7](https://github.com/llvm/torch-mlir/actions/runs/10482533744/job/29033805586?pr=3654#step:8:8), in <module>
      from .linalg_on_tensors_backend import LinalgOnTensorsBackendTestConfig
    File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/linalg_on_tensors_backend.py", line [9](https://github.com/llvm/torch-mlir/actions/runs/10482533744/job/29033805586?pr=3654#step:8:10), in <module>
      from torch_mlir import torchscript
    File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir/torchscript.py", line 25, in <module>
      from torch_mlir.jit_ir_importer import ClassAnnotator, ImportOptions, ModuleBuilder
  ModuleNotFoundError: No module named 'torch_mlir.jit_ir_importer'

@stellaraccident The default tosa/stablehlo/linalg e2e tests use from torch_mlir import torchscript in their test configs, so they will make the CI fail. Is it OK to just delete these 3 default e2e tests, since we already have fx_importer/fx_importer_stablehlo/fx_importer_tosa?
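
For reference, a rough sketch of what the fx_importer path looks like, assuming the torch_mlir.fx.export_and_import API from a recent build (the exact signature may differ between versions); this is the direction the replacement configs take instead of torch_mlir.torchscript:

```python
# Rough sketch of the fx_importer flow (assumes torch_mlir.fx.export_and_import
# from a recent torch-mlir build; this is not the exact e2e-test config code).
import torch
from torch_mlir import fx

class AddOne(torch.nn.Module):
    def forward(self, x):
        return x + 1.0

# torch.export-based import into the torch dialect; the fx_importer e2e
# configs then lower further (e.g. to linalg, tosa, or stablehlo).
module = fx.export_and_import(AddOne(), torch.ones(3))
print(module)
```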

AmosLewis avatar Aug 21 '24 21:08 AmosLewis

Yes, go ahead and delete. We can't support the old thing anymore and have replacements.

stellaraccident avatar Aug 21 '24 21:08 stellaraccident

Run Linalg e2e integration tests
  Traceback (most recent call last):
    File "<frozen runpy>", line 19[8](https://github.com/llvm/torch-mlir/actions/runs/10498026005/job/29082120990?pr=3654#step:8:9), in _run_module_as_main
    File "<frozen runpy>", line 88, in _run_code
    File "/_work/torch-mlir/torch-mlir/projects/pt1/e2e_testing/main.py", line 20, in <module>
      from torch_mlir_e2e_test.configs import (
    File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/__init__.py", line 8, in <module>
      from .onnx_backend import OnnxBackendTestConfig
    File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/onnx_backend.py", line 16, in <module>
      from torch_mlir_e2e_test.utils import convert_annotations_to_placeholders
    File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/utils.py", line 6, in <module>
      from torch_mlir.torchscript import TensorPlaceholder
    File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir/torchscript.py", line 25, in <module>
      from torch_mlir.jit_ir_importer import ClassAnnotator, ImportOptions, ModuleBuilder
  ModuleNotFoundError: No module named 'torch_mlir.jit_ir_importer'
    File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/__init__.py", line 9, in <module>
      from .torchdynamo import TorchDynamoTestConfig
    File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/torchdynamo.py", line 25, in <module>
      from torch_mlir.torchscript import (
    File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir/torchscript.py", line 25, in <module>
      from torch_mlir.jit_ir_importer import ClassAnnotator, ImportOptions, ModuleBuilder
  ModuleNotFoundError: No module named 'torch_mlir.jit_ir_importer'

We need to disable the onnx/onnx_tosa/torchdynamo e2e tests as well, but these do not have a replacement.

AmosLewis avatar Aug 21 '24 22:08 AmosLewis

main.py: error: argument -c/--config: invalid choice: 'linalg'
 (choose from 'native_torch', 'torchscript', 'lazy_tensor_core', 'onnx',
  'onnx_tosa', 'fx_importer', 'fx_importer_stablehlo', 'fx_importer_tosa')

We need to change the CI script test_posix.sh. If I switch linalg/tosa/stablehlo to their fx_importer versions, 5 tests fail with the torch-stable version ("IsFloatingPointFloat_True", "IsFloatingPointInt_False", "ScalarConstantTupleModule_basic", "TorchPrimLoopForLikeModule_basic", "TorchPrimLoopWhileLikeModule_basic") while torch-nightly passes. So I just removed them from the torch-stable run in test_posix.sh.
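
The actual exclusion lives in test_posix.sh, but as an illustration only (a hypothetical helper, not the real CI code), the stable-vs-nightly split could look like this in Python:

```python
# Hypothetical helper (illustration only, not the actual test_posix.sh logic):
# drop the tests that only fail with the stable torch wheel when running the
# fx_importer configs, and keep the full list on nightly.
import torch

STABLE_ONLY_FAILURES = {
    "IsFloatingPointFloat_True",
    "IsFloatingPointInt_False",
    "ScalarConstantTupleModule_basic",
    "TorchPrimLoopForLikeModule_basic",
    "TorchPrimLoopWhileLikeModule_basic",
}

def select_tests(test_names):
    # Nightly torch versions carry a ".dev" marker in torch.__version__.
    is_nightly = "dev" in torch.__version__
    if is_nightly:
        return list(test_names)
    return [name for name in test_names if name not in STABLE_ONLY_FAILURES]
```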

AmosLewis avatar Aug 21 '24 23:08 AmosLewis

 Check that update_abstract_interp_lib.sh has been run
  /opt/python/cp311-cp311/bin/python: Error while finding module specification for 'torch_mlir.jit_ir_importer.build_tools.abstract_interp_lib_gen' (ModuleNotFoundError: No module named 'torch_mlir.jit_ir_importer')
  Error: Process completed with exit code 1.

This error shows up in torch-nightly; we need to remove the torch_mlir.jit_ir_importer reference from the .sh file.

AmosLewis avatar Aug 22 '24 00:08 AmosLewis

@stellaraccident need a review.

AmosLewis avatar Aug 22 '24 00:08 AmosLewis

There's another weird thing about torch_mlir._mlir_libs._jit_ir_importer: the lib is built empty, at least that's what I observe in my local build. And that does cause problems in macOS builds (not Linux!). More details are here: #3663

We should possibly remove this as part of this PR as well? https://github.com/llvm/torch-mlir/blob/9a6fe58a027d701eff6799e86a65535a8c2f3708/setup.py#L238-L242

@stellaraccident any opinion on that?

dbabokin avatar Aug 23 '24 22:08 dbabokin

> There's another weird thing about torch_mlir._mlir_libs._jit_ir_importer: the lib is built empty, at least that's what I observe in my local build. And that does cause problems in macOS builds (not Linux!). More details are here: #3663
>
> We should possibly remove this as part of this PR as well?
>
> https://github.com/llvm/torch-mlir/blob/9a6fe58a027d701eff6799e86a65535a8c2f3708/setup.py#L238-L242
>
> @stellaraccident any opinion on that?

Yeah. Let's get the last two code generation things here separated and then do a full excision.

stellaraccident avatar Aug 24 '24 18:08 stellaraccident

> We need to disable the onnx/onnx_tosa/torchdynamo e2e tests as well, but these do not have a replacement.

Just disable/remove it. I wasn't aware a new dep like this was added, and we've been quite clear we're moving away from this. Was probably just an oversight and using the wrong thing -- the folks doing that will need to upgrade.

stellaraccident avatar Aug 24 '24 18:08 stellaraccident

> We need to disable the onnx/onnx_tosa/torchdynamo e2e tests as well, but these do not have a replacement.

In #3668 the symbols required by the onnx e2e test have been extracted to a common interface. It should no longer depend on jit_ir_importer.

penguin-wwy avatar Aug 27 '24 15:08 penguin-wwy

> We need to disable the onnx/onnx_tosa/torchdynamo e2e tests as well, but these do not have a replacement.

> In #3668 the symbols required by the onnx e2e test have been extracted to a common interface. It should no longer depend on jit_ir_importer.

Thank you for catching/fixing that. I had missed it.

stellaraccident avatar Aug 27 '24 15:08 stellaraccident

> I think we need to come up with a replacement for update_torch_ods.sh and possibly update_abstract_interp_lib.sh before landing this. I believe that it is mostly historical that both of them rely on that one method in the JitIR extension to get the op registry, and I believe there are more direct ways to go about that these days. Been on my list for a very long time to research this... If I recall, the method they rely on is just using a C++ API to get all of the schemas and then putting them together into a JSON struct for the code generators to use. There may be a comparable API on the Python side these days, or worst case, we could just parse the op definition yaml files like PyTorch itself does. Probably not a lot of work but may take some digging.

@stellaraccident Based on reading the code, the call path I found for both .sh scripts is:

  1. update_torch_ods.sh -> torch_ods_gen.py -> registry.py Registry.load() -> get_registered_ops.cpp getRegisteredOps() -> pytorch/torch/csrc/jit/runtime/operator.cpp torch::jit::getAllOperators()

  2. update_abstract_interp_lib.sh -> abstract_interp_lib_gen.py -> library_generator.py -> registry.py Registry.load() -> get_registered_ops.cpp getRegisteredOps() -> pytorch/torch/csrc/jit/runtime/operator.cpp torch::jit::getAllOperators()

I guess the one method in the JitIR extension to get the op registry you mentioned is torch::jit::getAllOperators(). A comparable API on the Python side is probably torch._C._jit_get_all_schemas()? With these materials, and since people still use these .sh scripts, we could rewrite get_registered_ops.cpp in Python and call it from registry.py. Do you think it would be good to move build_tools outside of jit_ir_importer/?
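
As a quick check that the Python side exposes the same information, here is a small sketch using torch._C._jit_get_all_schemas() (an internal PyTorch API, so its exact behavior may change across releases):

```python
# Sketch: enumerate registered op schemas from Python instead of going through
# get_registered_ops.cpp / torch::jit::getAllOperators(). Uses the internal
# torch._C._jit_get_all_schemas() API, which may change between releases.
import torch

schemas = torch._C._jit_get_all_schemas()
aten_schemas = [s for s in schemas if s.name.startswith("aten::")]
print(f"{len(aten_schemas)} aten op schemas registered")
print(aten_schemas[0])  # prints the full schema string, e.g. "aten::...(...) -> ..."
```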

Another issue you didn't mention is in the call path: 3. update_abstract_interp_lib.sh -> abstract_interp_lib_gen.py -> library_generator.py -> module_builder.h. Are we also going to rewrite the module_builder with a comparable Python-side API, since we want to get rid of building all of the .h/.cpp files under csrc/jit_ir_importer/?

AmosLewis avatar Sep 06 '24 10:09 AmosLewis

A pure-Python ODS implementation will be in PR https://github.com/llvm/torch-mlir/pull/3780.

AmosLewis avatar Oct 10 '24 15:10 AmosLewis