Disable TORCH_MLIR_ENABLE_PYTORCH_EXTENSIONS
Related discussion: https://discourse.llvm.org/t/drastically-reducing-documented-scope-of-project/80484
To fix the "No module named 'torch_mlir._mlir_libs._jit_ir_importer'" bug when testing PyTorch-related models like pytorch/models/mit-b0:
python ./run.py --tolerance 0.001 0.001 --cachedir /proj/gdba/shark/cache --ireebuild ../../iree-build -f pytorch -g models --mode onnx --report --tests pytorch/models/mit-b0
Starting e2eshark tests. Using 4 processes
Cache Directory: /proj/gdba/shark/cache
Tolerance for comparing floating point (atol, rtol) = (0.001, 0.001)
Note: No Torch MLIR build provided using --torchmlirbuild. iree-import-onnx will be used to convert onnx to torch onnx mlir
IREE build: /proj/gdba/shark/chi/src/iree-build
Test run directory: /proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/test-run
Since --tests or --testsfile was specified, --groups ignored
Framework:pytorch mode=onnx backend=llvm-cpu runfrom=model-run runupto=inference
Test list: ['pytorch/models/mit-b0']
Test pytorch/models/mit-b0 failed [model-run]
Generated status report /proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/test-run/statusreport.md
Generated time report /proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/test-run/timereport.md
Generated summary report /proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/test-run/summaryreport.md
If I use:
pip install \
--find-links https://github.com/llvm/torch-mlir-release/releases/expanded_assets/dev-wheels \
--upgrade \
torch-mlir
pip list:
torch-mlir 20240820.189
Error in model-run.log:
python runmodel.py --torchmlirimport fximport --todtype default --mode onnx --outfileprefix mit-b0 1> model-run.log 2>&1
...
Traceback (most recent call last):
File "/proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/test-run/pytorch/models/mit-b0/runmodel.py", line 78, in <module>
from torch_mlir import torchscript
File "/proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/e2e_venv/lib/python3.10/site-packages/torch_mlir/torchscript.py", line 25, in <module>
from torch_mlir.jit_ir_importer import ClassAnnotator, ImportOptions, ModuleBuilder
File "/proj/gdba/shark/chi/src/SHARK-TestSuite/e2eshark/e2e_venv/lib/python3.10/site-packages/torch_mlir/jit_ir_importer/__init__.py", line 14, in <module>
from .._mlir_libs._jit_ir_importer import *
ModuleNotFoundError: No module named 'torch_mlir._mlir_libs._jit_ir_importer'
The error disappears if I pip uninstall torch-mlir and use my local build (0820) with this patch: export PYTHONPATH=${PYTHONPATH}:/proj/gdba/shark/chi/src/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir
Then I get another error: No module named 'torch_mlir_e2e_test'
FAIL: TORCH_MLIR :: python/fx_importer/sparsity/sparse_test.py (95 of 105)
******************** TEST 'TORCH_MLIR :: python/fx_importer/sparsity/sparse_test.py' FAILED ********************
Exit Code: 2
Command Output (stderr):
--
RUN: at line 6: /opt/python/cp311-cp311/bin/python /_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py | FileCheck /_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py
+ FileCheck /_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py
+ /opt/python/cp311-cp311/bin/python /_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py
Traceback (most recent call last):
File "/_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py", line 19, in <module>
from torch_mlir_e2e_test.linalg_on_tensors_backends.refbackend import (
ModuleNotFoundError: No module named 'torch_mlir_e2e_test'
FileCheck error: '<stdin>' is empty.
FileCheck command line: FileCheck /_work/torch-mlir/torch-mlir/test/python/fx_importer/sparsity/sparse_test.py
@stellaraccident Any suggestions, should we just disable the TORCH_MLIR_ENABLE_JIT_IR_IMPORTER?
Yeah, you can try that. I think we could also just move the e2e test tools out of that and into the main project. Would need to look at it some more.
Run Linalg e2e integration tests
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/_work/torch-mlir/torch-mlir/projects/pt1/e2e_testing/main.py", line 20, in <module>
from torch_mlir_e2e_test.configs import (
File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/__init__.py", line 7, in <module>
from .linalg_on_tensors_backend import LinalgOnTensorsBackendTestConfig
File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/linalg_on_tensors_backend.py", line 9, in <module>
from torch_mlir import torchscript
File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir/torchscript.py", line 25, in <module>
from torch_mlir.jit_ir_importer import ClassAnnotator, ImportOptions, ModuleBuilder
ModuleNotFoundError: No module named 'torch_mlir.jit_ir_importer'
@stellaraccident The tosa/stablehlo/linalg default e2e tests use "from torch_mlir import torchscript" during test config, so they will make the CI tests fail. Is it OK to just delete all 3 of these default e2e tests, since we already have fx_importer/fx_importer_stablehlo/fx_importer_tosa?
Yes, go ahead and delete. We can't support the old thing anymore and have replacements.
Run Linalg e2e integration tests
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/_work/torch-mlir/torch-mlir/projects/pt1/e2e_testing/main.py", line 20, in <module>
from torch_mlir_e2e_test.configs import (
File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/__init__.py", line 8, in <module>
from .onnx_backend import OnnxBackendTestConfig
File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/onnx_backend.py", line 16, in <module>
from torch_mlir_e2e_test.utils import convert_annotations_to_placeholders
File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/utils.py", line 6, in <module>
from torch_mlir.torchscript import TensorPlaceholder
File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir/torchscript.py", line 25, in <module>
from torch_mlir.jit_ir_importer import ClassAnnotator, ImportOptions, ModuleBuilder
ModuleNotFoundError: No module named 'torch_mlir.jit_ir_importer'
File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/__init__.py", line 9, in <module>
from .torchdynamo import TorchDynamoTestConfig
File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir_e2e_test/configs/torchdynamo.py", line 25, in <module>
from torch_mlir.torchscript import (
File "/_work/torch-mlir/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir/torch_mlir/torchscript.py", line 25, in <module>
from torch_mlir.jit_ir_importer import ClassAnnotator, ImportOptions, ModuleBuilder
ModuleNotFoundError: No module named 'torch_mlir.jit_ir_importer'
Need to disable the onnx/onnx_tosa/torchdynamo e2e tests as well, but these do not have replacements.
main.py: error: argument -c/--config: invalid choice: 'linalg'
(choose from 'native_torch', 'torchscript', 'lazy_tensor_core', 'onnx',
'onnx_tosa', 'fx_importer', 'fx_importer_stablehlo', 'fx_importer_tosa')
Need to change the CI script test_posix.sh. If I change linalg/tosa/stablehlo to their fx_importer versions, 5 tests fail ("IsFloatingPointFloat_True", "IsFloatingPointInt_False", "ScalarConstantTupleModule_basic", "TorchPrimLoopForLikeModule_basic", "TorchPrimLoopWhileLikeModule_basic") on the torch-stable version while torch-nightly passes. So I just deleted them from the torch-stable section of test_posix.sh.
Check that update_abstract_interp_lib.sh has been run
/opt/python/cp311-cp311/bin/python: Error while finding module specification for 'torch_mlir.jit_ir_importer.build_tools.abstract_interp_lib_gen' (ModuleNotFoundError: No module named 'torch_mlir.jit_ir_importer')
Error: Process completed with exit code 1.
This error appears in torch-nightly; need to delete the torch_mlir.jit_ir_importer reference in the .sh file.
@stellaraccident need a review.
There's another weird thing about torch_mlir._mlir_libs._jit_ir_importer: the lib is built empty, at least that's what I observe during my local build. And that does cause problems in macOS builds (not Linux!). More details are here: #3663
We should possibly remove this as part of this PR as well? https://github.com/llvm/torch-mlir/blob/9a6fe58a027d701eff6799e86a65535a8c2f3708/setup.py#L238-L242
@stellaraccident any opinion on that?
Yeah. Let's get the last two code generation things here separated and then do a full excision.
(quoting the onnx/torchdynamo tracebacks above) Need to disable the onnx/onnx_tosa/torchdynamo e2e tests as well, but these do not have replacements.
Just disable/remove it. I wasn't aware a new dep like this was added, and we've been quite clear we're moving away from this. Was probably just an oversight and using the wrong thing -- the folks doing that will need to upgrade.
(quoting the onnx/torchdynamo tracebacks above) Need to disable the onnx/onnx_tosa/torchdynamo e2e tests as well, but these do not have replacements.
In #3668 the symbols required by the onnx e2e tests have been extracted to a common interface. Now they should no longer depend on jit_ir_importer.
Thank you for catching/fixing that. I had missed it.
I think we need to come up with a replacement for update_torch_ods.sh and possibly update_abstract_interp_lib.sh before landing this. I believe that it is mostly historical that both of them rely on that one method in the JitIR extension to get the op registry, and I believe there are more direct ways to go about that these days. Been on my list for a very long time to research this... If I recall the method they rely on is just using a C++ API to get all of the schemas and then putting them together into a JSON struct for the code generators to use. There may be a comparative API on the Python side these days, or worst case, we could just parse the op definition yaml files like PyTorch itself does. Probably not a lot of work but may take some digging.
@stellaraccident Based on reading the code, the call paths for both .sh scripts are:
1. update_torch_ods.sh -> torch_ods_gen.py -> registry.py Registry.load() -> get_registered_ops.cpp getRegisteredOps() -> pytorch/torch/csrc/jit/runtime/operator.cpp torch::jit::getAllOperators()
2. update_abstract_interp_lib.sh -> abstract_interp_lib_gen.py -> library_generator.py -> registry.py Registry.load() -> get_registered_ops.cpp getRegisteredOps() -> pytorch/torch/csrc/jit/runtime/operator.cpp torch::jit::getAllOperators()
I guess the one method in the JitIR extension to get the op registry that you mentioned is torch::jit::getAllOperators(). A comparable API on the Python side is probably torch._C._jit_get_all_schemas()?
With these materials, since people still use these .sh scripts, we could rewrite get_registered_ops.cpp in Python and call it from registry.py. Do you think it would be good to move build_tools out of jit_ir_importer/?
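To illustrate the idea, a hedged sketch (not the actual torch-mlir implementation) of what a pure-Python registry loader could look like, assuming torch._C._jit_get_all_schemas() exposes the same operator schemas that torch::jit::getAllOperators() does on the C++ side:

```python
# Sketch only: a possible pure-Python replacement for get_registered_ops.cpp.
# torch._C._jit_get_all_schemas() returns FunctionSchema objects for every
# operator registered with the JIT runtime.
import torch


def load_registry():
    """Return (qualified name, overload name, argument names) per registered op."""
    ops = []
    for schema in torch._C._jit_get_all_schemas():
        ops.append(
            (
                schema.name,            # e.g. "aten::add"
                schema.overload_name,   # e.g. "Tensor"
                [arg.name for arg in schema.arguments],
            )
        )
    return ops


ops = load_registry()
names = {name for name, _, _ in ops}
print(len(ops), "aten::add" in names)
```

This only covers the schema-enumeration half; the JSON serialization that the code generators consume would still need to be ported over.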
Another issue you didn't mention is in the call path:
3. update_abstract_interp_lib.sh -> abstract_interp_lib_gen.py -> library_generator.py -> module_builder.h
Are we also going to rewrite module_builder with a comparable Python-side API, since we want to get rid of building all the .h/.cpp files under csrc/jit_ir_importer/?
The pure-Python ODS implementation will be in PR https://github.com/llvm/torch-mlir/pull/3780