
DistilBert fails to lower through torch-mlir pass pipeline (illegal ops)

Open · monorimet opened this issue on Jul 28, 2022 · 4 comments

Error output:

error: failed to legalize operation 'torch.aten.view' that was explicitly marked illegal
note: see current operation: %416 = "torch.aten.view"(%414, %415) : (!torch.vtensor<[?,?,768],f32>, !torch.list<int>) -> !torch.vtensor<[?,?,12,64],f32>                                                                                   
Traceback (most recent call last):
  File "/home/ean/SHARK/generate_sharktank.py", line 180, in <module>
    save_torch_model(args.torch_model_csv)
  File "/home/ean/SHARK/generate_sharktank.py", line 68, in save_torch_model
    mlir_importer.import_debug(
  File "/home/ean/SHARK/shark/shark_importer.py", line 163, in import_debug
    imported_mlir = self.import_mlir(
  File "/home/ean/SHARK/shark/shark_importer.py", line 109, in import_mlir
    return self._torch_mlir(is_dynamic, tracing_required), func_name
  File "/home/ean/SHARK/shark/shark_importer.py", line 74, in _torch_mlir
    return get_torch_mlir_module(
  File "/home/ean/SHARK/shark/torch_mlir_utils.py", line 150, in get_torch_mlir_module
    pm.run(mb.module)
RuntimeError: Failure while executing pass pipeline.

Reproduce:

  • add distilbert-base-uncased,True,hf to tank/pytorch/torch_model_list.csv
  • run python generate_sharktank.py (a standalone sketch of the equivalent torch-mlir call follows this list)
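
For reference, here is a minimal standalone sketch of the same import path, without going through generate_sharktank.py. It assumes `transformers` plus a mid-2022 torch-mlir snapshot wheel; the wrapper class, the sequence-classification head, and the exact `torch_mlir.compile` keyword arguments (`use_tracing`, `ignore_traced_shapes`) are illustrative assumptions, not what shark_importer.py does internally.

```python
# Hedged reproduction sketch -- not the SHARK importer itself.
# Assumes `pip install transformers` and a torch-mlir snapshot wheel
# from around Aug 2022. Names below (HfWrapper, `example`) are
# illustrative, not taken from generate_sharktank.py.
import torch
import torch_mlir
from transformers import AutoModelForSequenceClassification


class HfWrapper(torch.nn.Module):
    """Return a plain tensor so tracing sees a single tensor output."""

    def __init__(self):
        super().__init__()
        self.model = AutoModelForSequenceClassification.from_pretrained(
            "distilbert-base-uncased", return_dict=False
        )

    def forward(self, input_ids):
        return self.model(input_ids)[0]


model = HfWrapper().eval()
example = torch.randint(0, 30522, (1, 128))  # (batch, seq_len) token ids

# The CSV entry sets the dynamic flag to True, so mark batch and
# sequence dims dynamic via a TensorPlaceholder.
placeholder = torch_mlir.TensorPlaceholder.like(example, dynamic_axes=[0, 1])

# With tracing plus dynamic placeholders, the mid-2022 API requires
# ignore_traced_shapes=True; adjust if your torch-mlir build differs.
module = torch_mlir.compile(
    model,
    placeholder,
    output_type=torch_mlir.OutputType.LINALG_ON_TENSORS,
    use_tracing=True,
    ignore_traced_shapes=True,
)
```

With the dynamic axes enabled, this should exercise the same Torch-to-linalg lowering that produces the torch.aten.view error above.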

monorimet · Jul 28 '22 22:07

@monorimet Could you try with this PR: https://github.com/llvm/torch-mlir/pull/1168? The problem is with dynamic shapes.
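
One way to sanity-check the dynamic-shape diagnosis is to compile the same model with fully static shapes; a sketch, reusing `model` and `example` from the reproduction sketch in the issue body:

```python
# Sketch: same model, but static shapes (a plain tensor instead of a
# TensorPlaceholder with dynamic axes). If the failure really is
# dynamic-shape specific, this variant should lower cleanly.
import torch_mlir

static_module = torch_mlir.compile(
    model,    # from the reproduction sketch in the issue body
    example,  # concrete (1, 128) input => all dims static
    output_type=torch_mlir.OutputType.LINALG_ON_TENSORS,
    use_tracing=True,
)
print("static-shape lowering succeeded")
```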

pashu123 · Aug 11 '22 05:08

Also, you need the latest torch-mlir, one that includes this commit: https://github.com/llvm/torch-mlir/commit/b1a506624ce64a8f3ae2f878d5d35cd3f0dfae1d
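
A quick way to see which build you have (a sketch; it assumes torch-mlir was installed from the nightly snapshot wheels, whose version strings are date-based, so the version shows whether your snapshot postdates that commit):

```python
# Sketch: print the installed torch-mlir snapshot version
# (snapshot wheels use date-based versions, e.g. 20220811.xxx).
from importlib.metadata import version

print("torch-mlir version:", version("torch-mlir"))
```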

pashu123 · Aug 11 '22 05:08

Traceback (most recent call last):
  File "/home/ean/SHARK/generate_sharktank.py", line 222, in <module>
    save_torch_model(args.torch_model_csv)
  File "/home/ean/SHARK/generate_sharktank.py", line 75, in save_torch_model
    mlir_importer.import_debug(
  File "/home/ean/SHARK/shark/shark_importer.py", line 163, in import_debug
    imported_mlir = self.import_mlir(
  File "/home/ean/SHARK/shark/shark_importer.py", line 109, in import_mlir
    return self._torch_mlir(is_dynamic, tracing_required), func_name
  File "/home/ean/SHARK/shark/shark_importer.py", line 74, in _torch_mlir
    return get_torch_mlir_module(
  File "/home/ean/SHARK/shark/torch_mlir_utils.py", line 65, in get_torch_mlir_module
    module = torch_mlir.compile(
  File "/home/ean/SHARK/shark.venv/lib/python3.10/site-packages/torch_mlir/__init__.py", line 217, in compile
    run_pipeline_with_repro_report(mb.module,
  File "/home/ean/SHARK/shark.venv/lib/python3.10/site-packages/torch_mlir/compiler_utils.py", line 73, in run_pipeline_with_repro_report
    raise TorchMlirCompilerError(trimmed_message) from None
torch_mlir.compiler_utils.TorchMlirCompilerError: Lowering TorchScript IR -> Torch Backend IR failed with the following diagnostics:
error: unsupported by backend contract: type '!torch.Device'
note: see current operation: %126 = "torch.constant.device"() {value = "cpu"} : () -> !torch.Device

I haven't tried the given PR yet, but the error is different now. Any ideas?

monorimet · Aug 18 '22 22:08

This seems related to the error @gpetters94 is also having: https://discord.com/channels/636084430946959380/742573221882364009/1009709228950294529. It would be useful to dump the MLIR. It's possible that we don't have a decomposition/canonicalization/folder for an op that takes a !torch.Device as input, so that value never gets removed.
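
In case it helps, a sketch of one way to dump that MLIR. It assumes the torch-mlir Python API in use has OutputType.RAW (which stops right after TorchScript import, before the backend-contract pipeline that is currently failing); `model` and `placeholder` are from the reproduction sketch in the issue body.

```python
# Sketch: dump the imported IR before the failing backend-contract
# pipeline and list the ops touching the !torch.Device value.
import torch_mlir

raw_module = torch_mlir.compile(
    model,
    placeholder,
    output_type=torch_mlir.OutputType.RAW,  # stop after TorchScript import
    use_tracing=True,
    ignore_traced_shapes=True,
)

ir = str(raw_module)
with open("distilbert.raw.mlir", "w") as f:
    f.write(ir)

# The producers/consumers of the !torch.Device value are the candidates
# that may be missing a decomposition/canonicalization/folder.
for line in ir.splitlines():
    if "torch.constant.device" in line or "!torch.Device" in line:
        print(line)
```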

ramiro050 · Aug 18 '22 22:08