openvino
[Bug]: `AttributeError: 'SymInt' object has no attribute 'size'` error when using `torch.compile` for LLaVA model
OpenVINO Version
2023.3
Operating System
Fedora Silverblue 39
Device used for inference
GPU
Framework
PyTorch
Model used
llava-hf/llava-1.5-7b-hf
Issue description
Hello, I'm trying to use OpenVINO with torch.compile to run inference on a LLaVA model with the following code:
from PIL import Image
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration, BatchFeature
import openvino.torch
model_id = "/mnt/external2/LLMs/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
prompt = "<image>\n"
prompt += "USER: What are the things I should be cautious about when I visit this place?\nASSISTANT:"
image_file = "./view.jpg"
model = LlavaForConditionalGeneration.from_pretrained(model_id, low_cpu_mem_usage=True).eval()
print("Compiling...")
model.generate = torch.compile(model.generate, backend="openvino",
                               options={"device": "GPU.1", "model_caching": True})
raw_image = Image.open(image_file)
inputs = processor(prompt, raw_image, return_tensors='pt')
print("Generating...")
output = model.generate(**inputs, max_new_tokens=200)
print("Decoding...")
print(processor.decode(output[0][2:], skip_special_tokens=True))
and it prints the following error:
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] TRACED GRAPH
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] ===== __compiled_fn_35 =====
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] <eval_with_key>.93 class GraphModule(torch.nn.Module):
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] def forward(self, s0 : torch.SymInt, L_input_ids_ : torch.Tensor, s1 : torch.SymInt, L_attention_mask_ : torch.Tensor):
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] l_input_ids_ = L_input_ids_
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] l_attention_mask_ = L_attention_mask_
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG]
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] # File: /mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/models/llava/modeling_llava.py:524, code: if attention_mask is not None and attention_mask.shape[1] > input_ids.shape[1]:
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] size = l_attention_mask_.size(); l_attention_mask_ = None
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] getitem_1 = size[1]; size = None
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] size_1 = l_input_ids_.size()
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] getitem_3 = size_1[1]; size_1 = None
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] gt = getitem_1 > getitem_3; getitem_1 = getitem_3 = None
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG]
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] # File: /mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/models/llava/modeling_llava.py:528, code: elif past_length < input_ids.shape[1]:
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] size_2 = l_input_ids_.size(); l_input_ids_ = None
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] getitem_5 = size_2[1]; size_2 = None
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] gt_1 = getitem_5 > 603; getitem_5 = None
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG] return ()
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG]
[2024-01-25 16:10:26,531] [24/1] torch._dynamo.output_graph.__graph_code: [DEBUG]
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] TRACED GRAPH
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] __compiled_fn_35 <eval_with_key>.93 opcode name target args kwargs
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] ------------- ----------------- --------------------------- ---------------------- --------
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] placeholder s0 s0 () {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] placeholder l_input_ids_ L_input_ids_ () {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] placeholder s1 s1 () {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] placeholder l_attention_mask_ L_attention_mask_ () {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] call_method size size (l_attention_mask_,) {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] call_function getitem_1 <built-in function getitem> (size, 1) {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] call_method size_1 size (l_input_ids_,) {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] call_function getitem_3 <built-in function getitem> (size_1, 1) {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] call_function gt <built-in function gt> (getitem_1, getitem_3) {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] call_method size_2 size (l_input_ids_,) {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] call_function getitem_5 <built-in function getitem> (size_2, 1) {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] call_function gt_1 <built-in function gt> (getitem_5, 603) {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG] output output output ((),) {}
[2024-01-25 16:10:26,532] [24/1] torch._dynamo.output_graph.__graph: [DEBUG]
[2024-01-25 16:10:26,534] [24/1] torch._dynamo.output_graph.__graph_sizes: [DEBUG] TRACED GRAPH TENSOR SIZES
[2024-01-25 16:10:26,534] [24/1] torch._dynamo.output_graph.__graph_sizes: [DEBUG] ===== __compiled_fn_35 =====
[2024-01-25 16:10:26,534] [24/1] torch._dynamo.output_graph.__graph_sizes: [DEBUG] l_input_ids_: (1, s0)
[2024-01-25 16:10:26,534] [24/1] torch._dynamo.output_graph.__graph_sizes: [DEBUG] l_input_ids_ (concrete): (1, 29)
[2024-01-25 16:10:26,534] [24/1] torch._dynamo.output_graph.__graph_sizes: [DEBUG] l_attention_mask_: (1, s1)
[2024-01-25 16:10:26,534] [24/1] torch._dynamo.output_graph.__graph_sizes: [DEBUG] l_attention_mask_ (concrete): (1, 29)
[2024-01-25 16:10:26,534] [24/1] torch._dynamo.output_graph.__graph_sizes: [DEBUG]
[2024-01-25 16:10:26,534] [24/1] torch._dynamo.output_graph: [INFO] Step 2: calling compiler function openvino
Compiling...
Generating...
Traceback (most recent call last):
File "/var/mnt/data/podman/CogVLM/test2.py", line 36, in <module>
output = model.generate(**inputs, max_new_tokens=200)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 328, in _fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/external_utils.py", line 17, in inner
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/generation/utils.py", line 1173, in generate
@torch.no_grad()
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/generation/utils.py", line 1279, in <resume in generate>
self._validate_model_class()
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/generation/utils.py", line 1290, in <resume in generate>
and self.generation_config._original_object_hash == hash(self.generation_config)
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/generation/utils.py", line 1291, in <resume in generate>
and self.config._has_non_default_generation_parameters()
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/generation/utils.py", line 1304, in <resume in generate>
generation_config = copy.deepcopy(generation_config)
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/generation/utils.py", line 1307, in <resume in generate>
self._validate_model_kwargs(model_kwargs.copy())
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/generation/utils.py", line 1307, in <resume in generate>
self._validate_model_kwargs(model_kwargs.copy())
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/generation/utils.py", line 1310, in <resume in generate>
logits_processor = logits_processor if logits_processor is not None else LogitsProcessorList()
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/generation/utils.py", line 1479, in <resume in generate>
return self.greedy_search(
^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/generation/utils.py", line 2337, in greedy_search
model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 490, in catch_errors
return callback(frame, cache_entry, hooks, frame_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 641, in _convert_frame
result = inner_convert(frame, cache_size, hooks, frame_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 133, in _fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 389, in _convert_frame_assert
return _compile(
^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 569, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 491, in compile_inner
out_code = transform_code_object(code, transform)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
transformations(instructions, code_options)
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/convert_frame.py", line 458, in transform
tracer.run()
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 2069, in run
super().run()
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 719, in run
and self.step()
^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/symbolic_convert.py", line 697, in step
self.output.compile_subgraph(
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 857, in compile_subgraph
self.compile_and_call_fx_graph(tx, pass2.graph_output_vars(), root)
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/contextlib.py", line 81, in inner
return func(*args, **kwds)
^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 957, in compile_and_call_fx_graph
compiled_fn = self.call_user_compiler(gm)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1024, in call_user_compiler
raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/output_graph.py", line 1009, in call_user_compiler
compiled_fn = compiler_fn(gm, self.example_inputs())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/repro/after_dynamo.py", line 117, in debug_wrapper
compiled_gm = compiler_fn(gm, example_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/__init__.py", line 1607, in __call__
return self.compiler_fn(model_, inputs_, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/backends/common.py", line 95, in wrapper
return fn(model, inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/openvino/frontend/pytorch/torchdynamo/backend.py", line 49, in openvino
return fx_openvino(subgraph, example_inputs, options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/openvino/frontend/pytorch/torchdynamo/backend.py", line 156, in fx_openvino
return compile_fx(subgraph, example_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_inductor/compile_fx.py", line 1150, in compile_fx
return aot_autograd(
^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/backends/common.py", line 55, in compiler_fn
cg = aot_module_simplified(gm, example_inputs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 3891, in aot_module_simplified
compiled_fn = create_aot_dispatcher_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_dynamo/utils.py", line 189, in time_wrapper
r = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 3379, in create_aot_dispatcher_function
fw_metadata = run_functionalized_fw_and_collect_metadata(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 757, in inner
flat_f_outs = f(*flat_f_args)
^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/_functorch/aot_autograd.py", line 3496, in functional_call
out = Interpreter(mod).run(*args[params_len:], **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/fx/interpreter.py", line 138, in run
self.env[node] = self.run_node(node)
^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/fx/interpreter.py", line 195, in run_node
return getattr(self, n.op)(n.target, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/torch/fx/interpreter.py", line 289, in call_method
return getattr(self_obj, target)(*args_tail, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
torch._dynamo.exc.BackendCompilerFailed: backend='openvino' raised:
AttributeError: 'SymInt' object has no attribute 'size'
While executing %size : [num_users=1] = call_method[target=size](args = (%l_attention_mask_,), kwargs = {})
Original traceback:
File "/mnt/data/podman/.conda/envs/cogvlm/lib/python3.11/site-packages/transformers/models/llava/modeling_llava.py", line 524, in prepare_inputs_for_generation
if attention_mask is not None and attention_mask.shape[1] > input_ids.shape[1]:
You can suppress this exception and fall back to eager by setting:
import torch._dynamo
torch._dynamo.config.suppress_errors = True
[2024-01-25 16:10:27,541] torch._dynamo.utils: [INFO] TorchDynamo compilation metrics:
[2024-01-25 16:10:27,541] torch._dynamo.utils: [INFO] Function Runtimes (s)
[2024-01-25 16:10:27,541] torch._dynamo.utils: [INFO] ------------------------------- --------------
[2024-01-25 16:10:27,541] torch._dynamo.utils: [INFO] _compile.<locals>.compile_inner 396.586
[2024-01-25 16:10:27,541] torch._dynamo.utils: [INFO] OutputGraph.call_user_compiler 303.429
[2024-01-25 16:10:27,541] torch._dynamo.utils: [INFO] create_aot_dispatcher_function
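The traceback above already names an interim fallback. As a sketch (this is the workaround the error message itself suggests, not a fix for the backend failure):

```python
# Interim workaround quoted from the traceback: suppress backend compile
# errors so TorchDynamo falls back to eager (uncompiled) execution for the
# failing subgraph. Set this before the first compiled call.
import torch._dynamo

torch._dynamo.config.suppress_errors = True
```

This trades the crash for slower, uncompiled execution of the partition that fails to compile.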
software versions:
python 3.11.7 (conda)
openvino 2023.3.0
transformers 4.37.0
optimum 1.16.2
optimum_intel 1.12.4
torch 2.1.2
oneapi basekit 2024.0
hardware:
Intel Core i5-6500
Intel Arc A770 16GB
64 GB RAM + 1 TB swap on SSD
Step-by-step reproduction
No response
Relevant log output
No response
Issue submission checklist
- [X] I'm reporting an issue. It's not a question.
- [X] I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
- [X] There is reproducer code and related data files such as images, videos, models, etc.
Looks like an issue with torch.compile itself. @cavusmustafa, could you help here?
One of the torch dynamo partitions seems to be failing while handling the symbolic inputs. This needs more debugging before a proper fix can be provided.
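A torch-free sketch of the suspected failure mode (all names below are hypothetical stand-ins, not the actual backend code): the traced graph takes `(s0, input_ids, s1, attention_mask)`, i.e. SymInt shape placeholders interleaved with tensors. If a backend binds runtime arguments positionally without accounting for the symbolic-int slots, a `call_method` node that expects a tensor can receive a plain integer, and calling `.size()` on it raises exactly this kind of AttributeError:

```python
# Minimal stand-in for a tensor: the only thing the graph calls is .size().
class Tensor:
    def __init__(self, shape):
        self.shape = shape

    def size(self):
        return self.shape

# Graph inputs as traced above: SymInt placeholders interleaved with tensors.
graph_inputs = ["s0", "input_ids", "s1", "attention_mask"]
runtime_args = [29, Tensor((1, 29)), 29, Tensor((1, 29))]

# Correct binding: map names to values, so tensor nodes get tensors.
env = dict(zip(graph_inputs, runtime_args))
print(env["attention_mask"].size())  # (1, 29)

# Misaligned binding: an off-by-one lookup hands the SymInt slot (a plain
# int here) to a node that expects a tensor.
bad = runtime_args[2]
try:
    bad.size()
except AttributeError as e:
    print(e)  # 'int' object has no attribute 'size'
```

The real graph's `%size : call_method[target=size](args = (%l_attention_mask_,))` node failing on a SymInt is consistent with the tensor/SymInt slots getting confused somewhere between Dynamo and the backend.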
@cavusmustafa any updates here?
We are planning to enable new LLM features with the next release. As part of the updates, we are working on a fix for this issue as well.
@cavusmustafa are there any updates on this?
I am facing the same issue when trying to compile tinyllama-1.1b-step-50k-105b with openvino backend.
Ref. 132028
@anzr299 sorry for the delay; is it possible to share the full script to reproduce the issue?
Hi @anzr299, could you provide the script to reproduce the issue?
This issue will be closed in a week because of 9 months of no activity.
This issue was closed because it has been stalled for 9 months with no activity.