Runtime error when executing the default example
Describe the bug
After `pip install screenai`, the default example raises a runtime error at the `from screenai.main import ScreenAI` line:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (512x4 and 512x512)
To Reproduce
Steps to reproduce the behavior:
- run `pip install screenai`
- run the default example
Expected behavior
The example runs without error.
Screenshots
`---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_20976\3292023021.py in <cell line: 2>()
      1 import torch
----> 2 from screenai.main import ScreenAI
      3
      4 # Create a tensor for the image
      5 image = torch.rand(1, 3, 224, 224)
~\AppData\Local\Programs\Python\Python39\lib\site-packages\screenai\__init__.py in
~\AppData\Local\Programs\Python\Python39\lib\site-packages\screenai\main.py in
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\__init__.py in
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\__init__.py in
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\modules\__init__.py in
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\modules\mlp_mixer.py in
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _wrapped_call_impl(self, *args, **kwargs)
   1516             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1517         else:
-> 1518             return self._call_impl(*args, **kwargs)
   1519
   1520     def _call_impl(self, *args, **kwargs):
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *args, **kwargs)
   1525                 or _global_backward_pre_hooks or _global_backward_hooks
   1526                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527             return forward_call(*args, **kwargs)
   1528
   1529         try:
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\modules\mlp_mixer.py in forward(self, x)
    123         x = rearrange(x, "n c h w -> n (h w) c")
    124         for mixer_block in self.mixer_blocks:
--> 125             x = mixer_block(x)
    126         x = self.pred_head_layernorm(x)
    127         x = x.mean(dim=1)
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _wrapped_call_impl(self, *args, **kwargs)
   1516             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1517         else:
-> 1518             return self._call_impl(*args, **kwargs)
   1519
   1520     def _call_impl(self, *args, **kwargs):
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *args, **kwargs)
   1525                 or _global_backward_pre_hooks or _global_backward_hooks
   1526                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527             return forward_call(*args, **kwargs)
   1528
   1529         try:
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\modules\mlp_mixer.py in forward(self, x)
     61         y = self.norm1(x)
     62         y = rearrange(y, "n c t -> n t c")
---> 63         y = self.tokens_mlp(y)
     64         y = rearrange(y, "n t c -> n c t")
     65         x = x + y
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _wrapped_call_impl(self, *args, **kwargs)
   1516             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1517         else:
-> 1518             return self._call_impl(*args, **kwargs)
   1519
   1520     def _call_impl(self, *args, **kwargs):
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *args, **kwargs)
   1525                 or _global_backward_pre_hooks or _global_backward_hooks
   1526                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527             return forward_call(*args, **kwargs)
   1528
   1529         try:
~\AppData\Local\Programs\Python\Python39\lib\site-packages\zeta\nn\modules\mlp_mixer.py in forward(self, x)
     28             torch.Tensor: description
     29         """
---> 30         y = self.dense1(x)
     31         y = F.gelu(y)
     32         return self.dense2(y)
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _wrapped_call_impl(self, *args, **kwargs)
   1516             return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1517         else:
-> 1518             return self._call_impl(*args, **kwargs)
   1519
   1520     def _call_impl(self, *args, **kwargs):
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *args, **kwargs)
   1525                 or _global_backward_pre_hooks or _global_backward_hooks
   1526                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1527             return forward_call(*args, **kwargs)
   1528
   1529         try:
~\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\linear.py in forward(self, input)
    112
    113     def forward(self, input: Tensor) -> Tensor:
--> 114         return F.linear(input, self.weight, self.bias)
    115
    116     def extra_repr(self) -> str:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (512x4 and 512x512)`
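For readers unfamiliar with this message: a matrix product is only defined when the number of columns of the first operand equals the number of rows of the second. Here the input reaching the linear layer has a last dimension of 4, while the layer expects 512 input features. The following is a minimal pure-Python sketch of that rule, not PyTorch's actual implementation; `matmul_result_shape` is a hypothetical helper written only for illustration.

```python
def matmul_result_shape(mat1_shape, mat2_shape):
    """Return the shape of mat1 @ mat2 for 2-D shapes, or raise if incompatible."""
    rows1, cols1 = mat1_shape
    rows2, cols2 = mat2_shape
    # The inner dimensions (columns of mat1, rows of mat2) must agree.
    if cols1 != rows2:
        raise ValueError(
            "mat1 and mat2 shapes cannot be multiplied "
            f"({rows1}x{cols1} and {rows2}x{cols2})"
        )
    return (rows1, cols2)


# Shapes taken from the error message in the traceback: 4 != 512, so this fails.
try:
    matmul_result_shape((512, 4), (512, 512))
except ValueError as err:
    print(err)

# With a last dimension of 512 on the input, the product is defined.
print(matmul_result_shape((4, 512), (512, 512)))
```

This only explains what the message means. It does not identify which side of the mismatch is wrong in the installed packages.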
Got the same issue on a Mac M1.
Same here on an Intel Mac.
Same on a Mac M2.
Same on Windows 11 and WSL.
Same here on Windows 10.
Edit: I solved this issue by running `pip uninstall zetascale` and reinstalling with `pip install zetascale`. In my case pip had installed an ancient version, 0.9.xyz, and after I installed the newest version, 2.2.7, it worked.
@kyegomez maybe it would be good to update the README example with the actual example from example.py. After solving this issue I ran into more issues because:
- there was no `num_tokens` defined
- there was no `max_seq_len` defined
- image and text were not initialized with the right dimensions
Another question I've got: how did you choose `num_tokens` and `max_seq_len`?
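To see which zetascale version pip actually installed before reinstalling, something like the sketch below may help. It uses only the standard library's `importlib.metadata`. The helper names (`installed_version`, `version_tuple`, `is_at_least`) are made up for this example, and the 2.2.7 threshold is simply the version reported as working in the comment above, not an officially documented minimum.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '2.2.7' into (2, 2, 7); stops at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(version, minimum):
    """Compare two dotted version strings by their leading numeric components."""
    return version_tuple(version) >= version_tuple(minimum)


found = installed_version("zetascale")
if found is None:
    print("zetascale is not installed")
elif is_at_least(found, "2.2.7"):
    print(f"zetascale {found} is at least 2.2.7")
else:
    print(f"zetascale {found} is older than 2.2.7; try reinstalling")
```

The comparison is deliberately crude: it ignores pre-release suffixes, which is enough to tell a 0.9.x install from a 2.x one.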
Same with Linux Ubuntu 24 LTS
After some modifications I got working code; I hope it solves your problem.
import torch
from screenai.main import ScreenAI

# Create a tensor for the image
image = torch.rand(1, 3, 224, 224)

# Create an instance of the ScreenAI model
model = ScreenAI(
    num_tokens=2000,
    max_seq_len=1024,
    patch_size=16,
    image_size=224,
    dim=512,
    depth=6,
    heads=8,
    vit_depth=4,
    multi_modal_encoder_depth=4,
    llm_decoder_depth=4,
    mm_encoder_ff_mult=4,
)

# Assume the text has already been converted to token indices; random integers stand in for it here.
# num_tokens is the vocabulary size; max_seq_len is the maximum sequence length the model can handle.
text_indices = torch.randint(0, model.num_tokens, (1, model.max_seq_len))

# Convert the token index tensor to a long (int64) tensor
text = text_indices.long()

# Run the forward pass with the given text and image tensors
out = model(text, image)

# Print the output tensor
print(out)
`num_tokens` is the maximum vocabulary size of the tokenizer, and `max_seq_len` is the maximum sequence length, which is typically 4096, 8192, 16092, 32042, etc.
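To make the two parameters concrete, here is a toy sketch in plain Python. It is not the tokenizer ScreenAI uses; `build_vocab` and `encode` are hypothetical helpers, the whitespace splitting is a stand-in for a real tokenizer, and the tiny corpus is invented for illustration. It shows that `num_tokens` bounds the token ids and `max_seq_len` fixes the length of every encoded sequence.

```python
def build_vocab(corpus):
    """Map each distinct whitespace-separated word to an integer id; id 0 is padding."""
    vocab = {"<pad>": 0}
    for sentence in corpus:
        for word in sentence.split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sentence, vocab, max_seq_len):
    """Convert a sentence to ids, truncating or padding to exactly max_seq_len.

    Words missing from the vocabulary raise KeyError in this toy version.
    """
    ids = [vocab[word] for word in sentence.split()]
    ids = ids[:max_seq_len]
    ids += [vocab["<pad>"]] * (max_seq_len - len(ids))
    return ids


corpus = ["click the login button", "open the settings menu"]
vocab = build_vocab(corpus)

num_tokens = len(vocab)  # vocabulary size: every token id lies in [0, num_tokens)
max_seq_len = 6          # every encoded sequence has exactly this many ids

ids = encode("click the settings button", vocab, max_seq_len)
print(num_tokens, ids)
```

In the working example earlier in the thread, `torch.randint(0, model.num_tokens, (1, model.max_seq_len))` plays the same role: it produces ids below the vocabulary size, in a sequence of the maximum length.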