Open-Sora
Open-Sora copied to clipboard
Shape mismatch error occurs in multiprocessing
If I use 2~ gpus on inference, following error occurs.
Traceback (most recent call last):
File "/hub_data1/minhyuk/diffusion/opensora/scripts/inference.py", line 114, in <module>
main()
File "/hub_data1/minhyuk/diffusion/opensora/scripts/inference.py", line 95, in main
samples = scheduler.sample(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 72, in sample
samples = self.p_sample_loop(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 434, in p_sample_loop
for sample in self.p_sample_loop_progressive(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 485, in p_sample_loop_p
rogressive
out = self.p_sample(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 388, in p_sample
out = self.p_mean_variance(
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 94, in p_mean_variance
return super().p_mean_variance(self._wrap_model(model), *args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/gaussian_diffusion.py", line 267, in p_mean_variance
model_output = model(x, t, **model_kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/respace.py", line 127, in __call__
return self.model(x, new_ts, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/schedulers/iddpm/__init__.py", line 89, in forward_with_cfg
model_out = model.forward(combined, timestep, y, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/models/stdit/stdit.py", line 267, in forward
x = auto_grad_checkpoint(block, x, y, t0, y_lens, tpe)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/acceleration/checkpoint.py", line 24, in auto_grad_checkpoint
return module(*args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/models/stdit/stdit.py", line 111, in forward
x = x + self.cross_attn(x, y, mask)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/minhyuk/.conda/envs/opensora/lib/python3.10/site-packages/opensora/models/layers/blocks.py", line 313, in forward
kv = self.kv_linear(cond).view(B, -1, 2, self.num_heads, self.head_dim)
RuntimeError: shape '[4, -1, 2, 16, 72]' is invalid for input of size 105523
I tested on 2/3/4 gpus, and all give the same error.
我有相同的问题
"My solution is to change the 'batch_size' value on line 32 of 'configs/opensora/inference/16x512x512.py' to 1, and that resolves the error." 16x512x512.py