Tensor with negative dimensions / overflow error using Accelerate
When I try to run the internlm/internlm-xcomposer2-vl-1_8b model on 2 GPUs, I get an error from Accelerate at the usual line:
model = accelerator.prepare(model)
This is the error:
[rank1]: model = accelerator.prepare(model)
[rank1]: File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\accelerator.py", line 1274, in prepare
[rank1]: result = tuple(
[rank1]: File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\accelerator.py", line 1275, in <genexpr>
[rank1]: self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
[rank1]: File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\accelerator.py", line 1151, in _prepare_one
[rank1]: return self.prepare_model(obj, device_placement=device_placement)
[rank1]: File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\accelerator.py", line 1403, in prepare_model
[rank1]: model = torch.nn.parallel.DistributedDataParallel(
[rank1]: File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\parallel\distributed.py", line 812, in __init__
[rank1]: self._ddp_init_helper(
[rank1]: File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\parallel\distributed.py", line 1152, in _ddp_init_helper
[rank1]: self.reducer = dist.Reducer(
[rank1]: RuntimeError: Trying to create tensor with negative dimension -2146648064: [-2146648064]
I've seen code examples where a single model is sharded across 2 separate GPUs, but what I want is to run two simultaneous processes, each with its own copy of the internlm/internlm-xcomposer2-vl-1_8b model. My setup has 2 RTX 4070 cards, and I want to run one process per GPU, with both processes sharing a dataloader.
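For context, this is roughly the script I'm running, simplified down (PromptDataset is a stand-in for my real image/text dataset), launched with `accelerate launch --num_processes 2 script.py`:

```python
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoModel, AutoTokenizer
from accelerate import Accelerator


class PromptDataset(Dataset):
    """Stand-in dataset; my real dataloader yields image/text pairs."""

    def __init__(self):
        self.prompts = ["describe the image"] * 8

    def __len__(self):
        return len(self.prompts)

    def __getitem__(self, idx):
        return self.prompts[idx]


accelerator = Accelerator()

tokenizer = AutoTokenizer.from_pretrained(
    "internlm/internlm-xcomposer2-vl-1_8b", trust_remote_code=True
)
model = AutoModel.from_pretrained(
    "internlm/internlm-xcomposer2-vl-1_8b",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

dataloader = DataLoader(PromptDataset(), batch_size=1)

# Each of the 2 processes should get its own copy of the model, and
# prepare() should shard the dataloader across the processes.
# This is the line that raises the RuntimeError above.
model = accelerator.prepare(model)
dataloader = accelerator.prepare(dataloader)
```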