
Tensor with negative dimensions / overflow error using Accelerate

Open · frutiemax92 opened this issue 7 months ago · 0 comments

When I try to use internlm/internlm-xcomposer2-vl-1_8b on 2 GPUs, I get an error from Accelerate at the usual line: model = accelerator.prepare(model)
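For context, my setup is essentially the following (a minimal sketch; the loading arguments are my assumptions from the model card, and the dataset/training loop is omitted):

```python
import torch
from accelerate import Accelerator
from transformers import AutoModel, AutoTokenizer

accelerator = Accelerator()

tokenizer = AutoTokenizer.from_pretrained(
    "internlm/internlm-xcomposer2-vl-1_8b", trust_remote_code=True
)
model = AutoModel.from_pretrained(
    "internlm/internlm-xcomposer2-vl-1_8b",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

# Launched with `accelerate launch --num_processes 2 script.py`;
# the error below is raised at this call on rank 1.
model = accelerator.prepare(model)
```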

This is the error:

[rank1]:     model = accelerator.prepare(model)
[rank1]:   File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\accelerator.py", line 1274, in prepare
[rank1]:     result = tuple(
[rank1]:   File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\accelerator.py", line 1275, in <genexpr>
[rank1]:     self._prepare_one(obj, first_pass=True, device_placement=d) for obj, d in zip(args, device_placement)
[rank1]:   File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\accelerator.py", line 1151, in _prepare_one
[rank1]:     return self.prepare_model(obj, device_placement=device_placement)
[rank1]:   File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\accelerate\accelerator.py", line 1403, in prepare_model
[rank1]:     model = torch.nn.parallel.DistributedDataParallel(
[rank1]:   File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\parallel\distributed.py", line 812, in __init__
[rank1]:     self._ddp_init_helper(
[rank1]:   File "C:\Users\lucas\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\nn\parallel\distributed.py", line 1152, in _ddp_init_helper
[rank1]:     self.reducer = dist.Reducer(
[rank1]: RuntimeError: Trying to create tensor with negative dimension -2146648064: [-2146648064]

I've seen code examples where a single model is sharded across 2 separate GPUs, but what I want to do is run two simultaneous processes, each with its own copy of internlm/internlm-xcomposer2-vl-1_8b. In my setup I have 2 RTX 4070 cards, and I want to run the model in 2 separate processes that share a dataloader.
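To make the goal concrete, this is roughly what I'm after (a sketch only; build_dataloader is a placeholder for the shared dataloader, and the loading arguments are assumptions): each process keeps its own full copy of the model on its own GPU, and only the dataloader is sharded across the two processes, so no DDP wrapping of the model should be needed.

```python
import torch
from accelerate import Accelerator
from transformers import AutoModel

accelerator = Accelerator()

# One full copy of the model per process, placed on that process's GPU.
model = AutoModel.from_pretrained(
    "internlm/internlm-xcomposer2-vl-1_8b",
    torch_dtype=torch.float16,
    trust_remote_code=True,
).to(accelerator.device)
model.eval()

dataloader = build_dataloader()  # placeholder for the shared dataloader
# Only the dataloader goes through prepare(), so batches are split
# across the 2 processes while the model itself is left untouched.
dataloader = accelerator.prepare(dataloader)

with torch.no_grad():
    for batch in dataloader:
        # Each RTX 4070 processes its own shard of the batches independently.
        outputs = model(**batch)
```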

frutiemax92 · Jul 06 '24 14:07