
RuntimeError: The size of tensor a (0) must match the size of tensor b (592) at non-singleton dimension 0

Open CYing18 opened this issue 1 year ago • 6 comments

```
07/13 01:19:43 - mmengine - INFO - before_train in EvaluateChatHook.
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore gradient_checkpointing_kwargs in case you passed it). Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method _set_gradient_checkpointing in your model.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
[rank0]: Traceback (most recent call last):
[rank0]:   File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/tools/train.py", line 360, in <module>
[rank0]:     main()
[rank0]:   File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/tools/train.py", line 356, in main
[rank0]:     runner.train()
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1200, in train
[rank0]:     model = self.train_loop.run()  # type: ignore
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/loops.py", line 271, in run
[rank0]:     self.runner.call_hook('before_train')
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1271, in call_hook
[rank0]:     getattr(hook, fn_name)(self, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 234, in before_train
[rank0]:     self._generate_samples(runner, max_new_tokens=50)
[rank0]:   File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 220, in _generate_samples
[rank0]:     self._eval_images(runner, model, device, max_new_tokens,
[rank0]:   File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 152, in _eval_images
[rank0]:     generation_output = model.generate(
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/generation/utils.py", line 1914, in generate
[rank0]:     result = self._sample(
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/generation/utils.py", line 2651, in _sample
[rank0]:     outputs = self(
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
[rank0]:     output = module._old_forward(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 1204, in forward
[rank0]:     outputs = self.model(
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
[rank0]:     output = module._old_forward(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 976, in forward
[rank0]:     causal_mask = self._update_causal_mask(
[rank0]:   File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 1097, in _update_causal_mask
[rank0]:     causal_mask *= torch.arange(target_length, device=device) > cache_position.reshape(-1, 1)
[rank0]: RuntimeError: The size of tensor a (0) must match the size of tensor b (592) at non-singleton dimension 0
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3
[WARNING] using untested triton version (2.3.1), only 1.0.0 is known to be compatible
```
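For context on what the error means: the failing line in `_update_causal_mask` multiplies the causal mask in place by a `(target_length, 1)` position comparison, and the "size of tensor a (0)" part says the mask arrived with zero rows. A minimal standalone sketch of that shape mismatch (the `target_length` of 592 is taken from the log; the zero-row mask is an assumed way to trigger the same broadcast failure, not xtuner's actual code path):

```python
import torch

# Reproduce the broadcast failure from _update_causal_mask in isolation.
# If the causal mask is built with 0 rows (e.g. a degenerate attention
# mask), the in-place multiply cannot broadcast a (0, 592) mask against
# the (592, 592) boolean comparison tensor.
target_length = 592
causal_mask = torch.zeros(0, target_length)   # degenerate mask: 0 rows
cache_position = torch.arange(target_length)

try:
    causal_mask *= torch.arange(target_length) > cache_position.reshape(-1, 1)
except RuntimeError as e:
    print(e)  # "The size of tensor a (0) must match the size of tensor b (592) ..."
```

PyTorch broadcasting aligns shapes from the trailing dimension, so the mismatch is reported at dimension 0, exactly as in the traceback above.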

CYing18 avatar Jul 12 '24 17:07 CYing18

I ran into this problem too. Did you manage to solve it?

XiaoMan1117 avatar Jul 26 '24 07:07 XiaoMan1117

@CYing18

XiaoMan1117 avatar Jul 27 '24 09:07 XiaoMan1117

@CYing18

Please update your `transformers` package version :)
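For reference, pinning to a specific release would look like the following. The 4.40.0 pin is only an example taken from a later comment in this thread; reports here are mixed across versions, so the right pin depends on your xtuner checkout:

```shell
# Pin transformers to a specific release; 4.40.0 is one version reported
# to work in this thread, not a guaranteed fix.
pip install "transformers==4.40.0"

# Confirm which version your environment actually loads.
python -c "import transformers; print(transformers.__version__)"
```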

TousenKaname avatar Sep 14 '24 16:09 TousenKaname

I have the same problem. Did anyone fix it? Please let me know.

WHUzhouyh avatar Sep 28 '24 15:09 WHUzhouyh

@TousenKaname I upgraded transformers to 4.47.1 and still get the same problem. Does anyone know how to fix it? Please help me, thanks!

kaiwuhuang avatar Dec 30 '24 08:12 kaiwuhuang

> @TousenKaname I upgraded transformers to 4.47.1 and still get the same problem. Does anyone know how to fix it? Please help me, thanks!

My transformers version is 4.40.0. Can you run successfully with that?

TousenKaname avatar Dec 30 '24 13:12 TousenKaname