
RuntimeError: The size of tensor a (0) must match the size of tensor b (592) at non-singleton dimension 0

Open CYing18 opened this issue 1 year ago • 6 comments

```
07/13 01:19:43 - mmengine - INFO - before_train in EvaluateChatHook.
You are using an old version of the checkpointing format that is deprecated (We will also silently ignore gradient_checkpointing_kwargs in case you passed it). Please update to the new format on your modeling file. To use the new format, you need to completely remove the definition of the method _set_gradient_checkpointing in your model.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
[rank0]: Traceback (most recent call last):
[rank0]:   File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/tools/train.py", line 360, in <module>
[rank0]:     main()
[rank0]:   File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/tools/train.py", line 356, in main
[rank0]:     runner.train()
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1200, in train
[rank0]:     model = self.train_loop.run()  # type: ignore
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/loops.py", line 271, in run
[rank0]:     self.runner.call_hook('before_train')
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1271, in call_hook
[rank0]:     getattr(hook, fn_name)(self, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 234, in before_train
[rank0]:     self._generate_samples(runner, max_new_tokens=50)
[rank0]:   File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 220, in _generate_samples
[rank0]:     self._eval_images(runner, model, device, max_new_tokens,
[rank0]:   File "/mnt/petrelfs/chenying1/project/xtuner/xtuner/engine/hooks/evaluate_chat_hook.py", line 152, in _eval_images
[rank0]:     generation_output = model.generate(
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/generation/utils.py", line 1914, in generate
[rank0]:     result = self._sample(
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/generation/utils.py", line 2651, in _sample
[rank0]:     outputs = self(
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
[rank0]:     output = module._old_forward(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 1204, in forward
[rank0]:     outputs = self.model(
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/accelerate/hooks.py", line 169, in new_forward
[rank0]:     output = module._old_forward(*args, **kwargs)
[rank0]:   File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 976, in forward
[rank0]:     causal_mask = self._update_causal_mask(
[rank0]:   File "/mnt/petrelfs/chenying1/.cache/huggingface/modules/transformers_modules/internlm/internlm2-chat-7b/70e6cdc9643ce7e3d9a369fb984dc5f1a1b2cec6/modeling_internlm2.py", line 1097, in _update_causal_mask
[rank0]:     causal_mask *= torch.arange(target_length, device=device) > cache_position.reshape(-1, 1)
[rank0]: RuntimeError: The size of tensor a (0) must match the size of tensor b (592) at non-singleton dimension 0
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3
[WARNING] using untested triton version (2.3.1), only 1.0.0 is known to be compatible
```
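For context on what the error means: the failing line in `_update_causal_mask` multiplies the causal mask in place by a `(target_length, 1)` position comparison, and the "size of tensor a (0)" part says the mask arrived with zero rows. A minimal standalone sketch of that shape mismatch (the `target_length` of 592 is taken from the log; the zero-row mask is an assumed way to trigger the same broadcast failure, not xtuner's actual code path):

```python
import torch

# Reproduce the broadcast failure from _update_causal_mask in isolation.
# If the causal mask is built with 0 rows (e.g. a degenerate attention
# mask), the in-place multiply cannot broadcast a (0, 592) mask against
# the (592, 592) boolean comparison tensor.
target_length = 592
causal_mask = torch.zeros(0, target_length)   # degenerate mask: 0 rows
cache_position = torch.arange(target_length)

try:
    causal_mask *= torch.arange(target_length) > cache_position.reshape(-1, 1)
except RuntimeError as e:
    print(e)  # "The size of tensor a (0) must match the size of tensor b (592) ..."
```

PyTorch broadcasting aligns shapes from the trailing dimension, so the mismatch is reported at dimension 0, exactly as in the traceback above.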

CYing18 avatar Jul 12 '24 17:07 CYing18

I ran into this problem too. Did you manage to solve it?

XiaoMan1117 avatar Jul 26 '24 07:07 XiaoMan1117

@CYing18

XiaoMan1117 avatar Jul 27 '24 09:07 XiaoMan1117

@CYing18

Please update your `transformers` package version :)
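For reference, pinning to a specific release would look like the following. The 4.40.0 pin is only an example taken from a later comment in this thread; reports here are mixed across versions, so the right pin depends on your xtuner checkout:

```shell
# Pin transformers to a specific release; 4.40.0 is one version reported
# to work in this thread, not a guaranteed fix.
pip install "transformers==4.40.0"

# Confirm which version your environment actually loads.
python -c "import transformers; print(transformers.__version__)"
```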

TousenKaname avatar Sep 14 '24 16:09 TousenKaname

I have the same problem. Did anyone fix it? Please let me know.

WHUzhouyh avatar Sep 28 '24 15:09 WHUzhouyh

@TousenKaname I upgraded transformers to 4.47.1 and still get the same problem. Does anyone know how to fix it? Please help me, thanks!

kaiwuhuang avatar Dec 30 '24 08:12 kaiwuhuang

> @TousenKaname I upgraded transformers to 4.47.1 and still get the same problem. Does anyone know how to fix it? Please help me, thanks!

My transformers version is 4.40.0. Can you run successfully with that?

TousenKaname avatar Dec 30 '24 13:12 TousenKaname