Describe the bug
What the bug is, and how to reproduce, better with screenshots(描述bug以及复现过程,最好有截图)
/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/deepspeed/runtime/zero/linear.py:47: FutureWarning: torch.cuda.amp.custom_fwd(args...)
is deprecated. Please use torch.amp.custom_fwd(args..., device_type='cuda')
instead.
@autocast_custom_fwd
/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/deepspeed/runtime/zero/linear.py:66: FutureWarning: torch.cuda.amp.custom_bwd(args...)
is deprecated. Please use torch.amp.custom_bwd(args..., device_type='cuda')
instead.
@autocast_custom_bwd
Train: 0%| | 0/158 [00:00<?, ?it/s]/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/torch/utils/checkpoint.py:1399: FutureWarning: torch.cpu.amp.autocast(args...)
is deprecated. Please use torch.amp.autocast('cpu', args...)
instead.
with device_autocast_ctx, torch.cpu.amp.autocast(**cpu_autocast_kwargs), recompute_context: # type: ignore[attr-defined]
Traceback (most recent call last):
File "/home/fangmc/code/project/swift/swift/cli/sft.py", line 5, in
sft_main()
File "/home/fangmc/code/project/swift/swift/utils/run_utils.py", line 32, in x_main
result = llm_x(args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/code/project/swift/swift/llm/sft.py", line 405, in llm_sft
trainer.train(training_args.resume_from_checkpoint)
File "/home/fangmc/code/project/swift/swift/trainers/mixin.py", line 538, in train
res = super().train(resume_from_checkpoint, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/transformers/trainer.py", line 1948, in train
return inner_training_loop(
^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/transformers/trainer.py", line 2289, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/transformers/trainer.py", line 3328, in training_step
loss = self.compute_loss(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/code/project/swift/swift/trainers/trainers.py", line 179, in compute_loss
outputs = model(**inputs)
^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1603, in _call_impl
result = forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/accelerate/utils/operations.py", line 819, in forward
return model_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/accelerate/utils/operations.py", line 807, in call
return convert_to_fp32(self.model_forward(*args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/torch/amp/autocast_mode.py", line 43, in decorate_autocast
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/peft/peft_model.py", line 1430, in forward
return self.base_model(
^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/peft/tuners/tuners_utils.py", line 179, in forward
return self.model.forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/anaconda3/envs/swift/lib/python3.11/site-packages/accelerate/hooks.py", line 169, in new_forward
output = module._old_forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/fangmc/.cache/huggingface/modules/transformers_modules/glm-4v-9b/modeling_chatglm.py", line 1198, in forward
boi_token_pos, eoi_token_pos = input_id.index(self.config.boi_token_id), input_id.index(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: 151339 is not in list
Train: 0%|
Your hardware and system info
Write your system info like CUDA version/system/GPU/torch version here(在这里给出硬件信息和系统信息,如CUDA版本,系统,GPU型号和torch版本等)
ms-swift版本:2.4.0
GPU: A100
torch版本: 2.4.0
Additional context
Add any other context about the problem here(在这里补充其他信息)