DeepSpeed
If you want to add arguments to the training, such as the ones you list above (e.g., `--gradient_checkpointing`), you'll need to add them after `main.py` in the script, for example:
https://github.com/microsoft/DeepSpeedExamples/blob/2aa7a31b8fdcb34b8ccdc554021a1f5789752ab3/applications/DeepSpeed-Chat/training/step1_supervised_finetuning/training_scripts/single_gpu/run_1.3b.sh#L18-L20
deepspeed --num_gpus 1 main.py --gradient_checkpointing --model_name_or_path ...
The error you are showing comes from our launcher not recognizing these arguments, since they are intended to be consumed by `main.py`. Hope this helps.
Originally posted by @jeffra in https://github.com/microsoft/DeepSpeed/issues/3222#issuecomment-1513589597
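To illustrate why argument position matters in the quoted comment, here is a minimal sketch of a launcher-style parser. It is a toy written with Python's standard `argparse`, not DeepSpeed's actual launcher code; the names `toy_launcher`, `build_launcher_parser`, and `split_command` are hypothetical. The idea it shows is that the launcher only knows its own options, and everything after the script name is handed to the script untouched.

```python
import argparse


def build_launcher_parser():
    # Hypothetical stand-in for a launcher: it knows only its own options.
    parser = argparse.ArgumentParser(prog="toy_launcher")
    parser.add_argument("--num_gpus", type=int, default=-1)
    # The training script itself, e.g. main.py.
    parser.add_argument("user_script", type=str)
    # Everything after the script name is collected verbatim for the script.
    parser.add_argument("user_args", nargs=argparse.REMAINDER)
    return parser


def split_command(argv):
    """Return (num_gpus, script, script_args) for a launcher-style command line."""
    args = build_launcher_parser().parse_args(argv)
    return args.num_gpus, args.user_script, args.user_args


if __name__ == "__main__":
    # Flag placed after main.py: the toy launcher passes it through to the script.
    print(split_command(["--num_gpus", "1", "main.py", "--gradient_checkpointing"]))
```

In this sketch, a training flag placed before `main.py` is treated as a launcher option, and since the toy launcher does not define it, `argparse` rejects it as unrecognized. Placing the flag after `main.py` avoids that because it lands in `user_args`.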
OK, thank you very much for your answer. I still have a few questions, though. My understanding is that I need to add an argument such as `--gradient_checkpointing` after `deepspeed --num_gpus 1 main.py` in the `run_1.3b.sh` file, so that the command takes this form: `deepspeed --num_gpus 1 main.py --gradient_checkpointing --model_name_or_path ...`. If so, what parameters should follow `--gradient_checkpointing`? And what else do I need to add to make this project run? I translated this from Chinese to English with translation software, so I hope there is no misunderstanding.
Hi @TinyQi - did you mean to make this a new issue? This looks like it is still being covered in the original linked issue.
@khai0617 - what parameters you include would depend on what you are trying to do.
Closing this issue in favor of tracking this in the main issue. If you need this one re-opened with a different question, please re-open it.