DeepSpeedExamples
Example models using DeepSpeed
I replaced the model in steps 1 and 2 with a GPT-2 model, [IDEA-CCNL/Wenzhong-GPT2-110M](https://huggingface.co/IDEA-CCNL/Wenzhong-GPT2-110M), and then trained with ZeRO-3 using the following command: ``` python train.py --actor-zero-stage 3 --actor-model...
I'm very interested in the new features that have been announced. Can you please provide us with some information on when we can expect "System support and finetuning for LLaMA"...
This PR demonstrates the snip_momentum structured pruning algorithm implemented [here](https://github.com/microsoft/DeepSpeed/pull/3300). Users can reproduce the result below by running `source ./bash_script/pruning_sparse_snip_momentum.sh` with the PR mentioned at...
### A quick fix for bugs I found while going through the code
1. Wrong score calculation in step 2 reward model training. It might be related to [issue 334](https://github.com/microsoft/DeepSpeedExamples/issues/334).
2. Wrongly...
1. Select data for better convergence
2. Make dropout an option
3. Increase step-1 training epochs
4. Script updates
5. Other things
Hello, I'm trying to use BLOOMZ for reward model training and get this error: ``` Traceback (most recent call last): File "/users5/xydu/ChatGPT/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_scripts/single_node/../../main.py", line 349, in main() File "/users5/xydu/ChatGPT/DeepSpeed-Chat/training/step2_reward_model_finetuning/training_scripts/single_node/../../main.py", line 303, in...
My GPU is a V100, but I get this error when running the training script for step 1.
Hi, the command is `python chat.py --path ${PATH-to-your-actor-model}`. How do I run it if I want to use the opt-1.3b model?