Gasol Sun
Change all absolute paths to paths relative to the current project, and change `SAVED_DIR="${project_dir}/${DATADIR}/onmt_processed_data/with_copy_${data_type}_${processed_type}"` to `SAVED_DIR="${project_dir}/${DATADIR}/onmt_processed_dataset/with_copy_${data_type}_${processed_type}"`, because the sh files that run next use the path `onmt_processed_dataset`.
### Reminder - [X] I have read the README and searched the existing issues. ### Reproduction accelerate launch --num_processes 4 --main_process_port 6678 src/train_bash.py \ --stage sft \ --model_name_or_path meta-llama/Llama-2-7b-hf \...
Can Baichuan be fine-tuned with FSDP? If so, what should `fsdp_transformer_layer_cls_to_wrap` be set to?
Hi, in step 3 I ran the following command and got an OOM error while initializing the Ref Model (the Actor Model initialized without problems): > Actor_Lr=9.65e-6 Critic_Lr=5e-6 deepspeed --master_port 12346 main.py \ --data_path Dahoas/rm-static \...
Excellent work! In Appendix D on page 19 of your paper, you show automatically constructed demonstrations for GSM8K. However, I find that these 8 cases are from test.jsonl rather than train.jsonl....
When running eval, I hit a division-by-zero error. It seems the author did not handle the case where every prediction for a certain category is wrong.
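One possible guard is sketched below. It assumes the eval computes per-category precision, recall, and F1 from counts; the report does not say which metric actually fails, and `safe_div` and `f1_per_category` are hypothetical names, not the project's code. When a category is completely wrong, precision and recall are both 0, so the F1 denominator is 0 and an unguarded division raises `ZeroDivisionError`.

```python
def safe_div(num, den, default=0.0):
    """Return num / den, or `default` when the denominator is zero."""
    return num / den if den else default


def f1_per_category(tp, fp, fn):
    """F1 from true-positive, false-positive, and false-negative counts.

    Every division is guarded, so a category with no correct
    predictions (tp == 0) scores 0.0 instead of raising.
    """
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return safe_div(2 * precision * recall, precision + recall)


# Illustrative counts only, not taken from any real run.
scores = {
    "all_wrong": f1_per_category(tp=0, fp=3, fn=5),
    "all_right": f1_per_category(tp=4, fp=0, fn=0),
}
print(scores)
```

Returning 0.0 for an undefined score is a convention, not the only option; skipping the category or reporting it as undefined may suit some evaluations better.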