victorzhz111 issues

Results: 2 issues of victorzhz111

evaluation loss is NaN

When I was fine-tuning the alpaca-alora model, I applied the alora modules to the attention layers {q_proj, v_proj} and received an evaluation loss of NaN. However, if I applied the alora modules...

adapter_name_or_path: continue training the SFT adapter

### Reminder
- [X] I have read the README and searched the existing issues.

### Reproduction
deepspeed --num_gpus 8 --master_port=9901 src/train_bash.py \
    --deepspeed /mnt/workspace/hanzhong/LLaMA-Factory/train_sh/ds_config_zero3.json \
    --stage dpo \
    --do_train \...